Public Summary Month 10/2012

We have been improving the facial tracker to make it more robust to sudden illumination changes than the original version. The improvements focus mainly on the learning strategy for the facial template, which is used to compute the update of the 3D facial model configuration. The previous version of the template was built simply from the mean and standard deviation of the pixel values in the warped facial image over the most recent frames. The learning process now takes into account which frames are good enough to serve as a reference: only frontal-like views without too many pixel outliers (i.e., pixels that could have been occluded by other objects, such as hands or hair) are considered, and they are learned with a shorter learning time so that the template can adapt to sudden illumination changes.
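The gated template update described above can be illustrated with a minimal sketch. It is not the project's implementation: the class name `FacialTemplate`, the head-angle test for "frontal-like", the 3-sigma outlier rule, and all threshold values (`max_angle_deg`, `max_outlier_ratio`, `alpha`) are assumptions chosen for illustration. A large `alpha` stands in for the shorter learning time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FacialTemplate:
    """Per-pixel running mean and variance of the warped facial image (hypothetical)."""
    mean: List[float]
    var: List[float]

    def outlier_ratio(self, pixels: List[float], k: float = 3.0) -> float:
        # Assumption: a pixel is an outlier if it lies more than k standard
        # deviations from the template mean (e.g. occluded by a hand or hair).
        outliers = sum(
            1
            for p, m, v in zip(pixels, self.mean, self.var)
            if abs(p - m) > k * max(v, 1e-6) ** 0.5
        )
        return outliers / len(pixels)

    def update(
        self,
        pixels: List[float],
        yaw_deg: float,
        pitch_deg: float,
        max_angle_deg: float = 15.0,     # assumed limit for a "frontal-like" view
        max_outlier_ratio: float = 0.2,  # assumed limit for "too many" outliers
        alpha: float = 0.5,              # assumed learning rate (fast adaptation)
    ) -> bool:
        """Learn from the frame only if it is a good reference; return True if used."""
        frontal = abs(yaw_deg) <= max_angle_deg and abs(pitch_deg) <= max_angle_deg
        if not frontal or self.outlier_ratio(pixels) > max_outlier_ratio:
            return False  # frame rejected, template left unchanged
        for i, p in enumerate(pixels):
            d = p - self.mean[i]
            # Exponentially weighted mean and variance update.
            self.mean[i] += alpha * d
            self.var[i] = (1.0 - alpha) * (self.var[i] + alpha * d * d)
        return True
```

In this sketch a moderately brighter frontal frame is accepted and pulls the template towards the new illumination, while a turned head or a partly occluded face leaves the template untouched. How the actual tracker distinguishes a global illumination change from occlusion outliers is not specified in this summary.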


In addition, we have defined a 3D facial model derived from Candide-3 with 14 DOFs (degrees of freedom), of which 12 (all except the XY position with respect to the screen) are related to facial AUs (Action Units) of the FACS (Facial Action Coding System): head forward/back (AU57/58), head up/down (AU53/54), head turn left/right (AU51/52), head tilt left/right (AU55/56), left eye closed (AU42/43/44/45), right eye closed (AU42/43/44/45), brow lowerer (AU4), outer left brow raiser (AU2), outer right brow raiser (AU2), mouth opener (AU10/26/27), lip stretcher (AU20) and lip corner depressor (AU13/15).
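The DOF-to-AU correspondence listed above can be written down as a small lookup table. The sketch below only restates that list; the identifiers (`DOF`, `FACE_MODEL_DOFS`, `dofs_for_au`) and the DOF names are hypothetical and do not come from the project's code.

```python
from typing import List, NamedTuple, Tuple


class DOF(NamedTuple):
    name: str
    action_units: Tuple[int, ...]  # empty for DOFs with no FACS counterpart


# The 14 DOFs of the model; only the screen XY position has no related AU.
FACE_MODEL_DOFS: Tuple[DOF, ...] = (
    DOF("position_x", ()),
    DOF("position_y", ()),
    DOF("head_forward_back", (57, 58)),
    DOF("head_up_down", (53, 54)),
    DOF("head_turn_left_right", (51, 52)),
    DOF("head_tilt_left_right", (55, 56)),
    DOF("left_eye_closed", (42, 43, 44, 45)),
    DOF("right_eye_closed", (42, 43, 44, 45)),
    DOF("brow_lowerer", (4,)),
    DOF("outer_left_brow_raiser", (2,)),
    DOF("outer_right_brow_raiser", (2,)),
    DOF("mouth_opener", (10, 26, 27)),
    DOF("lip_stretcher", (20,)),
    DOF("lip_corner_depressor", (13, 15)),
)


def dofs_for_au(au: int) -> List[str]:
    """Return the names of the model DOFs related to a given FACS Action Unit."""
    return [d.name for d in FACE_MODEL_DOFS if au in d.action_units]
```

For example, `dofs_for_au(2)` returns both outer brow raisers, since the model splits AU2 into a left and a right DOF.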


Regarding the recognition of dynamic gestures, during this period we have finished the C++ implementation of the ACA algorithm presented in the last report, which was originally coded in MATLAB. The C++ version obtains the same segmentation results with the same parameters, so it is ready to be integrated into the Kompaï robotic platform in subsequent stages of the project.


We have also been reviewing coding systems for body expression with a scope similar to that of the well-known FACS for facial expressions, and we have found that the Body Action and Posture Coding System (BAP) proposed by Dael et al. may be appropriate for our purpose. We are considering using Canonical Correlation Analysis (CCA), a statistical tool, to relate the body data tracked by our system to the BAP and FACS codings for fused expression recognition. We will use the annotation tool ANVIL to create the ground truth for the recognition.
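For reference, the standard CCA objective can be stated as follows. The symbols are assumptions introduced here for illustration: x denotes a feature vector built from the tracked body data and y a vector encoding the BAP and FACS annotations. CCA looks for projection vectors a and b that maximise the correlation between the projections of x and y:

```latex
\[
\rho \;=\; \max_{a,\,b}\;
\frac{a^{\top} \Sigma_{xy}\, b}
     {\sqrt{a^{\top} \Sigma_{xx}\, a}\;\sqrt{b^{\top} \Sigma_{yy}\, b}}
\]
```

Here \Sigma_{xx} and \Sigma_{yy} are the covariance matrices of x and y, and \Sigma_{xy} is their cross-covariance matrix. How the features and annotation codes would actually be encoded is not specified in this summary.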