Saturday, 26 May 2012


During the week I have been looking for ways to implement stream modification and came across a library called GStreamer. It contains functions that allow direct manipulation of a webcam stream even while the stream is being used by a program such as Skype, which means that, unlike OpenCV, it does not take over the webcam stream. However, OpenCV will most likely still be used, because facial recognition still has to run before GStreamer can apply its filters.

Monday, 21 May 2012

Recent Activities and Progression

I realise that this blog post is overdue, but for good reason.

Firstly, my project has been changed from a Waterfall development project to an Agile development project. This change was made because of the way the code is being written: William and I are currently sharing code and splitting the parts between us. While I had previously done more research than coding, I am now doing more coding than research, because my research has shown how my design can be implemented.

In my research I have found that, for my implementation to act as a go-between for the video conferencing program and the Kinect (or webcam), I need to implement a filter system. The filter uses the Windows SDK DirectDraw methods to hook into the webcam stream and modify the stream on the fly. This is exactly how I wanted to do this project, but I have hit quite a few speed bumps along the way. The first was finding out how to modify a webcam stream; since OpenCV is unable to do this, an alternative route had to be found. This is where I found the filter method. While it is very promising, there are very few open source examples of this type of implementation, not to mention that installing the Windows SDK was more than a little hassle, but I will not go into that in this blog.

Currently I am re-evaluating my design with the filter method, finding out how to use the DirectDraw methods in the SDK to change the webcam streams, and doing more coding.

Tuesday, 1 May 2012

Facial Recognition Implementation

Following the previous posts on the two different ways of facial tracking, there are three ways to implement facial tracking in this project:

  1. Using AAM Tracking for Facial Recognition
  2. Using HAAR for Facial Recognition
  3. Using Both AAM and HAAR

The third option arose from my discovery of the paper "Fast AAM Face Recognition with Combined Haar Classifiers and Skin Color Segmentation", written by various authors. It explains that, since AAM tracking is quite sensitive to the initial starting position of the model and image, it is possible to use HAAR classifiers to give the starting positions of the model and image, which are then passed to the AAM algorithm.
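As a rough illustration of that hand-off, the sketch below scales and translates a mean face shape into a bounding box of the kind a HAAR detector returns. The function name, the landmark values and the assumption that the mean shape is normalised to the unit square are all hypothetical and are not taken from the paper.

```python
def place_shape_in_box(mean_shape, box):
    """Scale and translate a normalised shape into a detection box.

    mean_shape: list of (u, v) landmarks in the unit square [0, 1] x [0, 1]
                (a hypothetical normalisation chosen for this sketch).
    box: (x, y, width, height) of the detected face in image pixels.
    """
    bx, by, bw, bh = box
    return [(bx + u * bw, by + v * bh) for u, v in mean_shape]


# Toy mean shape: two eyes, nose tip, two mouth corners (made-up values).
MEAN_SHAPE = [(0.3, 0.35), (0.7, 0.35), (0.5, 0.55), (0.35, 0.75), (0.65, 0.75)]

face_box = (100, 50, 200, 200)  # hypothetical detector output
initial_landmarks = place_shape_in_box(MEAN_SHAPE, face_box)
```

The resulting landmark positions would serve only as the starting estimate; the AAM search would then refine them.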

This choice will be discussed with William Qi before any implementation is done, because we are sharing the code.


Active Appearance Model

Active Appearance Model (AAM) is an algorithm that uses a statistical model of the shape and grey-level appearance of an object. During the training phase, the algorithm learns the relationship between model parameter displacements and the residual errors induced between a training image and a synthesised model. This allows it to give a good overall match in just a few iterations, even with poor starting estimates (to a certain degree).

However, AAM is very sensitive to the initial matching position of the model and the image, and without a good starting place there could be problems with the computational expense of the algorithm and its accuracy.
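To make the training idea concrete, here is a heavily simplified sketch with a single scalar parameter. A real AAM relates a vector of texture residuals to a vector of parameter updates through a learned matrix; the scalar model, the simulated residual function and all names below are assumptions made purely for illustration.

```python
def fit_linear_update(displacements, residuals):
    """Least-squares slope a such that displacement is approximately a * residual."""
    num = sum(d * r for d, r in zip(displacements, residuals))
    den = sum(r * r for r in residuals)
    return num / den


def residual(p, target=3.0):
    """Simulated residual: grows linearly with the parameter error (toy assumption)."""
    return 2.0 * (p - target)


# Training: displace the known-good parameter by d and record the residual.
train_d = [-1.0, -0.5, 0.5, 1.0]
train_r = [residual(3.0 + d) for d in train_d]
a = fit_linear_update(train_d, train_r)

# Search: start from a poor estimate and apply the learned correction.
p = 2.0
for _ in range(3):
    p -= a * residual(p)
```

Because the toy residual is exactly linear, the search lands on the target almost immediately; with real images the relationship only holds near the correct position, which is why the starting estimate matters.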


T.F. Cootes, G.J. Edwards, C.J. Taylor. Active Appearance Models. 1998. Proc European Conference on Computer Vision.

HAAR Object Detection for Facial Detection

This detection method is also called the "Viola-Jones object detection framework", named after Paul Viola and Michael Jones. It uses Haar features (which derive from Haar wavelets) to detect objects. Haar-like features are simple image patterns, such as lines and edges, that are used in object recognition. The Haar classifier uses these patterns to detect objects by measuring the change in contrast values between adjacent rectangular groups of pixels. These changes in contrast determine relative light and dark areas. These features are used because they are easily scaled by increasing or decreasing the size of the pixel group being analysed.

Figure 1. Haar Features

In the Viola-Jones framework, the features used involve the sums of the image pixels within rectangular areas. While there are other classifiers that use Haar, such as the Haar basis functions, Viola-Jones uses more than one rectangular area, making it more complex and therefore able to detect more facial features.
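A minimal sketch of how such a rectangle-sum feature can be computed is shown below. It uses an integral image (summed-area table), the trick from the Viola-Jones paper that lets any rectangle sum be read with four lookups. The function names and the toy 4x4 image are illustrative only, and this is not OpenCV's implementation.

```python
def integral_image(img):
    """Return a summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half; responds to a vertical edge (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# 4x4 toy image: bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
edge_response = two_rect_feature(ii, 0, 0, 4, 4)
```

A large positive or negative response indicates strong contrast between the two halves, while a flat region gives a response near zero. Scaling the feature only changes the rectangle sizes passed in, not the cost of the lookups.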

This Viola-Jones framework is the method included with the current OpenCV libraries for facial detection.


Michael Jones, Paul Viola. Robust Real-time Object Detection. 2001. Second International Workshop on Statistical and Computational Theories of Vision - Modeling, Learning, Computing and Sampling.

Dr John Fernandez, Phillip Ian Wilson. Facial Feature Detection Using HAAR Classifiers. 2006. JCSC 21, 4.