Face Detection through Deep Neural Network.


Face detection is still an active research topic and one of the most popular computer vision problems in industry. As of 2014, about 245 million surveillance cameras had been installed, and in the era of big data the amount of data captured as video is very large. The problem with surveillance cameras is that a person cannot sit and monitor the video feed continuously. Automatic face detection is therefore a prerequisite for improving security. Thanks to the large amount of ongoing research in deep learning and machine learning, reasonably accurate face detection is now possible. Although the face detection problem has been addressed by classical algorithms such as Haar cascades and LBP, a lot of research is still going on to perfect the system. Here we will talk about the classical approaches, and later we will discuss newer CNN and MMOD techniques for solving the face detection problem.

Classical Approach:

Face detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their 2001 paper, "Rapid Object Detection using a Boosted Cascade of Simple Features". It is a machine learning approach in which a cascade function is trained from a large number of positive and negative images (images with and without faces). The trained cascade is then used to detect objects in other images.

Feature extraction is done with the Haar features as shown. These Haar features are similar to convolutional kernels: each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle. The four-rectangle feature computes the difference between diagonal pairs of rectangles (Viola P. & Jones M.; 2001).
Every possible size and location of each kernel is used to calculate features, which produces a huge number of them: even a 24 x 24 window yields more than 160,000 features.
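To make the "160,000+" figure concrete, here is a small counting sketch. It assumes the five commonly used base shapes (two-rectangle horizontal and vertical, three-rectangle horizontal and vertical, and four-rectangle); the function name and the choice of shapes are illustrative assumptions, not something fixed by the text above.

```python
def count_haar_features(window=24,
                        base_shapes=((2, 1), (1, 2), (3, 1), (1, 3), (2, 2))):
    """Count every position and scale of each base shape inside a square window.

    base_shapes holds (width, height) of the smallest version of each feature.
    The five shapes used by default are an assumption for this sketch.
    """
    total = 0
    for base_w, base_h in base_shapes:
        # Scale the base shape by integer factors so its rectangles stay equal-sized.
        for w in range(base_w, window + 1, base_w):
            for h in range(base_h, window + 1, base_h):
                # Number of top-left corners at which a w x h feature still fits.
                total += (window - w + 1) * (window - h + 1)
    return total


print(count_haar_features())  # 162336
```

Under these assumptions a 24 x 24 window gives 162,336 candidate features, which is why a selection step such as AdaBoost is needed later on.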

Algorithm Working: 

The Viola-Jones Haar cascade algorithm can be divided into four stages:
  1. Haar feature detection
  2. Integral images
  3. AdaBoost training
  4. Cascade classifiers

1. Haar Features:
All human faces share some common properties, such as "the eye region is darker than the upper cheeks" and "the nose bridge region is brighter than the eyes". These similarities can be matched using Haar features and used to detect whether or not a face is present in a particular frame or image.

The image on the left shows a region that resembles the Haar feature "the eye region is darker than the upper cheeks".
The second image shows a region that resembles the Haar feature "the nose bridge region is brighter than the eyes".


The value of a feature over its rectangular regions is calculated by the equation:
Σ (pixels in black area) - Σ (pixels in white area)
In practice this computation is done in a different way. Because every feature is built from rectangular regions, it can be calculated quickly using an integral image.
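As a point of comparison for the faster method, the definition above can be evaluated directly by summing pixels. The sketch below is only an illustration: the function names, the vertical two-rectangle layout (black band on top, white band below), and the toy 4 x 4 patch are all assumptions made for this example.

```python
def region_sum(image, top, left, height, width):
    """Sum the pixels of a rectangle by visiting every pixel in it."""
    return sum(image[r][c]
               for r in range(top, top + height)
               for c in range(left, left + width))


def two_rect_feature(image, top, left, height, width):
    """Two-rectangle feature: black band on top, white band underneath.

    Returns sum(black area) - sum(white area), matching the equation above.
    """
    half = height // 2
    black = region_sum(image, top, left, half, width)
    white = region_sum(image, top + half, left, half, width)
    return black - white


# Hypothetical patch: a dark band (eyes) above a bright band (cheeks).
patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
print(two_rect_feature(patch, 0, 0, 4, 4))  # -1520
```

The strongly negative value reflects a dark region sitting above a bright one. The cost of this direct version grows with the area of the rectangles, which is the motivation for the integral image.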

2. Integral Images 
A summed-area table is a data structure and algorithm for quickly and efficiently computing the sum of values in a rectangular subset of a grid. In the image processing domain it is also known as an integral image. The value of the integral image at any point (x, y) is the sum of all the pixels above and to the left of (x, y), inclusive, in the original image:
ii(x, y) = Σ i(x', y'), summed over all x' ≤ x and y' ≤ y
With this table, the sum over any rectangle can be obtained from just four lookups, regardless of the rectangle's size.
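A minimal pure-Python sketch of the integral image follows. The function names are illustrative, and the table is padded with an extra row and column of zeros (an implementation choice made here, not part of the definition) so that rectangle lookups need no boundary checks.

```python
def integral_image(image):
    """Build a summed-area table with one extra zero row and column.

    ii[r + 1][c + 1] holds the sum of all pixels with row <= r and column <= c.
    """
    rows, cols = len(image), len(image[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current row up to column c
        for c in range(cols):
            row_sum += image[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of a rectangle using four table lookups, whatever its size."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
print(ii[3][3])                  # 45, the sum of the whole image
print(rect_sum(ii, 1, 1, 2, 2))  # 28 = 5 + 6 + 8 + 9
```

A Haar feature is then just the difference of two or more such rectangle sums, so its cost is a handful of lookups and additions.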
3. AdaBoost training
As mentioned, the Haar cascade is a classification learning process that requires a set of positive and negative images for training. The set of Haar features used by the classifier is selected with AdaBoost. Each individual feature acts as a simple classifier (sometimes called a weak learner), and AdaBoost is used to improve on its performance. AdaBoost is a boosting algorithm from the machine learning domain; boosting refers to the family of algorithms that convert weak classifiers into strong classifiers.

A very simple way to explain boosting: suppose we need to classify whether a person is a man or a woman on the basis of height. We could guess that people taller than 5'8'' are men and the rest are women. This guess will often be wrong, but it will still be right more often than not. This is a weak classifier, and AdaBoost focuses on turning weak classifiers into strong classifiers.

Boosting works by first learning a single simple classifier (for example, "height above 5'8''") and then reweighting the data so that the examples it misclassified receive higher weights. A second simple classifier is then learned on the reweighted data, the data is reweighted again based on the combination of the first and second classifiers, and so on until the final classifier is learned. The final classifier is therefore a weighted combination of all n previous classifiers.
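The reweighting loop can be sketched with threshold classifiers on height, in the spirit of the example above. Everything here is an assumption made for illustration: the toy heights (in inches) and labels are invented, the labels use +1 and -1, and the function names are hypothetical. It is a compact sketch of discrete AdaBoost with threshold "stumps", not the exact training procedure used for Haar cascades.

```python
import math


def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            error = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Train `rounds` stumps, reweighting the data after each one."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)  # vote of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight, correct ones lose weight.
        updated = []
        for x, y, w in zip(xs, ys, weights):
            pred = polarity if x >= threshold else -polarity
            updated.append(w * math.exp(-alpha * y * pred))
        total = sum(updated)
        weights = [w / total for w in updated]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * (pol if x >= thr else -pol)
                for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Invented toy data: heights in inches, +1 and -1 are the two classes.
heights = [62, 64, 66, 67, 69, 70, 72, 74]
labels = [-1, -1, -1, 1, -1, 1, 1, 1]

single = adaboost(heights, labels, rounds=1)
boosted = adaboost(heights, labels, rounds=3)
print(sum(predict(single, x) == y for x, y in zip(heights, labels)))   # 7
print(sum(predict(boosted, x) == y for x, y in zip(heights, labels)))  # 8
```

On this toy data no single height threshold separates the classes, so one stump gets 7 of 8 examples right, while the weighted vote of three stumps classifies all 8 correctly.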

4. Cascade Classifiers:
In practice, no single classifier is strong enough on its own. Instead, a series of weak classifiers is trained and arranged into a cascade. The simpler classifiers come earlier in the cascade and reject the majority of regions or sub-windows that are unlikely to contain a face, while retaining the regions that have a greater chance of containing one. The next set of classifiers then runs only on the remaining regions, which need more complex analysis, and this is where the later stages of the cascade prove useful.
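The early-rejection idea can be sketched in a few lines. The stage names, feature names, thresholds, and window scores below are all hypothetical placeholders; a real cascade would compute Haar features from the image at each stage rather than read precomputed scores.

```python
def run_cascade(window, stages):
    """Return True only if every stage accepts the window."""
    for stage in stages:
        if not stage(window):
            return False  # rejected early; later, costlier stages never run
    return True


def make_stage(name, feature, threshold, calls):
    """Build a stage that thresholds one score and counts how often it runs."""
    def stage(window):
        calls[name] = calls.get(name, 0) + 1
        return window[feature] >= threshold
    return stage


calls = {}
stages = [
    make_stage("cheap", "contrast", 0.2, calls),
    make_stage("medium", "eye_band", 0.5, calls),
    make_stage("costly", "full_score", 0.8, calls),
]

# Hypothetical sub-windows with made-up scores for each stage.
windows = [
    {"contrast": 0.1, "eye_band": 0.0, "full_score": 0.0},  # fails stage 1
    {"contrast": 0.6, "eye_band": 0.3, "full_score": 0.0},  # fails stage 2
    {"contrast": 0.9, "eye_band": 0.7, "full_score": 0.9},  # passes all
]

results = [run_cascade(w, stages) for w in windows]
print(results)  # [False, False, True]
print(calls)    # {'cheap': 3, 'medium': 2, 'costly': 1}
```

The call counts show the point of the design: the cheap stage sees every window, while the costly stage runs only on the one window that survived the earlier stages.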


There are various other approaches, such as LBP (local binary patterns), that are used for face detection and work in a similar way but are much faster, and are therefore used on development boards such as the Raspberry Pi and BeagleBone. But as computing power and our understanding of neural networks and deep neural networks have increased, our approach to this kind of computer vision problem has also changed. Next we will talk in much more detail about the deep learning approach to the classical face detection problem.

  
