Monday, April 30, 2012

Control Point Matching: Different Approaches & Results

Proposed Algorithm 1
Euclidean Distance:

  1. Each emotion has an average and a standard deviation


We decided to take the set of feature points for a person's average face.  A person's average face is the pointwise average of all of their facial expressions (neutral, angry, disgust, fear, happiness, sadness, and surprise).  Next, we calculated the distance between a single subject's neutral expression and their average expression: the average distance was 22,955.  Then we calculated the distance between that subject's neutral expression and a random subject's neutral expression: the average distance was 23,426.
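The computation above can be sketched as follows (a sketch, not our actual code: it assumes each face is stored as an ordered list of (x, y, z) feature points, and takes the "distance" between two faces to be the sum of pointwise Euclidean distances):

```python
import math

def average_face(expressions):
    """Pointwise average of a subject's expression point sets.
    expressions: list of point sets, each a list of (x, y, z) tuples,
    all with the same number of points in the same order."""
    n = len(expressions)
    return [
        tuple(sum(pts[i][d] for pts in expressions) / n for d in range(3))
        for i in range(len(expressions[0]))
    ]

def face_distance(a, b):
    """Sum of pointwise Euclidean distances between two point sets."""
    return sum(math.dist(p, q) for p, q in zip(a, b))
```

With this, the neutral-vs-average comparison is just `face_distance(neutral_points, average_face(all_expressions))`.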

To home in on accuracy, we next tried neutral vs. neutral.
Obviously, for the same subject the distance was 0.
For one subject vs. a random subject, the distance was 5,553.

It's starting to look like facial expression plays a larger role in face recognition than we thought.

Happy VS Happy:

  • sameSubject VS sameSubject: 0
  • subject1 VS randomSubject: 17,680.857
  • mean across 7 trials was 17,680.857
  • std deviation across 7 trials was 787.105

From here on, the Euclidean distance and standard deviation are roughly the same for every facial expression except neutral.

Angry VS Angry:

  • subject1 VS randomSubject: 17,993

Disgust VS Disgust:

  • subject1 VS randomSubject: 18,418

Fear VS Fear:

  • subject1 VS randomSubject: 18,673

Sadness VS Sadness:

  • subject1 VS randomSubject: 18,816

Surprise VS Surprise:

  • subject1 VS randomSubject: 18,096

Neutral VS Neutral (only one image per subject):

  • subject1 VS randomSubject: 5,979
  • mean across trials: 5,655.85; standard deviation: 179.556

(Maybe also compare some results comparing DIFFERENT emotions for SAME subjects)


(Maybe also compare some results comparing DIFFERENT emotions for DIFFERENT subjects)


Proposed Algorithm 2
Standard Deviation:

  1. Each control point has an average and a standard deviation


We also looked at combining the average of each control point with its standard deviation to create lower and upper bounds, [mean-(k)stdD, mean+(k)stdD], where k is a parameter we can adjust to make the system more or less forgiving.
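A minimal sketch of the bound check, assuming each control point's mean and stdD are stored as (x, y, z) triples (the helper names here are made up for illustration):

```python
def bounds(mean, std, k):
    """Per-axis interval [mean - k*std, mean + k*std] for one control point."""
    return [(m - k * s, m + k * s) for m, s in zip(mean, std)]

def in_bounds(point, mean, std, k):
    """True if every coordinate of the point falls inside its interval."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(point, bounds(mean, std, k)))
```

Raising k widens every interval, making the system more forgiving; lowering it tightens the match.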

For subject 1:
Control Point 1:

  • mean is [-1.629, -43.134, 49.602]
  • stdD is [0.478, 4.303, 2.264]

Control Point 2:

  • mean is [-30.102, -39.139, 42.339]
  • stdD is [0.533, 4.513, 2.056]

Control Point 3:

  • mean is [33.459, -42.589, 49.456]
  • stdD is [0.732, 4.601, 2.682]

Control Point 4:

  • mean is [61.855, -36.155, 42.426]
  • stdD is [0.487, 4.583, 2.818]

Control Point 5:

  • mean is [5.944, -77.130, 31.953]
  • stdD is [1.002, 4.270, 2.688]

Control Point 6:

  • mean is [30.396, -76.806, 31.949]
  • stdD is [0.242, 3.718, 3.092]

Control Point 7:

  • mean is [18.493, -85.007, 55.480]
  • stdD is [0.760, 4.756, 2.838]

Control Point 8:

  • mean is [18.619, -85.171, 37.275]
  • stdD is [0.455, 4.862, 2.828]


The idea now is to iterate over the control points of an incoming face and check whether each one falls in the range [mean-(k)stdD, mean+(k)stdD] for the corresponding control point of each subject.  To separate our training data from our testing data, we are thinking of using half of a subject's expressions to test and the other half to train.  Hopefully this will avoid the problem of splitting our data into two overly similar halves.
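The matching loop might look something like this sketch (the `subjects` layout and function name are assumptions for illustration, not our implementation):

```python
def match_subject(face_points, subjects, k=2.0):
    """Score each enrolled subject by how many control points of the
    incoming face fall inside [mean - k*std, mean + k*std].

    subjects: {name: [((mx, my, mz), (sx, sy, sz)), ...]}, one
    (mean, std) pair per control point, in the same order as face_points."""
    best, best_score = None, -1
    for name, stats in subjects.items():
        score = 0
        for pt, (mean, std) in zip(face_points, stats):
            if all(m - k * s <= v <= m + k * s
                   for v, m, s in zip(pt, mean, std)):
                score += 1
        if score > best_score:
            best, best_score = name, score
    return best, best_score
```

The score could later be thresholded to get the true/false authentication decision.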

In order to test, maybe we can compare all males vs. all females, expecting to see a lot of failures.


Originally we planned our training data vs. testing data split to be half the females and half the males vs. the other two halves.  As of right now, our training data vs. testing data is female vs. male.  There may be different advantages to each.  Could we use 13 images to train and 13 to test?










Wednesday, April 25, 2012

Week 4

Normalization of training data is complete.  Ready to move on to control point extraction.  We are using homogeneous coordinates and rotation matrices to manipulate our data.

Some difficulties included finding the nose tip; this was not as easy as our paper suggested it would be.  For now, we worked around the problem by rotating by a hard-coded amount in order to avoid locating the nose tip.  Scaling doesn't seem to be an issue at the moment.

Looking forward, we will need to implement orthogonal distance fitting (plane fitting) and min-max filtering for region selection.  We will need to divide the face into rectangular grids in order to perform the plane fitting.  Some methods we are looking at include the Gauss-Newton algorithm, steepest gradient descent, the Levenberg-Marquardt procedure, and the simplex method.
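As a rough sketch of what per-cell plane fitting could look like, here is ordinary least squares on z = a*x + b*y + c via the normal equations (simpler than the orthogonal-distance and iterative methods listed above; the helper names are ours for illustration):

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, 3):
            f = M[r][i] / M[i][i]
            for c in range(i, 4):
                M[r][c] -= f * M[i][c]
    x = [0.0] * 3
    for i in range(2, -1, -1):
        x[i] = (M[i][3] - sum(M[i][c] * x[c] for c in range(i + 1, 3))) / M[i][i]
    return x

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to one grid cell's points.
    Builds the 3x3 normal equations and returns (a, b, c)."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, y, z in points:
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                A[i][j] += row[i] * row[j]
            b[i] += row[i] * z
    return solve3(A, b)
```

Orthogonal distance fitting minimizes perpendicular distance to the plane instead of vertical (z) residuals, which is why the iterative methods above come into play.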





Monday, April 16, 2012

Progress 4/16/2012

Clear and Distinct Goal

Output a true/false determination that a given image is personX.
Training data contains 100 faces (56 female, 44 male, 25 facial expressions each), so we aren't afraid of overfitting.

Data Acquisition

 The cheapest, cleanest option seems to be a Kinect.  This is an end-of-quarter goal, but knowing what we know now, we have to keep in mind how our current algorithm seems to become more and more file-format dependent.

Progress

Implemented pre-processing as described in:
Automatic 3D Facial Feature Extraction Algorithm

  • Translation
  • Rotation about the Y and X axes (roll & pitch, no yaw)

The paper suggests using homogeneous coordinates and 4x4 rotation matrices.  We implemented it using simple trigonometric calculations.
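The trigonometric version amounts to applying the standard rotation matrices about Y and X componentwise; a sketch (function names are ours for illustration):

```python
import math

def rotate_y(p, theta):
    """Rotate point (x, y, z) about the y-axis by theta radians (yaw in
    some conventions; here it levels the face left-right)."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * z, y, -s * x + c * z)

def rotate_x(p, theta):
    """Rotate point (x, y, z) about the x-axis by theta radians (pitch)."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (x, c * y - s * z, s * y + c * z)
```

Applying these to every point in the cloud is equivalent to multiplying by the 4x4 homogeneous matrices, without the extra coordinate.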


Hurdles

  • The VRML file format can be very restrictive.  It's not as easy as changing the 3D point cloud; we will also have to figure out a clever way to change the point cloud relationships in the VRML.
  • How to get from Kinect -> point cloud (far-off goal)
  • Why the C++?  Why not implement the algorithm entirely in MATLAB?



Monday, April 9, 2012

Week 1 Progress

Week 1 Goals

  • Getting OpenCV libraries compiling in VS 2010
  • Make more sense of the Binghamton database data and decide on which database to begin processing (Binghamton vs. UT Austin)
  • Begin to process the Binghamton database data
  • Start implementing the pre-processing step of the 3D Facial recognition algorithm

Week 1 Goals Accomplished

  • OpenCV compiling and running in VS 2010 (not using CMake)
  • Decided to begin processing on Binghamton database first
  • Able to read in and display BMP images using OpenCV, able to view VRML files with VRML viewers (for viewing point cloud data)

Road Blocks / Asides

  • Pre-processing step depends on reading in point cloud information which lies within VRML file. Realized we need to parse the VRML files (in progress)
  • Had some trouble viewing the VRML files
  • OpenCV and CMake

Week 2 Goals

  • Build VRML parser to extract point cloud data 
  • Begin the pre-processing step of our algorithm (normalize face to positive z-axis)
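A first cut at the parser could simply pull the numbers out of the Coordinate node's point field with a regular expression (a sketch that assumes the 'point [ ... ]' layout used by VRML files; real files may need a more careful parse):

```python
import re

def parse_vrml_points(text):
    """Extract the 3D point cloud from a VRML Coordinate node.
    Assumes points appear as 'point [ x y z, x y z, ... ]'."""
    m = re.search(r'point\s*\[(.*?)\]', text, re.S)
    if not m:
        return []
    nums = [float(t)
            for t in re.findall(r'-?\d+\.?\d*(?:[eE][-+]?\d+)?', m.group(1))]
    # Group the flat number list into (x, y, z) triples.
    return [tuple(nums[i:i + 3]) for i in range(0, len(nums), 3)]
```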

Here is a screenshot of viewing a raw face in FreeWRL:

Thursday, April 5, 2012

Special Thanks

We gratefully acknowledge Dr. Lijun Yin of the Department of Computer Science at The State University of New York at Binghamton for the BU-3DFE dataset, and Prof. Al Bovik, Director of LIVE at The University of Texas at Austin, for use of the Texas 3-D Face Recognition Database. These datasets allow us to begin implementing the 3D facial recognition algorithm immediately! Thank you again!

Wednesday, April 4, 2012

Title & Abstract


3D Facial Recognition: A Security Authentication Application

     The application of computer vision has been researched thoroughly over the past decade.  Facial recognition has been the focus of image-based security aimed at identifying a criminal in a crowd.  However, these same techniques can be applied to identify authorized users in a controlled environment, for example allowing only the CEO of a company into a private conference room.  Applying the techniques used in facial recognition could replace more invasive methods such as fingerprinting, RFID, magnetic bar codes, physical keys, etc.  Less intrusive methods such as 2D facial recognition have been thoroughly researched, but they can be easily subverted or compromised.  3D face recognition has shown promising results [3] that suggest accuracy higher than that of 2D facial recognition.
     The goal of our project is to implement a relatively straightforward 3D face recognition system, based on a proposed algorithm [3], that will accurately authenticate a single user.  We wish to explore the tradeoff between accuracy and speed, as well as different combinations of existing algorithms.

The following is a link to our proposal (best viewed with MS Word): http://dl.dropbox.com/u/40368164/daniel%26daniel_CSE155_proposal.docxm

The Skinny








Monday, April 2, 2012

The joys of C++

So the first thing we did was this:
- set up VMware Fusion on our personal machines
- configured Visual Studio 2010 on our VMs
- deleted our training data
- freaked out
- discussed different image formats for facial recognition and settled on PNG for lossless quality
- began loading images using MSDN C++ examples, then decided to use OpenCV
- installed OpenCV