Wednesday, June 6, 2012

End of the Road

Hello all, this is the final blog post for the project.
Thank you for following this blog.

The Best Results First

Here are the extracted control points on top of the manually selected ones.
Yellow - my extraction approach
Light blue - manually selected points
  • You might notice a fundamental difference in the control points: I am using the mouth corners and no nose base.

After ignoring the mouth corners and nose base, I am able to achieve the following performance:
Subject M1, 4th surprise face, vs. 43 subjects (13 M, 30 F)


Just to reiterate, this is the performance I was achieving using the manually selected control points.
43 subjects (13 M, 30 F)

A problem emerges:
  • For some images, the 3D camera was able to capture some of the slope inside the nostril.  This causes the steepest edge detected to lie inside the nose rather than on the actual control point.

  


Other Trials

Here are some of the not-so-good results:
 


Here are some better results:




This is the accuracy of the system I was trying to achieve.  I hoped that by using 3-dimensional data I would get higher accuracy simply because I had higher resolution.  Although the paper I followed does claim this is true, I was unable to reproduce such strikingly high results using my more improvised approach.  I think that if I had been more sophisticated with the nose width points, my more creative eye width control points would have been just as accurate as needed.

Thank you.

Wednesday, May 30, 2012

Control Point Extraction

The Final Control Points:

Control Points were extracted in this order:

The order is important because the accuracy of points 3-5 is dependent on point 2.  All points are dependent on point 1.
  1. Nose Tip
  2. Nose Width Points (Left & Right)
  3. Inner Eye Corner (Left & Right)
  4. Outer Eye Corner (Left & Right)
  5. Mouth Corners (Left & Right)
The Control Points were extracted using different methods:

Nose Tip - The greatest z-axis value

Nose Width - A search area is defined where the Control Point is expected to be.  The Control Point is determined by finding the steepest curve.





InnerEye - A search area is defined directly above the Nose Width Point.  The point with the lowest Z-coordinate is determined to be the Control Point.

OuterEye - The distance between the two InnerEye points is added to each InnerEye point to determine the OuterEye point.

MouthCorner - Half the distance from the InnerEye to the OuterEye is added to the InnerEye.  A straight line is drawn downward in the y-direction to define a new search area.  Within this region, the point with the lowest z-value is estimated to be the final control point.
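For illustration, here is a minimal C++ sketch of two of these searches: the Nose Tip (greatest z-value) and the lowest-z search inside a rectangular window, as used for the InnerEye and MouthCorner steps. The Point3 struct is just a stand-in for whatever point type the real code uses, and the window bounds are assumed to be given by the earlier steps.

```cpp
#include <vector>
#include <cstddef>

// Hypothetical point type standing in for one entry of the point cloud.
struct Point3 { double x, y, z; };

// Nose Tip: the index of the point with the greatest z-value.
std::size_t findNoseTip(const std::vector<Point3>& cloud) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < cloud.size(); ++i)
        if (cloud[i].z > cloud[best].z) best = i;
    return best;
}

// Lowest-z point inside an axis-aligned (x, y) search window; this is the
// kind of search used for the InnerEye and MouthCorner points described above.
std::size_t findLowestZInRegion(const std::vector<Point3>& cloud,
                                double xMin, double xMax,
                                double yMin, double yMax) {
    std::size_t best = cloud.size();   // sentinel meaning "nothing found yet"
    for (std::size_t i = 0; i < cloud.size(); ++i) {
        const Point3& p = cloud[i];
        if (p.x < xMin || p.x > xMax || p.y < yMin || p.y > yMax) continue;
        if (best == cloud.size() || p.z < cloud[best].z) best = i;
    }
    return best;                        // == cloud.size() if the window was empty
}
```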



Wednesday, May 16, 2012

New Approaches to Normalization


The first thing we did was to change our normalization algorithm.  Last week, we translated, rotated, scaled, and sheared each face so as to align them all on top of one another.  This week, we undid the shearing.  Instead,
  • We computed the SVD of the affine matrix (A = U * S * V').
  • We found the estimated rigid rotation matrix R = U * V' (excluding S, which is the scaling component of the equation).
A. Subject: Male #1, Expression: Surprised, Sample #1 - BLACK
B. Subject: Male #1, Expression: Surprised, Sample #4 - BLUE
(B is transformed using R and translation to A)
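A minimal sketch of that SVD step using OpenCV; the function wrapper and the assumption that A is the 3x3 linear part of the affine alignment are illustrative, not a copy of our actual code.

```cpp
#include <opencv2/core/core.hpp>

// Given the 3x3 linear part A of the affine alignment, recover the nearest
// rigid rotation by dropping the scale/shear term:  A = U * S * V'  ->  R = U * V'
cv::Mat estimateRigidRotation(const cv::Mat& A) {
    cv::SVD svd(A);                  // svd.u, svd.w (singular values), svd.vt
    cv::Mat R = svd.u * svd.vt;      // discard S, keep only the rotation part
    // Note: if det(R) were -1 the result would be a reflection, not a rotation.
    return R;                        // apply R plus the translation to map B onto A
}
```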




Here are the results after running the matching against the new normalization method.
  • With 23 subjects or fewer, we achieved perfect sensitivity vs. specificity.
  • As we increased the number of subjects in the system, we observed more error until we reached a threshold at 43 subjects, after which we no longer observed a loss in precision as we added more subjects.
  • There are more False Negatives than False Positives.
Results of matching with the new normalization:
The data sets consist of 13 males, 20 females and 13 males, 30 females, respectively.


33 Subjects (13 M, 20 F)
43 Subjects (13 M, 30 F)


Wednesday, May 9, 2012

Alignment Issue Update

Last week, we presented some statistics showing how well fiducial points were matching between samples of the same subjects for the same facial expressions (4 samples only, using a simplified L.O.O.). However, after being informed that our fiducial points may not be aligned, we were advised to look into this more.

Update:
It turns out the fiducial points are not aligned. Here is an image showing 4 examples of the same subject, same expression, without alignment and with alignment of 8 fiducial points corresponding to the raw point clouds (using 3D affine transforms, mapping FP #1, FP #3, FP #4 -> FP #2):


Black: FP #2
Red: FP #1
Green: FP #3
Blue: FP #4
(8 Fiducial Points, Female subject: #1, Expression (Angry): #1, Binghamton Database)
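For reference, here is a hedged sketch of how such a 3D affine mapping between two sets of corresponding fiducial points can be estimated with OpenCV's estimateAffine3D; the point vectors are placeholders for our actual fiducial point containers, and our real alignment code may be organized differently.

```cpp
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>

// Estimate the 3x4 affine transform that maps the fiducial points of one
// sample (src, e.g. FP #1) onto another (dst, e.g. FP #2).
cv::Mat alignFiducials(const std::vector<cv::Point3f>& src,
                       const std::vector<cv::Point3f>& dst) {
    cv::Mat affine;                 // 3x4 [A | t]
    std::vector<uchar> inliers;
    cv::estimateAffine3D(src, dst, affine, inliers);
    return affine;
}
```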




Below is an image of the 2 sets of 83 control points corresponding to a preprocessed point-cloud face, before and after they were affine transformed (same subject, same expression, same number of samples):



Red: FP #2
Blue: FP #1
Purple: FP #3
Teal: FP #4




We also re-computed statistics on all the "aligned" points for 10 females and 10 males, where each subject has 4 samples of 7 expressions; however, we did not have enough time to conjure up presentable data.


Wednesday, May 2, 2012

Control Point Matching for a Single Subject, Single Expression

Subject 43

Happiness:
4th image of happiness compared against:

  1. happy images 1-3 of same subject
  2. happy images 1-3 of all other subjects
k=1:
subject 43 score 2
subject 10 score 3
subject 11 score 1
subject 12 score 2
subject 13 score 0
subject 14 score 10
subject 15 score 3
subject 16 score 8
subject 17 score 0
subject 18 score 0
subject 19 score 6
subject 20 score 1
subject 21 score 0
subject 22 score 2
subject 23 score 9
subject 24 score 4
subject 25 score 1
subject 26 score 2
subject 27 score 1
subject 28 score 1
subject 29 score 3
subject 30 score 0
subject 31 score 3
subject 02 score 4
subject 03 score 5
subject 34 score 1
subject 01 score 4
subject 06 score 9
subject 07 score 3
subject 04 score 1
subject 05 score 2
subject 40 score 0
subject 41 score 2
subject 08 score 6
subject 09 score 1
subject 44 score 9
subject 45 score 0
subject 46 score 3
subject 47 score 5
subject 48 score 0
subject 49 score 3
subject 50 score 4
subject 51 score 4
subject 52 score 1
subject 53 score 6
subject 54 score 7
subject 55 score 5
subject 56 score 1
subject 32 score 3
subject 33 score 2
subject 35 score 4
subject 36 score 6
subject 37 score 4
subject 38 score 2
subject 39 score 1
subject 42 score 3

k = 2:

subject 43 score 0
subject 10 score 1
subject 11 score 0
subject 12 score 0
subject 13 score 0
subject 14 score 8
subject 15 score 3
subject 16 score 3
subject 17 score 0
subject 18 score 0
subject 19 score 5
subject 20 score 0
subject 21 score 0
subject 22 score 1
subject 23 score 5
subject 24 score 1
subject 25 score 1
subject 26 score 2
subject 27 score 0
subject 28 score 1
subject 29 score 1
subject 30 score 0
subject 31 score 0
subject 02 score 2
subject 03 score 2
subject 34 score 0
subject 01 score 1
subject 06 score 4
subject 07 score 1
subject 04 score 1
subject 05 score 1
subject 40 score 0
subject 41 score 0
subject 08 score 3
subject 09 score 0
subject 44 score 7
subject 45 score 0
subject 46 score 0
subject 47 score 2
subject 48 score 0
subject 49 score 0
subject 50 score 2
subject 51 score 2
subject 52 score 0
subject 53 score 4
subject 54 score 5
subject 55 score 5
subject 56 score 0
subject 32 score 0
subject 33 score 2
subject 35 score 2
subject 36 score 5
subject 37 score 1
subject 38 score 1
subject 39 score 1
subject 42 score 1

k = 3:

subject 43 score 0
subject 10 score 0
subject 11 score 0
subject 12 score 0
subject 13 score 0
subject 14 score 6
subject 15 score 2
subject 16 score 2
subject 17 score 0
subject 18 score 0
subject 19 score 5
subject 20 score 0
subject 21 score 0
subject 22 score 0
subject 23 score 3
subject 24 score 0
subject 25 score 1
subject 26 score 2
subject 27 score 0
subject 28 score 1
subject 29 score 0
subject 30 score 0
subject 31 score 0
subject 02 score 2
subject 03 score 1
subject 34 score 0
subject 01 score 0
subject 06 score 1
subject 07 score 0
subject 04 score 0
subject 05 score 1
subject 40 score 0
subject 41 score 0
subject 08 score 2
subject 09 score 0
subject 44 score 4
subject 45 score 0
subject 46 score 0
subject 47 score 0
subject 48 score 0
subject 49 score 0
subject 50 score 2
subject 51 score 1
subject 52 score 0
subject 53 score 3
subject 54 score 3
subject 55 score 2
subject 56 score 0
subject 32 score 0
subject 33 score 2
subject 35 score 1
subject 36 score 2
subject 37 score 1
subject 38 score 1
subject 39 score 0
subject 42 score 1
-----------------------------------------------------

Sadness:
4th image of sadness compared against:

  1. sad images 1-3 of same subject
  2. sad images 1-3 of all other subjects

k = 1:

subject 43 score 3
subject 10 score 2
subject 11 score 2
subject 12 score 7
subject 13 score 1
subject 14 score 9
subject 15 score 2
subject 16 score 1
subject 17 score 0
subject 18 score 0
subject 19 score 0
subject 20 score 2
subject 21 score 1
subject 22 score 2
subject 23 score 11
subject 24 score 5
subject 25 score 2
subject 26 score 1
subject 27 score 3
subject 28 score 2
subject 29 score 1
subject 30 score 1
subject 31 score 5
subject 02 score 2
subject 03 score 5
subject 34 score 0
subject 01 score 0
subject 06 score 6
subject 07 score 3
subject 04 score 3
subject 05 score 1
subject 40 score 1
subject 41 score 5
subject 08 score 3
subject 09 score 4
subject 44 score 6
subject 45 score 6
subject 46 score 3
subject 47 score 5
subject 48 score 1
subject 49 score 3
subject 50 score 4
subject 51 score 3
subject 52 score 0
subject 53 score 12
subject 54 score 7
subject 55 score 0
subject 56 score 2
subject 32 score 2
subject 33 score 6
subject 35 score 2
subject 36 score 2
subject 37 score 1
subject 38 score 0
subject 39 score 2
subject 42 score 1

k = 2:


subject 43 score 2
subject 10 score 9
subject 11 score 7
subject 12 score 1
subject 13 score 5
subject 14 score 7
subject 15 score 1
subject 16 score 7
subject 17 score 1
subject 18 score 7
subject 19 score 6
subject 20 score 3
subject 21 score 5
subject 22 score 2
subject 23 score 7
subject 24 score 1
subject 25 score 2
subject 26 score 7
subject 27 score 5
subject 28 score 1
subject 29 score 16
subject 30 score 5
subject 31 score 1
subject 02 score 10
subject 03 score 7
subject 34 score 4
subject 01 score 3
subject 06 score 9
subject 07 score 4
subject 04 score 3
subject 05 score 4
subject 40 score 1
subject 41 score 1
subject 08 score 8
subject 09 score 5
subject 44 score 1
subject 45 score 5
subject 46 score 6
subject 47 score 12
subject 48 score 5
subject 49 score 4
subject 50 score 7
subject 51 score 7
subject 52 score 6
subject 53 score 4
subject 54 score 5
subject 55 score 9
subject 56 score 8
subject 32 score 11
subject 33 score 11
subject 35 score 8
subject 36 score 5
subject 37 score 5
subject 38 score 11
subject 39 score 2
subject 42 score 0



-----------------------------------------------------

All Emotions:
4th image of Avg Face compared against:

  1. avg images 1-3 of same subject
  2. avg images 1-3 of all other subjects


k = 20

subject 43 score 2
subject 10 score 9
subject 11 score 7
subject 12 score 1
subject 13 score 5
subject 14 score 7
subject 15 score 1
subject 16 score 6
subject 17 score 1
subject 18 score 7
subject 19 score 5
subject 20 score 4
subject 21 score 5
subject 22 score 0
subject 23 score 4
subject 24 score 4
subject 25 score 1
subject 26 score 3
subject 27 score 1
subject 28 score 0
subject 29 score 11
subject 30 score 5
subject 31 score 2
subject 02 score 6
subject 03 score 9
subject 34 score 5
subject 01 score 3
subject 06 score 9
subject 07 score 4
subject 04 score 4
subject 05 score 4
subject 40 score 0
subject 41 score 1
subject 08 score 7
subject 09 score 4
subject 44 score 0
subject 45 score 2
subject 46 score 5
subject 47 score 6
subject 48 score 4
subject 49 score 4
subject 50 score 6
subject 51 score 5
subject 52 score 6
subject 53 score 4
subject 54 score 7
subject 55 score 6
subject 56 score 6
subject 32 score 4
subject 33 score 6
subject 35 score 8
subject 36 score 4
subject 37 score 4
subject 38 score 9
subject 39 score 0
subject 42 score 0






Monday, April 30, 2012

Control Point Matching: Different Approaches & Results

Proposed Algorithm 1
Euclidean Distance:

  1. Each emotion has an average and standard deviation


We decided to take the set of feature points for a person's average face.  A person's average face is the pointwise average of all their facial expressions (neutral, angry, disgust, fear, happiness, sadness, and surprise).  Next, we calculated the distance between a single subject's neutral expression and their average face; the average distance is 22,955.  Then we calculated the distance between a single subject's neutral expression and a random subject's neutral expression; the average distance is 23,426.
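As a rough sketch of how these numbers can be computed, here the Point3/Face types are stand-ins for our control point storage, and the distance is taken as the sum of pointwise Euclidean distances, which is one plausible reading of the figures above.

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

struct Point3 { double x, y, z; };
typedef std::vector<Point3> Face;    // one set of control points

// Pointwise average of a subject's control points across all expressions.
Face averageFace(const std::vector<Face>& faces) {
    Face avg(faces.front().size());              // zero-initialized points
    for (std::size_t f = 0; f < faces.size(); ++f)
        for (std::size_t i = 0; i < avg.size(); ++i) {
            avg[i].x += faces[f][i].x;
            avg[i].y += faces[f][i].y;
            avg[i].z += faces[f][i].z;
        }
    for (std::size_t i = 0; i < avg.size(); ++i) {
        avg[i].x /= faces.size();
        avg[i].y /= faces.size();
        avg[i].z /= faces.size();
    }
    return avg;
}

// Sum of pointwise Euclidean distances between two control point sets.
double faceDistance(const Face& a, const Face& b) {
    double total = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double dx = a[i].x - b[i].x;
        const double dy = a[i].y - b[i].y;
        const double dz = a[i].z - b[i].z;
        total += std::sqrt(dx * dx + dy * dy + dz * dz);
    }
    return total;
}
```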

To home in on accuracy, we next tried Neutral vs. Neutral.
Obviously, for the same subject the distance was 0.
For one subject vs. a random subject, the distance was 5,553.

It's starting to look like facial expression plays a larger role in face recognition than we thought.

Happy VS Happy:

  • sameSubject VS sameSubject: 0
  • subject1 VS randomSubject: 17,680.857
  • mean across 7 trials was 17,680.857
  • std deviation across 7 trials was 787.105

From here on, Euclidean Distance & Standard Deviation are about the same for any facial expression except Neutral.

Angry VS Angry:

  • subject1 VS randomSubject: 17,993

Disgust VS Disgust:

  • subject1 VS randomSubject: 18,418

Fear VS Fear:

  • subject1 VS randomSubject: 18,673

Sadness VS Sadness:

  • subject1 VS randomSubject: 18,816

Surprise VS Surprise:

  • subject1 VS randomSubject: 18,096

Neutral VS Neutral (only one image per subject):

  • subject1 VS randomSubject: 5,979
  • Here, the mean is 5,655.85 and the StdD: 179.556

(Maybe also compare some results comparing DIFFERENT emotions for SAME subjects)


(Maybe also compare some results comparing DIFFERENT emotions for DIFFERENT subjects)


Proposed Algorithm 2
Standard Deviation:

  1. Each Control Point has an average and standard deviation


We also looked at combining the average of each control point with the standard deviation, and creating a lower and upper bound based on [mean-(k)stdD, mean+(k)stdD] where k is a variable for us to adjust in order to make a more or less forgiving system.

For subject 1:
Control Point 1:


  • mean is [-1.629351072, -43.134416, 49.60222000000001]
  • stdD is [ 0.4779777856, 4.3027568, 2.2643159999999978]

Control Point 2:

  • mean is [-30.102224, -39.13870000000001, 42.339132]
  • stdD is [0.5331751999999994, 4.513219999999998, 2.0564536]

Control Point 3:

  • mean is [33.45903199999999, -42.588708, 49.45645599999999]
  • stdD is [0.7318063999999985, 4.601218400000002, 2.6821088000000017]

Control Point 4:

  • mean is [61.85497199999999, -36.155116, 42.42556799999999]
  • stdD is [0.48691439999999775, 4.5829768, 2.818026400000002]

Control Point 5:

  • mean is [5.943795720000001, -77.130488, 31.952824000000007]
  • stdD is [1.002333744, 4.269662400000001, 2.6875951999999983]

Control Point 6:

  • mean is [30.396192000000006, -76.80582400000002, 31.9487]
  • stdD is [0.24202159999999892, 3.7177551999999965, 3.0918799999999997]

Control Point 7:

  • mean is [18.4926776, -85.00685599999998, 55.480132]
  • stdD is [0.7604155200000001, 4.755828800000003, 2.8384736000000004]

Control Point 8:

  • mean is [18.6188916, -85.170976, 37.27458399999998]
  • stdD is [0.45469831999999993, 4.861604800000001, 2.827723200000004]


The idea now is to iterate over the control points of an incoming face and check if each one falls in the range [mean ± (k)stdD] for the corresponding control point of each subject (a sketch follows below).  In order to separate our training data from our testing data, we are thinking of using half of a subject's emotions to test and the other half to train.  Hopefully this will avoid the problem of splitting our data into two similar halves.
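A minimal sketch of that bound check, written here as a count of how many control points land inside the band for a given subject; the SubjectModel type is a placeholder, and requiring all points in range instead of counting them is the obvious variant.

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

struct Point3 { double x, y, z; };

// Per-subject model: mean and standard deviation of each control point,
// computed from that subject's training expressions.
struct SubjectModel {
    std::vector<Point3> mean;
    std::vector<Point3> stdDev;
};

// Count how many control points of the incoming face fall inside
// [mean - k*stdDev, mean + k*stdDev] on all three coordinates.
int scoreAgainstSubject(const std::vector<Point3>& face,
                        const SubjectModel& model, double k) {
    int score = 0;
    for (std::size_t i = 0; i < face.size(); ++i) {
        const bool inX = std::fabs(face[i].x - model.mean[i].x) <= k * model.stdDev[i].x;
        const bool inY = std::fabs(face[i].y - model.mean[i].y) <= k * model.stdDev[i].y;
        const bool inZ = std::fabs(face[i].z - model.mean[i].z) <= k * model.stdDev[i].z;
        if (inX && inY && inZ) ++score;
    }
    return score;
}
```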

In order to test, maybe we can test all males vs. all females, expecting to see a lot of failures.


Originally, we planned our Training Data vs. Testing Data split to be half the females & half the males vs. the other two halves.  As of right now, our Training Data vs. Testing Data split is Female vs. Male.  There are probably different advantages to both.  Could we use 13 images to train and 13 to test?










Wednesday, April 25, 2012

Week 4

Normalization of the training data is complete.  Ready to move on to control point extraction.  We are using homogeneous coordinates and rotation matrices to manipulate our data.

Some difficulties included finding the nose tip.  This was not as easy as our paper suggested it would be.  We worked around this problem by rotating by a hard-coded amount for now in order to avoid locating the nose tip.  Scaling doesn't seem to be an issue at the moment.

Looking forward, we will need to implement orthogonal distance fitting (plane fitting) and min-max filtering for region selection.  We will need to divide the face into rectangular grids in order to perform plane fitting.  Some methods we are looking at include the Gauss-Newton algorithm, steepest gradient descent, the Levenberg-Marquardt procedure, and the simplex method.
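For the plane fitting, a standard way to do orthogonal distance fitting over one rectangular grid cell is an SVD of the mean-centered points, where the plane normal is the right singular vector with the smallest singular value. Here is a hedged OpenCV sketch of that closed-form version, not necessarily the iterative method we will end up using.

```cpp
#include <vector>
#include <cstddef>
#include <opencv2/core/core.hpp>

// Orthogonal (total least squares) plane fit for the points of one grid cell:
// the plane passes through the centroid, and its normal is the right singular
// vector associated with the smallest singular value.
void fitPlane(const std::vector<cv::Point3f>& pts,
              cv::Point3f& centroid, cv::Point3f& normal) {
    double cx = 0, cy = 0, cz = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        cx += pts[i].x; cy += pts[i].y; cz += pts[i].z;
    }
    const double n = static_cast<double>(pts.size());
    centroid = cv::Point3f(static_cast<float>(cx / n),
                           static_cast<float>(cy / n),
                           static_cast<float>(cz / n));

    // Stack the mean-centered points into an N x 3 matrix.
    cv::Mat M(static_cast<int>(pts.size()), 3, CV_32F);
    for (int i = 0; i < M.rows; ++i) {
        M.at<float>(i, 0) = pts[i].x - centroid.x;
        M.at<float>(i, 1) = pts[i].y - centroid.y;
        M.at<float>(i, 2) = pts[i].z - centroid.z;
    }

    // Last row of V' corresponds to the smallest singular value.
    cv::SVD svd(M);
    normal = cv::Point3f(svd.vt.at<float>(2, 0),
                         svd.vt.at<float>(2, 1),
                         svd.vt.at<float>(2, 2));
}
```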





Monday, April 16, 2012

Progress 4/16/2012

Clear and Distinct Goal

Output a true/false determination that a given image is personX.
Training data contains 100 faces (56 female, 44 male, 25 facial expressions each), so we aren't afraid of overfitting.

Data Acquisition

The cheapest, cleanest option seems to be a Kinect.  This is an end-of-the-quarter goal, but knowing what we know now, we have to keep in mind how our current algorithm seems to become more and more file-format dependent.

Progress

Implemented pre-processing as described in:
Automatic 3D Facial Feature Extraction Algorithm
- Translation
- Rotation about the Y and X axes (Roll & Pitch, no Yaw)
The paper suggests using homogeneous coordinates and 4x4 rotation matrices.  We implemented it using simple trigonometric calculations (see the sketch below).
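To show what we mean by simple trigonometric calculations, here is a sketch of rotating the cloud about the X and Y axes without building 4x4 matrices; the Point3 struct and the order of the two rotations are assumptions for illustration, not necessarily what our code does.

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

struct Point3 { double x, y, z; };

// Rotate every point about the X axis by angleX and then about the Y axis by
// angleY (both in radians), using plain trigonometry instead of matrices.
void rotateCloud(std::vector<Point3>& cloud, double angleX, double angleY) {
    const double cx = std::cos(angleX), sx = std::sin(angleX);
    const double cy = std::cos(angleY), sy = std::sin(angleY);
    for (std::size_t i = 0; i < cloud.size(); ++i) {
        Point3& p = cloud[i];
        // Rotation about X (acts on the y-z plane)
        double y = p.y * cx - p.z * sx;
        double z = p.y * sx + p.z * cx;
        p.y = y; p.z = z;
        // Rotation about Y (acts on the x-z plane)
        double x = p.x * cy + p.z * sy;
        z = -p.x * sy + p.z * cy;
        p.x = x; p.z = z;
    }
}
```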


Hurdles

  • The VRML file format can be very restrictive.  It's not as easy as changing the 3D point cloud.  We will also have to figure out a clever way to change the point cloud relationships in the VRML.
  • How to get from Kinect -> point cloud (far-off goal)
  • Why the C++?  Why not implement the algorithm entirely in MATLAB?



Monday, April 9, 2012

Week 1 Progress

Week 1 Goals

  • Getting OpenCV libraries compiling in VS 2010
  • Make more sense of the Binghamton database data and decide on which database to begin processing (Binghamton vs. UT Austin)
  • Begin to process the Binghamton database data
  • Start implementing the pre-processing step of the 3D Facial recognition algorithm

Week 1 Goals Accomplished

  • OpenCV compiling and running in VS 2010 (not using CMake)
  • Decided to begin processing on Binghamton database first
  • Able to read in and display BMP images using OpenCV, able to view VRML files with VRML viewers (for viewing point cloud data)

Road Blocks / Asides

  • The pre-processing step depends on reading in point cloud information, which lies within the VRML files.  Realized we need to parse the VRML files (in progress)
  • Had some trouble viewing the VRML files
  • OpenCV and CMake

Week 2 Goals

  • Build a VRML parser to extract point cloud data (a parsing sketch follows below)
  • Begin the pre-processing step of our algorithm (normalize each face to the positive z-axis)
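In VRML 97 files the raw point cloud sits inside a Coordinate node as point [ x y z, x y z, ... ], so a minimal parser sketch could look like the following; this assumes a single point block per file, which we still have to confirm against the Binghamton .wrl files.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

struct Point3 { double x, y, z; };

// Extract the first "point [ ... ]" block from a VRML file and parse the
// whitespace/comma-separated x y z triples inside it.
std::vector<Point3> loadVrmlPointCloud(const std::string& path) {
    std::ifstream in(path.c_str());
    std::stringstream buf;
    buf << in.rdbuf();
    std::string text = buf.str();

    std::vector<Point3> cloud;
    std::string::size_type start = text.find("point");
    if (start == std::string::npos) return cloud;
    start = text.find('[', start);
    std::string::size_type end = text.find(']', start);
    if (start == std::string::npos || end == std::string::npos) return cloud;

    // Commas separate the triples in VRML; treat them as whitespace.
    std::string body = text.substr(start + 1, end - start - 1);
    for (std::string::size_type i = 0; i < body.size(); ++i)
        if (body[i] == ',') body[i] = ' ';

    std::istringstream vals(body);
    Point3 p;
    while (vals >> p.x >> p.y >> p.z) cloud.push_back(p);
    return cloud;
}
```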

Here is a screenshot of viewing a raw face in FreeWRL:

Thursday, April 5, 2012

Special Thanks

We gratefully acknowledge Dr. Lijun Yin, Department of Computer Science, The State University of New York at Binghamton, for the BU-3DFE dataset, and Prof. Al Bovik, Director of LIVE at The University of Texas at Austin, for use of the Texas 3-D Face Recognition Database. These datasets allow us to begin the implementation of the 3D facial recognition algorithm immediately! Thank you again!

Wednesday, April 4, 2012

Title & Abstract


3D Facial Recognition: A Security Authentication Application

     The application of computer vision has been researched thoroughly over the past decade.  Facial recognition has been the focus of image-based security in order to identify a criminal in a crowd.  However, these same techniques can be applied to identify authorized users in a controlled environment, for example, allowing only the CEO of a company into a private conference room.  Applying the techniques used in facial recognition could replace more invasive methods such as fingerprinting, RFID, magnetic bar codes, physical keys, etc.  Less intrusive methods such as 2D facial recognition have been thoroughly researched, but can be easily subverted or compromised.  3D face recognition has shown promising results [3] that suggest accuracy higher than that of 2D facial recognition.
     The goal of our project is to implement a relatively straightforward 3D face recognition system based on a proposed algorithm [3] that will accurately authenticate a single user.  We wish to explore the tradeoff between accuracy and speed as well as different combinations of existing algorithms.

The following is a link to our proposal (best viewed with MS Word): http://dl.dropbox.com/u/40368164/daniel%26daniel_CSE155_proposal.docxm

The Skinny








Monday, April 2, 2012

The joys of C++

So the first thing we did was this:
- set up VMware Fusion on our personal machines
- configured Visual Studio 2010 on our VMs
- deleted our training data
- freaked out
- discussed different image formats for facial recognition & settled on PNG for lossless quality
- began loading images using MSDN C++, and decided to use OpenCV
- installed OpenCV