Jithin Pradeep | Cognitive Research Scientist | AI and Mixed Reality Enthusiast

Review Notes - DeepFace: Closing the Gap to Human-Level Performance in Face Verification

Introduction

Face Alignment / Frontalization

The idea is to remove pose variations within the input images/faces so that every face appears to look straight into the camera ("frontalized").

Alignment pipeline

2D alignment (see the sketch after this list)

3D alignment
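As a rough illustration of the 2D alignment step (my own sketch, not code from the paper), the snippet below estimates a 2D similarity transform (scale, rotation, translation) that maps detected fiducial points onto fixed template locations and warps the crop accordingly. The landmark detector and the template coordinates are assumed to come from elsewhere.

```python
import numpy as np
import cv2  # assumed available for the warp; any warping routine would do


def fit_similarity(src_pts, dst_pts):
    """Least-squares 2D similarity transform mapping src_pts -> dst_pts.

    The transform is [[a, -b, tx], [b, a, ty]], i.e. scale*rotation plus translation.
    src_pts, dst_pts: (N, 2) arrays of corresponding fiducial points.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src_pts, dst_pts):
        A.append([x, -y, 1.0, 0.0]); b.append(xp)
        A.append([y,  x, 0.0, 1.0]); b.append(yp)
    a, bb, tx, ty = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)[0]
    return np.array([[a, -bb, tx],
                     [bb,  a, ty]], dtype=np.float32)


def align_2d(image, fiducials, template, out_size=(152, 152)):
    """Warp `image` so its detected `fiducials` land on the fixed `template` points."""
    M = fit_similarity(fiducials, template)
    return cv2.warpAffine(image, M, out_size)
```

Note that this only handles in-plane similarity; the paper's full pipeline additionally fits a generic 3D face model and re-renders the face to undo out-of-plane rotation, which this sketch does not attempt.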

Deep CNN architecture

Architecture

The CNN receives the frontalized face images (152x152, RGB); a rough PyTorch sketch of the full stack follows the layer list below.

Convolution-pooling-convolution filtering
Locally-connected layers
Fully-connected layers
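For concreteness, here is a minimal PyTorch sketch (my own, not the authors' code) of a DeepFace-style network on 152x152 RGB crops: convolution, max-pooling, a second convolution, three locally-connected layers (implemented here with `unfold` plus per-location weights), then a 4096-d fully-connected layer and a softmax over the SFC identities. The layer sizes follow my reading of the paper and should be treated as approximate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocallyConnected2d(nn.Module):
    """Like a convolution, but with separate (untied) weights at every spatial location."""

    def __init__(self, in_ch, out_ch, in_size, kernel, stride=1):
        super().__init__()
        self.kernel, self.stride = kernel, stride
        h, w = in_size
        self.out_h = (h - kernel) // stride + 1
        self.out_w = (w - kernel) // stride + 1
        n_loc = self.out_h * self.out_w
        self.weight = nn.Parameter(
            torch.randn(n_loc, out_ch, in_ch * kernel * kernel) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_ch, self.out_h, self.out_w))

    def forward(self, x):
        # (B, C*k*k, L): one column of input values per output location
        cols = F.unfold(x, self.kernel, stride=self.stride)
        # per-location matrix multiply: out[b, o, l] = sum_c cols[b, c, l] * W[l, o, c]
        out = torch.einsum('bcl,loc->bol', cols, self.weight)
        out = out.reshape(x.size(0), -1, self.out_h, self.out_w)
        return out + self.bias


class DeepFaceSketch(nn.Module):
    """Rough DeepFace-style net for 152x152x3 frontalized crops (layer sizes approximate)."""

    def __init__(self, num_classes=4030):
        super().__init__()
        self.c1 = nn.Conv2d(3, 32, 11)                        # 152 -> 142
        self.m2 = nn.MaxPool2d(3, stride=2, ceil_mode=True)   # 142 -> 71
        self.c3 = nn.Conv2d(32, 16, 9)                        # 71 -> 63
        self.l4 = LocallyConnected2d(16, 16, (63, 63), 9)            # -> 55
        self.l5 = LocallyConnected2d(16, 16, (55, 55), 7, stride=2)  # -> 25
        self.l6 = LocallyConnected2d(16, 16, (25, 25), 5)            # -> 21
        self.f7 = nn.Linear(16 * 21 * 21, 4096)
        self.f8 = nn.Linear(4096, num_classes)

    def forward(self, x):
        x = F.relu(self.c1(x))
        x = self.m2(x)
        x = F.relu(self.c3(x))
        x = F.relu(self.l4(x))
        x = F.relu(self.l5(x))
        x = F.relu(self.l6(x))
        feat = F.relu(self.f7(x.flatten(1)))  # F7 activations serve as the face representation
        return self.f8(feat)                  # identity logits for the softmax classifier
```

The untied weights of the locally-connected layers are what push the parameter count past 100 million, since every output location carries its own filter bank.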
Training

o The network receives face images and is trained on SFC as a multi-class (identity) classification problem, using a GPU-based engine that implements standard back-propagation on feed-forward nets via stochastic gradient descent (SGD).

o The net includes more than 120 million parameters and took three days to train for roughly 15 epochs.
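A minimal sketch of that training setup, assuming the `DeepFaceSketch` model from the architecture block above and a `DataLoader` yielding (aligned face, identity label) batches: plain cross-entropy over the SFC identities, optimized with SGD plus momentum. The learning rate and momentum shown are illustrative defaults, not necessarily the paper's schedule.

```python
import torch
import torch.nn as nn


def train_sfc(model, train_loader, epochs=15, device='cuda'):
    model.to(device).train()
    criterion = nn.CrossEntropyLoss()                 # multi-class softmax loss over identities
    optimizer = torch.optim.SGD(model.parameters(),   # SGD with momentum; values are illustrative
                                lr=0.01, momentum=0.9)
    for epoch in range(epochs):
        for faces, labels in train_loader:
            faces, labels = faces.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(faces), labels)    # forward pass + cross-entropy
            loss.backward()                           # standard back-propagation
            optimizer.step()
```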

Face verification metrics

Results

The network was trained on the Social Face Classification (SFC) dataset, which appears to be a Facebook-internal dataset of 4.4 million faces of roughly 4,000 identities (800 to 1,200 face images each); the most recent 5% of each identity's images are held out for testing.
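A sketch of how such a split could be reproduced on one's own labelled face collection (the record fields here are hypothetical, not from the paper): sort each identity's images by timestamp and hold out the most recent 5% for testing.

```python
from collections import defaultdict


def split_most_recent(records, test_fraction=0.05):
    """records: list of dicts with hypothetical keys 'identity', 'timestamp', 'path'.

    Returns (train, test) lists where, for every identity, the most recent
    `test_fraction` of its images go to the test set.
    """
    by_identity = defaultdict(list)
    for rec in records:
        by_identity[rec['identity']].append(rec)

    train, test = [], []
    for recs in by_identity.values():
        recs.sort(key=lambda r: r['timestamp'])              # oldest first
        n_test = max(1, int(round(len(recs) * test_fraction)))
        train.extend(recs[:-n_test])
        test.extend(recs[-n_test:])
    return train, test
```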

Reference

[1] Taigman, Y., Yang, M., Ranzato, M. A., & Wolf, L. (2014). DeepFace: Closing the gap to human-level performance in face verification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1701-1708).

All results and images are taken directly from the reference paper [1] to aid understanding.