To integrate facial recognition into a new or existing application, you need:
- A video camera
- A powerful server to store data
- Detection, comparison and recognition algorithms
- Trained neural networks with access to images
In the age of smartphones, where nearly everyone carries a high-quality HD camera, businesses need not worry about how users will access a facial recognition app. Video cameras for business use, such as in restaurants, convention centres and research facilities, are also very affordable. Cloud computing has made powerful servers that can store, process and serve high data volumes both accessible and economical.
It is the last two components, the algorithms and the trained neural networks, that need the most work. Here is how these algorithms identify faces.
The first step is detecting the faces present in the input, which may be images or video. The system may need to detect one face or several. Face detection is a form of object detection: the system identifies an object as a face and demarcates it, that is, localizes its extent with a bounding box. Face detection is the most critical step because a face that is not detected can never be identified.
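The output of this step can be pictured as a list of boxes, one per face. The following is a minimal sketch in Python, assuming some detector has already returned candidate boxes with confidence scores. The `Detection` type, the `keep_faces` and `crop` helpers, and the 0.5 cutoff are hypothetical names and values chosen for illustration, not part of any particular library.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One candidate face: top-left corner, box size, detector confidence."""
    x: int
    y: int
    width: int
    height: int
    score: float


def keep_faces(detections, min_score=0.5):
    """Keep only candidates the detector is reasonably sure about.

    The 0.5 cutoff is an assumption made for this sketch.
    """
    return [d for d in detections if d.score >= min_score]


def crop(image, det):
    """Cut the boxed region out of an image stored as a list of pixel rows."""
    rows = image[det.y:det.y + det.height]
    return [row[det.x:det.x + det.width] for row in rows]


# Toy 4x6 grayscale "image" and two made-up candidates.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
candidates = [Detection(1, 1, 3, 2, 0.92), Detection(0, 0, 2, 2, 0.20)]

faces = keep_faces(candidates)                  # the low-score box is dropped
face_pixels = [crop(image, d) for d in faces]   # one cropped region per face
```

Each cropped region is what the later steps work on; anything the detector misses never reaches them.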
Data normalization / alignment
The faces detected in the previous step often need to be normalized so that they are consistent with the images in the database. Detected faces are not always front-facing: they may be side profiles, looking in different directions, or shot under poor lighting. The system should be able to identify a person's face even when pose, illumination and expression differ.
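Two of the simpler normalizations can be sketched as follows: levelling a tilted face using the eye positions, and rescaling brightness. This is an illustrative sketch only. It assumes eye landmarks are already available from some landmark detector, the function names are hypothetical, and it does not address side profiles or expression changes.

```python
import math


def roll_angle(left_eye, right_eye):
    """Angle in degrees by which the eye line is tilted from horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_deg):
    """Rotate a point about center by -angle_deg, undoing the measured tilt."""
    theta = math.radians(-angle_deg)
    px, py = point[0] - center[0], point[1] - center[1]
    return (center[0] + px * math.cos(theta) - py * math.sin(theta),
            center[1] + px * math.sin(theta) + py * math.cos(theta))


def normalize_brightness(pixels):
    """Rescale grayscale values to the range 0 to 1 (min-max normalization)."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:                      # flat image: nothing to stretch
        return [0.0 for _ in pixels]
    return [(p - lo) / (hi - lo) for p in pixels]


# Hypothetical eye landmarks for a tilted face.
left_eye, right_eye = (30.0, 40.0), (70.0, 80.0)
angle = roll_angle(left_eye, right_eye)
center = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)

# After rotation both eyes sit on the same horizontal line.
aligned_left = rotate_point(left_eye, center, angle)
aligned_right = rotate_point(right_eye, center, angle)
```

In a real pipeline the same rotation would be applied to every pixel of the cropped face, not just to the landmarks.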
The next step in facial recognition is to extract features that can be used to identify the face. Convolutional neural networks and autoencoders are commonly used here. Each database has a predefined set of features that must be extracted from every detected face so that it can be identified successfully.
In the 1970s, Goldstein, Harmon and Lesk refined manual facial recognition by using 21 facial markers, such as lip thickness, hairline and hair colour, to identify faces automatically. Modern algorithms instead describe each face with a vector of 64 or 128 numbers, called an embedding. This is also the step where alternate faces for the same collection of features can be generated for future reference.
This is the step where the actual recognition happens, by comparing the extracted features with those stored in the database. In practice there is never a 100% match, so each system has to define its own threshold above which a face is considered recognized. Usually this is 80%. If the match is 80% or more, the system returns an "identified" status. For anything below this, the system returns an "unidentified" status.
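The comparison can be sketched in a few lines of Python. The text does not say how the match percentage is computed, so this sketch assumes cosine similarity between embeddings as the score. The `identify` helper, the names in the database, and the 4-number embeddings are made up for illustration; real embeddings would have 64 or 128 values.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two non-zero vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(probe, database, threshold=0.80):
    """Return (name, score) for the best match, or ("unidentified", score)
    when even the best score falls below the threshold."""
    best_name, best_score = "unidentified", -1.0
    for name, stored in database.items():
        score = cosine_similarity(probe, stored)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return "unidentified", best_score


# Made-up embeddings for two enrolled people.
database = {
    "alice": [0.9, 0.1, 0.3, 0.2],
    "bob": [0.1, 0.8, 0.2, 0.6],
}

known = identify([0.85, 0.15, 0.35, 0.2], database)    # close to "alice"
stranger = identify([0.5, 0.5, -0.6, 0.1], database)   # close to nobody
```

Here `known` comes back as a match for "alice", while `stranger` scores well under 0.80 against both entries and is reported as "unidentified".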
Applications can raise or lower this threshold depending on their requirements. Settings that need a very high level of security, such as military installations, sensitive research facilities and financial transactions, may raise it. For most other purposes the 80% threshold works well. Lowering the threshold means more faces are accepted as matches, which increases the risk that an unauthorised person is wrongly recognized as an authorised one.
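The effect of moving the threshold can be made concrete with a small experiment. The sketch below counts how often genuine users are rejected and how often impostors are accepted at three thresholds. All scores are invented for illustration and `error_rates` is a hypothetical helper, so the numbers show only the direction of the trade-off, not real system performance.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_reject_rate, false_accept_rate) at a given threshold."""
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    return (false_rejects / len(genuine_scores),
            false_accepts / len(impostor_scores))


# Made-up match scores: same-person comparisons and different-person comparisons.
genuine = [0.95, 0.91, 0.88, 0.84, 0.79]
impostor = [0.82, 0.74, 0.61, 0.55, 0.40]

for threshold in (0.70, 0.80, 0.90):
    frr, far = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold:.2f}  "
          f"false rejects={frr:.0%}  false accepts={far:.0%}")
```

With these sample scores, raising the threshold removes false accepts but rejects more genuine users, and lowering it does the opposite, which is why high-security settings lean towards a higher value.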