A month ago, Apple announced FaceID with the new iPhone X. FaceID lets you authenticate yourself using facial recognition. Other smartphones have had similar features for a while, but the iPhone X seems to bring new capabilities for robust face tracking in apps.
We all know that Snapchat started the hype around face masks, but face tracking can be used far beyond that. Imagine user experiences that adapt to your facial features, or the ability to track a user's emotion while they look at certain content in an app. This could help you understand whether users are enjoying the content presented to them.
How it works
We've worked with face recognition frameworks that mainly use computer vision to detect the user's face. But to achieve something more powerful, like live selfie effects or detecting facial expressions to drive a 3D character, we need something more.
Our earlier experiment was built using the iOS Core Image face detector, which recognises faces and extracts some of their features (mouth, left eye, right eye). The experiment is available here, and it puts into practice realtime effects on a person's face using the front camera of an iPhone or iPad. With the Vision framework (iOS 11 only) it's possible to detect more features, such as the face contour, nose crest, lips, outer lips, left eyebrow, and right eyebrow, but knowing only the position of those features is still limiting for robust face tracking in apps.
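To make the Core Image part concrete, here is a minimal sketch of that kind of detector. `CIDetector` and `CIFaceFeature` are the actual Core Image APIs; the wrapper function name is ours:

```swift
import CoreImage

// Detect faces in a CIImage and log the landmarks Core Image exposes
// (eye and mouth positions). CIDetectorTypeFace is the face detector;
// accuracy is traded against speed via the options dictionary.
func detectFaces(in image: CIImage) -> [CIFaceFeature] {
    let detector = CIDetector(ofType: CIDetectorTypeFace,
                              context: nil,
                              options: [CIDetectorAccuracy: CIDetectorAccuracyHigh])
    let faces = detector?.features(in: image) as? [CIFaceFeature] ?? []
    for face in faces {
        print("Face at \(face.bounds)")
        if face.hasLeftEyePosition  { print("  left eye:  \(face.leftEyePosition)") }
        if face.hasRightEyePosition { print("  right eye: \(face.rightEyePosition)") }
        if face.hasMouthPosition    { print("  mouth:     \(face.mouthPosition)") }
    }
    return faces
}
```

For video, you would run this on each captured frame from the front camera, which is exactly where the positions-only limitation shows up: you know *where* the mouth is, but not its 3D shape.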
We need something similar to what Microsoft has done for years with the Kinect's depth camera: a camera that can accurately map the geometry of your face and track its muscle movements. That's basically what the TrueDepth front-facing camera of the iPhone X does.
Imagine an app that plays DIY videos and you want to understand whether users are getting frustrated; that could be a sign that things aren't going well 😅. Or even Netflix, which could track how people are reacting to newly shipped content.
The way this could work is: once a face is detected (usually the closest face in the camera's view), we detect the user's facial expressions, then analyse them to predict emotion. By knowing how the user's face moves, such as raising an eyebrow, rolling their eyes, or smiling, it's possible to figure out how the user feels using machine learning.
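As a toy illustration of that last step, here's a hypothetical sketch that maps expression coefficients (in the 0 to 1 range that ARKit's blend shapes use) to a coarse emotion label. The coefficient names, thresholds, and labels are all illustrative assumptions; a real app would feed these values into a classifier trained on labelled expressions rather than hand-written rules:

```swift
// Hypothetical rule-based stand-in for an emotion model. Input is a
// dictionary of expression coefficients (0 = neutral, 1 = fully
// expressed), similar in shape to ARKit blend shapes.
func estimateEmotion(from coefficients: [String: Float]) -> String {
    let smile    = max(coefficients["mouthSmileLeft"] ?? 0,
                       coefficients["mouthSmileRight"] ?? 0)
    let browDown = max(coefficients["browDownLeft"] ?? 0,
                       coefficients["browDownRight"] ?? 0)
    let browUp   = coefficients["browInnerUp"] ?? 0
    let jawOpen  = coefficients["jawOpen"] ?? 0

    if smile > 0.5 { return "happy" }                       // a clear smile
    if browDown > 0.5 { return "frustrated" }               // furrowed brow
    if browUp > 0.5 && jawOpen > 0.4 { return "surprised" } // raised brow, open jaw
    return "neutral"
}

// Example: a strong smile coefficient reads as "happy".
// estimateEmotion(from: ["mouthSmileLeft": 0.8])  // "happy"
```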
Playing a first-person multiplayer shooter with each gamer's face emulated on their virtual character sounds like a lot of fun.
There are frameworks (e.g. ARKit) that can track the face mesh and anchors in realtime, 60 times per second, making it possible to animate or rig a 3D character so it directly mirrors the user's facial movements. The face topology is a detailed 3D mesh sized to respect real-world dimensions. A face anchor provides the face pose in world coordinates, along with a dictionary of named coefficients representing the pose of specific features: eyelids, eyebrows, jaw, nose, and so on.
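A minimal ARKit face-tracking sketch looks like this, assuming an `ARSCNView` wired up in a storyboard. `ARFaceTrackingConfiguration`, `ARFaceAnchor`, and its `blendShapes` dictionary are the real ARKit APIs; the view controller itself is just scaffolding:

```swift
import UIKit
import ARKit

// Runs the TrueDepth camera and receives a face anchor on every
// tracked frame (up to 60 fps on iPhone X).
class FaceTrackingViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // Face tracking requires the TrueDepth camera.
        guard ARFaceTrackingConfiguration.isSupported else { return }
        sceneView.delegate = self
        sceneView.session.run(ARFaceTrackingConfiguration())
    }

    // Called whenever the tracked face updates.
    func renderer(_ renderer: SCNSceneRenderer,
                  didUpdate node: SCNNode,
                  for anchor: ARAnchor) {
        guard let faceAnchor = anchor as? ARFaceAnchor else { return }
        // faceAnchor.geometry is the 3D face mesh at real-world scale;
        // blendShapes holds 0...1 coefficients for named features such
        // as jawOpen or eyeBlinkLeft, ready to drive a character rig.
        let jawOpen = faceAnchor.blendShapes[.jawOpen]?.floatValue ?? 0
        print("jawOpen: \(jawOpen)")
    }
}
```

The blend-shape coefficients are exactly the "dictionary of named coefficients" mentioned above, so mirroring a user's expression on a 3D character is mostly a matter of forwarding them to the model's morph targets.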
We believe that face tracking will become prominent in modern apps. There are plenty of use cases to explore; we just need to find the right ones.
If you have a project where we can help you with AR, don't hesitate to contact us.