To run the challenge, we introduce the NoW dataset. It contains 2054 2D images of 100 subjects, captured with an iPhone X, together with a separate 3D head scan for each subject. The head scan serves as ground truth for the evaluation. Subjects were selected to cover variations in age, BMI, and sex (55 female, 45 male).
We categorize the captured data into four challenges: neutral (620 images), expression (675 images), occlusion (528 images), and selfie (231 images). The neutral, expression, and occlusion categories contain neutral, expressive, and partially occluded face images of all subjects in multiple views, ranging from frontal to profile. The expression category covers different acted facial expressions such as happiness, sadness, surprise, disgust, and fear. The occlusion category contains images with varying occlusions, e.g. from glasses, sunglasses, facial hair, hats, or hoods. For the selfie category, participants are asked to take selfies with the iPhone, without constraints on the performed facial expression. Images are captured both indoors and outdoors to provide variation in natural and artificial lighting. We provide crop information for the face region on the Downloads page.
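As a quick sanity check, the per-category image counts listed above can be tallied to confirm they account for the full dataset (the dictionary below simply restates the numbers from the text):

```python
# Per-category image counts of the NoW dataset, as reported above.
category_counts = {
    "neutral": 620,
    "expression": 675,
    "occlusion": 528,
    "selfie": 231,
}

# Summing the four categories reproduces the stated dataset size.
total_images = sum(category_counts.values())
print(total_images)  # 2054
```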
For each subject, we capture a raw head scan in neutral expression with an active stereo system (3dMD LLC, Atlanta). The multi-camera system consists of six gray-scale stereo camera pairs, six color cameras, five speckle-pattern projectors, and six white LED panels. The reconstructed 3D geometry contains about 120K vertices per subject. Each subject wears a hair cap during scanning to avoid occlusions and scanner noise in the face or neck region caused by hair.
The challenge, for all categories, is to reconstruct a neutral 3D face given a single monocular image. Note that many images show facial expressions, so methods must disentangle identity from expression in order for the quality of the predicted identity to be evaluated.
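The section above does not specify how a predicted neutral face is compared against the ground-truth scan. A common protocol for such benchmarks is to rigidly align the prediction to the scan and then measure per-point scan-to-mesh distances. The sketch below is a simplified point-to-point stand-in for that idea, assuming the geometry is already aligned and using synthetic point sets in place of a real predicted mesh and head scan; the function name and data are illustrative, not part of the official evaluation:

```python
import numpy as np
from scipy.spatial import cKDTree

def reconstruction_error(pred_vertices, scan_points):
    """Distance from each ground-truth scan point to the nearest
    predicted vertex (a simplified point-to-point proxy for a
    scan-to-mesh distance; assumes the meshes are pre-aligned)."""
    tree = cKDTree(pred_vertices)
    dists, _ = tree.query(scan_points)
    return dists

# Toy example: a synthetic "predicted mesh" and a slightly
# perturbed copy standing in for the ground-truth head scan.
rng = np.random.default_rng(0)
pred = rng.normal(size=(1000, 3))
scan = pred + rng.normal(scale=0.01, size=pred.shape)

err = reconstruction_error(pred, scan)
print(err.mean(), np.median(err))  # small errors for near-identical shapes
```

In a real evaluation one would load the predicted mesh and the raw scan, rigidly align them first (e.g. via corresponding landmarks), and report summary statistics such as mean and median error over all scan points.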