Following on from my last post where I introduce Project Oxford I’ve done a bit more work to take the project that was built and make it more visual. To summarise, Project Oxford is a set of APIs that build on top of Azure ML to provide Face, Speech, Computer Vision and Language Understanding Intelligence Service (LUIS). There was a good video from Build 2015 that I watched to provide an overview of each of the APIs.
I used the tutorials to build an application that would identify a number of people from a known list in a photograph and highlight the ones that were unknown. The Face API requires people to be trained with a set of photos first, before identification can be made. This was done by using the code in the samples. I created a folder for each person that I wanted to be trained and added different photos of each person with and without hats, and sunglasses and also with different expressions. Then each set of folders was passed to the training API. Once trained you can then use the rest of the Face API to firstly identify faces in a picture and then take each face that is found and see if they are known.
One useful tip I’ve found is to have Fiddler running whilst you are debugging as it is far easier to see any errors in the body of the response message than in the exceptions that are thrown. Details of the errors can be seen in the Face API documentation.
The process for training is as follows (Note the terminology is based around the SDK methods, but I’ve linked to the API page as this gives details about the errors etc):
- Create a Person Group
- Create a Face list for each person using Face Detect
- Create a Person one for each person you want to identify with the person group id and face list
- Train the Person Group
Note: The training does not last forever and you will need to redo it periodically. If you try and detect a person when training has expired then you will get an error response saying that the person group is unknown.
To Identify each individual in a photograph:
- Stream the photograph into Detect. This will return a list of faces with face ids
- Iterate around each Face and call Identify
- Use the Identify Results to extract the names by calling Get Person.
This is where I got to with the previous post, but this wasn’t very visual and as I was working with photographs I thought it would be useful to use the data returned to draw a box around the faces that were identified and add the name of the person underneath. This was also useful to know which person was identified incorrectly. On the project Oxford web site there was the following image
I wanted to emulate this and also to take it one step further. The data returned from the face detection API provides details about gender, age, the area (face rectangle) in the picture where the face was found, face landmarks, and head pose. What the detection API did not do was to tie the name of the person to the face. We do already have this information as it was returned from the Identify API and Get Person. The attribute that links them is the face id. Using the results of the Identify API I called get person for each face identified to return the person’s name and stored this in a Dictionary along with the face ID. This then allowed me to load the original photograph into memory draw the rectangles for each face and add the text below each using the face id to extract the rectangle and match the name from the Dictionary, This could then be scaled shown in the app.