Machine learning is a field of computer science that aims to give computers human-like abilities by training them on tasks such as computer vision, natural language processing, and pattern recognition. 2016 was a year of great progress in this field, with machines matching or outperforming humans in tasks such as image recognition, self-driving, algorithmic stock trading and medical diagnosis. Fire up your favorite browser, type "machine learning" into the search box, and you will notice that the world is blazing hot with this exciting field.

The number of “machine learning” searches has skyrocketed in recent years, as Google Trends shows us:

Source: Google Trends

In this tutorial I aim to quench at least part of your machine learning thirst; why else would you be here? The tutorial is about training a computer to recognize not just your face but also the emotion expressed on it. How cool is that? Imagine walking into your home after a long day of work and your computer immediately knowing what kind of therapeutic music to play based on how you are feeling, or your car's onboard computer assessing your ability to drive based on your emotional state. Enough of this introductory gospel, let's get into it!

To get the most out of this tutorial you need the following:

  • An understanding of Python basics
  • A working installation of OpenCV (see here); you can verify the installation with the snippet after this list
  • A facial emotions dataset, i.e. a dataset of facial images expressing different emotions
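
If you want to verify the OpenCV installation, a quick sanity check (a minimal sketch; it only assumes the cv2 Python bindings are importable) is:

import cv2

# Print the version of the installed OpenCV bindings; if the import fails,
# revisit the installation step above.
print("OpenCV version: %s" % cv2.__version__)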

OpenCV is a versatile computer vision library with a wide variety of applications, such as object detection in still images and video, and facial recognition. Here we apply OpenCV's face recognition capabilities by training it to recognize individual emotions from images, using OpenCV's built-in face recognizer classes. The library itself is primarily written in C++, and the recognizer we use implements an algorithm known as Fisherfaces. Even though OpenCV is written in C++, this tutorial uses its Python bindings.

Dataset download and processing

Dataset download link.

The dataset comes in two folders: one containing the images and one containing the emotion labels encoded in text files. The emotions are numerically encoded as: 0=neutral, 1=anger, 2=contempt, 3=disgust, 4=fear, 5=happy, 6=sadness, 7=surprise. Extract the dataset and have a look at a few images and their respective emotion labels (the snippet below shows how to read one).
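
To see what a label looks like, you can read one of the text files and map its numeric code back to an emotion name. This is a minimal sketch; the path below is only an example, so substitute one of the label files from your extracted dataset:

emotions_list = ["neutral", "anger", "contempt", "disgust", "fear", "happy", "sadness", "surprise"]

# Example path to one of the emotion label files (adjust to a file that exists in your extract)
label_file = "S005_001_00000011_emotion.txt"

with open(label_file, "r") as f:
    # Each label file contains a single floating point number, e.g. "3.0000000e+00"
    code = int(float(f.readline()))

print("Encoded emotion: %d (%s)" % (code, emotions_list[code]))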

Next we need to clean up our data and organize it in a structured manner suited to our emotion processing scripts. Create a folder called master; this folder will house all your scripts and data. Inside it create two subfolders: emotions and images.

Extract all the folders from your dataset that contain the emotion text files into the emotions folder. These folders are labeled S005, S010, etc. Finally, extract all the image-containing folders into the images folder.

Next we will sort our images based on some selected emotions, which we will work with throughout this tutorial.

Create a folder called selected_set inside the master folder. Inside selected_set create a subfolder for each of the emotions mentioned above, e.g. disgust, neutral, surprise, etc. The readme file in the dataset you downloaded states that archetypical emotions are contained only in a subset of the emotion sequences. Each image sequence starts with a neutral face and ends with the actual emotion that the sequence represents. In this tutorial we will only use two images per sequence: the first image, which is neutral, and the last image, which shows the actual emotion. This is easily done with a short helper script. Inside the master folder create a Python script named img_seq.py. You can type or copy-paste the following code into it, but typing is recommended for learning the code structure.

import glob as gb
from shutil import copyfile

emotions_list = ["neutral", "anger", "contempt", "disgust", "fear", "happy", "sadness", "surprise"] 
emotions_folders = gb.glob("emotions\\*") #Returns a list of all folders with participant numbers

def imageWithEmotionExtraction():
    for x in emotions_folders:
        participant = "%s" %x[-4:] #store current participant number
        for sessions in gb.glob("%s\\*" %x): 
            for files in gb.glob("%s\\*" %sessions):
                current_session = files[20:-30] #extract the session id from the file path
                with open(files, 'r') as f:
                    emotion = int(float(f.readline())) #the label file holds a single float code
                #get path for last image in sequence, which contains the emotion
                sourcefile_emotion = gb.glob("images\\%s\\%s\\*" %(participant, current_session))[-1] 
                #do same for neutral image
                sourcefile_neutral = gb.glob("images\\%s\\%s\\*" %(participant, current_session))[0] 
                #Generate path to put neutral image
                dest_neut = "selected_set\\neutral\\%s" %sourcefile_neutral[25:] 
                #Do same for emotion containing image
                dest_emot = "selected_set\\%s\\%s" %(emotions_list[emotion], sourcefile_emotion[25:]) 
                
                copyfile(sourcefile_neutral, dest_neut) #Copy file
                copyfile(sourcefile_emotion, dest_emot) #Copy file
if __name__ == '__main__':
    imageWithEmotionExtraction()
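
Note that the glob patterns above use Windows-style backslashes, as does the rest of this tutorial. If you are on Linux or macOS, one portable alternative is shown in this minimal sketch using os.path.join (the folder names are the ones created earlier):

import os
import glob as gb

# os.path.join picks the correct path separator for the current operating system
emotions_folders = gb.glob(os.path.join("emotions", "*"))
print("Found %d participant folders" % len(emotions_folders))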

The next step in our data organization is extracting the facial region from each image. At the end of this tutorial we will have an emotion classifier, and for it to give real-world results it's important that all images are the same size and contain minimal background noise, i.e. each image should contain almost nothing but the face. Our task is therefore to locate the face in each image, convert it to grayscale, crop it and save it to the dataset. This is where the OpenCV magic starts! OpenCV ships with pre-trained HAAR cascade classifiers for face detection, so we just need to load them and let them find the faces for us. We will apply four of these classifiers in sequence so as to detect as many faces as possible.

You need the following four cascade files in your master directory. You can copy them from the OpenCV installation already on your machine (see the snippet after this list for one way to locate them programmatically):

  • haarcascade_frontalface_default.xml
  • haarcascade_frontalface_alt2.xml
  • haarcascade_frontalface_alt.xml
  • haarcascade_frontalface_alt_tree.xml
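
With recent opencv-python packages, the bundled cascade directory is exposed as cv2.data.haarcascades, so one way to copy the four files into your master folder is the sketch below (this assumes a pip-installed opencv-python that ships the cv2.data module; older installations may not have it, in which case search your OpenCV install directory for the XML files instead):

import os
import shutil
import cv2

cascade_names = [
    "haarcascade_frontalface_default.xml",
    "haarcascade_frontalface_alt2.xml",
    "haarcascade_frontalface_alt.xml",
    "haarcascade_frontalface_alt_tree.xml",
]

# cv2.data.haarcascades points at the cascade directory bundled with opencv-python
for name in cascade_names:
    shutil.copy(os.path.join(cv2.data.haarcascades, name), name)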

Create a script named face_detection.py and place the following code into it. The script will detect the face, convert the image to grayscale, crop it and finally save it to its respective folder. Create a folder called final_dataset, and inside it create subfolders for the respective emotions (neutral, anger, etc.). Then run your magic script face_detection.py.

import cv2
import glob as gb

face_detector1 = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
face_detector2 = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")
face_detector3 = cv2.CascadeClassifier("haarcascade_frontalface_alt.xml")
face_detector4 = cv2.CascadeClassifier("haarcascade_frontalface_alt_tree.xml")

emotion_list = ["neutral", "anger", "contempt", "disgust", "fear", "happy", "sadness", "surprise"] 

def faceDetection(emotion):
    files = gb.glob("selected_set\\%s\\*" %emotion) #Get list of all images with emotion

    filenumber = 0
    for f in files:
        frame = cv2.imread(f) #Open image
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) #Convert image to grayscale
        
        #Detect face using 4 different classifiers
        face1 = face_detector1.detectMultiScale(gray, scaleFactor=1.1,
            minNeighbors=10, minSize=(5, 5), flags=cv2.CASCADE_SCALE_IMAGE)
        face2 = face_detector2.detectMultiScale(gray, scaleFactor=1.1,
            minNeighbors=10, minSize=(5, 5), flags=cv2.CASCADE_SCALE_IMAGE)
        face3 = face_detector3.detectMultiScale(gray, scaleFactor=1.1,
            minNeighbors=10, minSize=(5, 5), flags=cv2.CASCADE_SCALE_IMAGE)
        face4 = face_detector4.detectMultiScale(gray, scaleFactor=1.1,
            minNeighbors=10, minSize=(5, 5), flags=cv2.CASCADE_SCALE_IMAGE)

        #Go over detected faces, stop at first detected face, return empty if no face.
        if len(face1) == 1:
            facefeatures = face1
        elif len(face2) == 1:
            facefeatures = face2
        elif len(face3) == 1:
            facefeatures = face3
        elif len(face4) == 1:
            facefeatures = face4
        else:
            facefeatures = [] #no face detected by any of the classifiers
        
        #Cut and save face
        for (x, y, w, h) in facefeatures: #get coordinates and size of rectangle containing face
            print "face found in file: %s" %f
            gray = gray[y:y+h, x:x+w] #Cut the frame to size
            
            try:
                out = cv2.resize(gray, (350, 350)) #Resize face so all images have same size
                cv2.imwrite("final_dataset\\%s\\%s.jpg" %(emotion, filenumber), out) #Write image
            except:
               pass #pass the file on error
        filenumber += 1 #Increment image number

if __name__ == '__main__':
    for emotion in emotion_list: 
        faceDetection(emotion) #Call our face detection module

If you look inside the neutral folder, it's evident that we have more than one image of the same person, since every sequence contributes a neutral frame. This can introduce bias into our classifier's accuracy, so we need to clean it up, for example with the sketch below.
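
One way to clean it up is to keep only one neutral image per participant. This is a minimal sketch, assuming the filenames copied into selected_set\neutral still identify the participant at the start (e.g. S005_...); if the copied filenames look different on your system, adjust how the participant ID is extracted. Run it before face_detection.py:

import os
import glob as gb

seen_participants = set()

for path in sorted(gb.glob("selected_set\\neutral\\*")):
    # Assume the participant ID is the part of the filename before the first underscore
    participant = os.path.basename(path).split("_")[0]
    if participant in seen_participants:
        os.remove(path) #drop extra neutral images of the same participant
    else:
        seen_participants.add(participant)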

Training and classification

Now our dataset is ready and we need to build a classifier for facial emotion recognition and train it on the data we have processed. It's important to note that this is a machine learning task, and in machine learning it's well-known gospel that for a model to be built and validated we need two distinct datasets: a training set and a validation set. So we split our data, using 67% as the training set and 33% as the validation set. We use the training set to train our classification model (yet to be built) and the validation set to assess how well the model generalizes its recognition ability to data it has not been trained on. Create a script named classifier.py and place the following code in it.

#The emotion classifier training script
#You can modify this script as you wish
import cv2
import glob as gb
import random
import numpy as np
#Emotion list
emojis = ["neutral", "anger", "contempt", "disgust", "fear", "happy", "sadness", "surprise"] 
 #Initialize fisher face classifier
fisher_face = cv2.createFisherFaceRecognizer()

data = {}
#Function definition to get file list, randomly shuffle it and split 67/33
def getFiles(emotion): 
    files = gb.glob("final_dataset\\%s\\*" %emotion)
    random.shuffle(files)
    training = files[:int(len(files)*0.67)] #get first 67% of file list
    prediction = files[-int(len(files)*0.33):] #get last 33% of file list
    return training, prediction

def makeTrainingAndValidationSet():
    training_data = []
    training_labels = []
    prediction_data = []
    prediction_labels = []
    for emotion in emojis:
        training, prediction = getFiles(emotion)
        #Append data to training and prediction list, and generate labels 0-7
        for item in training:
            image = cv2.imread(item) #open image
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #convert to grayscale
            training_data.append(gray) #append image array to training data list
            training_labels.append(emojis.index(emotion))
    
        for item in prediction: #repeat above process for prediction set
            image = cv2.imread(item)
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            prediction_data.append(gray)
            prediction_labels.append(emojis.index(emotion))

    return training_data, training_labels, prediction_data, prediction_labels

def runClassifier():
    training_data, training_labels, prediction_data, prediction_labels = makeTrainingAndValidationSet()
    
    print "training fisher face classifier using the training data"
    print "size of training set is:", len(training_labels), "images"
    fisher_face.train(training_data, np.asarray(training_labels))

    print "classification prediction"
    counter = 0
    right = 0
    wrong = 0
    for image in prediction_data:
        pred, conf = fisher_face.predict(image)
        if pred == prediction_labels[counter]:
            right += 1
            counter += 1
        else:
            wrong += 1
            counter += 1
    return (100.0 * right) / (right + wrong) #percentage of correct predictions

#Now run the classifier
metascore = []
for i in range(0,10):
    right = runClassifier()
    print "got", right, "percent right!"
    metascore.append(right)

print "\n\nend score:", np.mean(metascore), "percent right!"
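
If cv2.createFisherFaceRecognizer raises an AttributeError, you are most likely running a newer OpenCV build, where the face recognizers moved into the contrib "face" module. A hedged sketch of the alternative initialization (it requires the opencv-contrib-python package; pick the line matching your version):

import cv2

# OpenCV 3.x with contrib modules:
# fisher_face = cv2.face.createFisherFaceRecognizer()
# OpenCV 3.3+ and 4.x with contrib modules:
fisher_face = cv2.face.FisherFaceRecognizer_create()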

After running the script on my machine I got a classification accuracy of 67-69%, which is pretty good performance considering we are dealing with 8 emotion categories. But this can be improved a lot through optimization techniques.

Model optimization and enhancement

One of the greatest limitations on a machine learning model's performance is the lack of enough data to train on. Creating data is expensive, especially when it involves real-world tasks such as this one. In our emotion recognition quest we face the same devil of data limitation. If you closely examine the dataset, you will notice that we have only 18 examples for the "contempt" category, 25 for "fear" and 28 for "sadness". This is a likely reason why our model struggles to generalize to new data: it has been inadequately trained on those categories. You can check the counts yourself with the short snippet below.
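
A minimal sketch for counting the examples per category, assuming the final_dataset folder layout created earlier:

import glob as gb

emotions_list = ["neutral", "anger", "contempt", "disgust", "fear", "happy", "sadness", "surprise"]

# Count the cropped face images available for each emotion category
for emotion in emotions_list:
    count = len(gb.glob("final_dataset\\%s\\*" % emotion))
    print("%s: %d examples" % (emotion, count))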

So in our classifier.py code, change the emojis list to exclude these three categories:

emojis = ["neutral", "anger", "disgust", "happy", "surprise"]

Running the script again, I get a correct prediction rate of 79-83%. This is a huge boost from our earlier maximum of 69%, and it confirms how much of a limitation data shortage is for machine learning models.

The dataset we are using is highly optimized and standardized for this task, and you wouldn't be entirely wrong to guess that this makes things easy for our model's prediction capabilities.

You may be wondering: what if I feed the model one of my own emotional faces? How would it perform?

Well, you don't have to feed the model with your personal images, for security reasons! Ha! Of course you can use your personal images, but it would be a long, tedious task to capture images of your various emotional faces. Why don't we turn to our friend Google and do a batch download of images, clean them up, and we are good to go? There is plenty of online material on batch image downloading. Note: the most important things to check when dealing with external images are that there are no text overlays on the face, that the emotion represented by the image is recognizable, and that the face is directly facing the camera. Then repeat the face-cropping process with our face_detection.py script to generate standardized images ready to feed into our classifier model.

You can download another dataset here if you need something ready to use and are curious about one of the first works on facial expression recognition with machine learning.

Merge these images with our final dataset and run your model again with the full emotions list:

emojis = ["neutral", "anger", "contempt", "disgust", "fear", "happy", "sadness", "surprise"]

Play with the model and you should be able to get an accuracy above 60%, which is great performance considering we moved to noisier, real-world data.

Now it's time to apply the script to detecting your spouse's emotions and keeping them happy, always ;).


The code in this tutorial is licensed under the GNU GPL 3.0 open source license. Much of the code was inspired by van Gent, P. (2016). Emotion Recognition With Python, OpenCV and a Face Dataset. A tech blog about fun things with Python and embedded electronics. Retrieved from:
http://www.paulvangent.com/2016/04/01/emotion-recognition-with-python-opencv-and-a-face-dataset/


Image dataset by: Michael J. Lyons, Shigeru Akamatsu, Miyuki Kamachi, Jiro Gyoba.
Coding Facial Expressions with Gabor Wavelets, 3rd IEEE International Conference on Automatic Face and Gesture Recognition, pp. 200-205 (1998).


I also used the Cohn-Kanade database, get it here.

On the same topic, I found the book "OpenCV with Python Blueprints: Design and develop advanced computer vision projects using OpenCV with Python" by Michael Beyeler very good, in particular Chapter 7.
