OpenCV & ML (Deep Learning) 02 – Developing a simple Multi-layer perceptron (MLP)

Using a Multi-layer perceptron (MLP) in OpenCV to model a simple logical function. This example clarifies the basic steps required for implementing ANN_MLP. It also explains how to write a simple Makefile to compile the code.

Let’s roll up our sleeves and develop our first neural network (a multi-layer perceptron) with OpenCV 3. If you found this page via a search engine and you still have trouble installing OpenCV 3.3, you can refer to the previous tutorial on this subject.

We use artificial neural networks to teach machines real-world functions. By “real-world functions” I mean functions with too many parameters, or functions whose inner formula and parameters we do not understand, but for which we know some of the inputs and their corresponding outputs. Most computer vision problems fall into this category: when we look at an image we can describe exactly what we are seeing, but we are not aware of the exact procedure by which we do it, so all we have are inputs and outputs.
Inputs and outputs (as training samples or data sets) are exactly what we need to set up a working neural network. Since this is a tutorial, I am going to start with a simple known function in order to generate the inputs and outputs, then use some of them for training and keep the rest as a test set to see whether the model has learned the function. Test sets are used to find out whether our model is capable of predicting correct outputs for inputs it has never seen before. The following truth table is generated by the function y = \lnot(x_0 \oplus x_1) \land x_2, where \oplus stands for XOR, \lnot stands for NOT and \land stands for logical AND.

x_0 x_1 x_2 y
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 0
1 0 0 0
1 0 1 0
1 1 0 0
1 1 1 1

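Just to make the logic concrete, here is a tiny standalone C++ sketch (not part of the tutorial program) that generates the table above; the expression !(x0 ^ x1) && x2 encodes y = NOT(x0 XOR x1) AND x2.

#include <cstdio>

int main() {
    // Print the truth table of y = NOT(x0 XOR x1) AND x2
    printf("x0 x1 x2  y\n");
    for (int x0 = 0; x0 <= 1; ++x0)
        for (int x1 = 0; x1 <= 1; ++x1)
            for (int x2 = 0; x2 <= 1; ++x2)
                printf("%2d %2d %2d %2d\n", x0, x1, x2, (!(x0 ^ x1)) && x2);
    return 0;
}
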
The above table includes both the input and the output data. Each row is in fact a training sample for our network. Columns x_0, x_1 and x_2 are the input data and column y is the corresponding output. We will use 6 of the 8 samples to train the network and keep the rest for testing. In OpenCV we always use matrices to represent our data. In this example we use a matrix X with 3 columns and 8 rows to store the input data and a matrix Y with 1 column and 8 rows for the output data. Take a look at the following example and try to figure out how this code implements a multi-layer perceptron; I will discuss it afterwards.

#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;
using namespace cv::ml;

int main() {

    // y = NOT(x0 XOR x1) AND x2
    Mat in = (Mat_<float>(8, 3) <<
                   0.f, 0.f, 0.f,
                   0.f, 0.f, 1.f,
                   0.f, 1.f, 0.f,
                   0.f, 1.f, 1.f,
                   1.f, 0.f, 0.f,
                   1.f, 0.f, 1.f,
                   1.f, 1.f, 0.f,
                   1.f, 1.f, 1.f);

    Mat out = (Mat_<float>(8, 1) <<
                    0.f,
                    1.f,
                    0.f,
                    0.f,
                    0.f,
                    0.f,
                    0.f,
                    1.f);

    Ptr<ANN_MLP> net = ANN_MLP::create();

    // 3 is the hidden layer size
    Mat layerSizes = (Mat_<int>(3, 1) << in.cols, 3, out.cols);
    net->setLayerSizes(layerSizes);

    // Use the symmetric sigmoid (tanh-like) activation for all neurons
    net->setActivationFunction(ANN_MLP::ActivationFunctions::SIGMOID_SYM);

    // Stop training after 100000 iterations or when the error change
    // drops below 1e-15, whichever happens first
    TermCriteria termCrit = TermCriteria(
        TermCriteria::Type::COUNT + TermCriteria::Type::EPS,
        1e5, 1e-15);
    net->setTermCriteria(termCrit);

    // Train with classic error back-propagation
    net->setTrainMethod(ANN_MLP::TrainingMethods::BACKPROP);

    // Use only the first six samples for training; keep the last two for testing
    Ptr<TrainData> trainingData = TrainData::create(
        in.rowRange(0, 6),
        SampleTypes::ROW_SAMPLE,
        out.rowRange(0, 6)
    );

    net->train(trainingData);

    // Run all eight samples (training + test) through the trained network
    Mat result;
    net->predict(in, result);

    cout << result << endl;

    return 0;
}


Let’s run the above code first and get the results. The easiest way to build the code is to create a Makefile. Suppose you stored the above code in a file named “simplemlp.cpp”. In the same folder create a new file named Makefile (without an extension) with the following content:

CPP = g++
CXXFLAGS = -std=c++11 -O3
INC = -I/usr/local/opencv3.3-cpu/include
LIBS = -L/usr/local/opencv3.3-cpu/lib
LIBS += -lopencv_core -lopencv_ml

TARGETS = simplemlp

.DEFAULT: all

.PHONY: all debug clean

all: $(TARGETS)

debug: CXXFLAGS += -g
debug: all

simplemlp: simplemlp.cpp
	$(CPP) $(CXXFLAGS) $(INC) simplemlp.cpp -o simplemlp $(LIBS)

clean:
	rm -f $(TARGETS) *.o

In the above Makefile, look at the include and library directory paths provided on the 3rd and 4th lines. These are the paths we used as the installation target when configuring OpenCV with CMake in the previous tutorial. If you haven’t changed the installation directory, simply leave these lines as they are in your Makefile. Since this example only uses the machine learning and core functionality of OpenCV, we link against just those two libraries (the shared objects opencv_core and opencv_ml). Note that the source file must come before the library flags on the compile line, otherwise the linker may fail to resolve the OpenCV symbols. If you gave your cpp file a different name, remember to update it in the Makefile. After saving the Makefile, run the make command in the directory containing your Makefile and source code.

terminal

  • make

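If you prefer to skip make altogether, a single equivalent compiler invocation (assuming the same install prefix used above) would be:

terminal

  • g++ -std=c++11 -O3 -I/usr/local/opencv3.3-cpu/include simplemlp.cpp -o simplemlp -L/usr/local/opencv3.3-cpu/lib -lopencv_core -lopencv_ml
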
A binary file named “simplemlp” will be created. Ordinarily, to run a binary we only need to type ./simplemlp in the terminal, but since the built libraries are not in a system-wide or user environment path, we have to help Linux find the required shared objects to execute the file. Otherwise it will fail to run with a message saying it couldn’t find <somename>.so. Provide the library path to the Linux shell using the following format:

terminal

  • LD_LIBRARY_PATH=/usr/local/opencv3.3-cpu/lib ./simplemlp

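If you plan to run the program repeatedly, you can instead export the variable once per shell session (assuming a bash-like shell) and then run the binary normally:

terminal

  • export LD_LIBRARY_PATH=/usr/local/opencv3.3-cpu/lib
  • ./simplemlp
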
When you execute the program, the output should look something like this:

[2.774793e-07;
 0.99999988;
 -1.5345516e-07;
 5.4139864e-08;
 -1.950926e-07;
 6.7599757e-08;
 0.085034117;
 -0.15362477]

You might notice that the predicted values for the training samples are almost exactly correct, while the results for the test samples the network has never seen before (the last two rows) are not bad but not great either. This is because we do not have enough training samples to train our model; in later examples we will correct this issue.
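
If you would rather see clean 0/1 answers than raw activations, one optional post-processing sketch (not part of the original listing) is to threshold the prediction matrix, for example right after the predict() call:

    // Optional: map raw network outputs to binary labels at a 0.5 cut-off
    Mat labels;
    threshold(result, labels, 0.5, 1.0, THRESH_BINARY);
    cout << labels << endl;
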
Now it’s time to talk about the code before moving on to the next example. I am not going to describe the whole syntax, especially the parts related to the core module such as matrix initialization. I believe you can figure out the basic syntax, and if anything is unclear you can search for it or check the official documentation to learn more.
We store the input data in a matrix named ‘in’ with floating-point elements and dimensions 8×3, and the output data in a matrix named ‘out’ with dimensions 8×1. The elements must be 32-bit floats, not 64-bit doubles. We have 8 samples, and each sample consists of 3 inputs and 1 output, so these are the numbers of neurons in the input layer and output layer respectively.

Next, we create an object of class cv::ml::ANN_MLP, which represents an artificial neural network of the multi-layer perceptron type; the object is named ‘net’. Now we need to define the structure of our network: how many layers we want and how many neurons in each layer. This is done through a one-dimensional matrix, and it does not matter whether this matrix has one row or one column. OpenCV treats the number of elements as the number of layers and the value of each element as the number of neurons in that layer. The sizes of the first and last layers are fixed: they equal the number of columns in our input and output samples. The only thing left is to choose the number of neurons in the hidden layer(s). Since we do not have many samples for training, one hidden layer with three neurons is enough. After setting the layer sizes, our network has 3 input neurons, 3 hidden neurons and 1 output neuron.
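
Since OpenCV only looks at the total number of elements, the layer specification could just as well be written as a row vector; the following snippet (an alternative, not what the listing uses) is equivalent:

    // Equivalent layer specification as a 1x3 row vector:
    // 3 input neurons, 3 hidden neurons, 1 output neuron
    Mat layerSizes = (Mat_<int>(1, 3) << in.cols, 3, out.cols);
    net->setLayerSizes(layerSizes);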

We apply a symmetric sigmoid function to the weighted sum of inputs with setActivationFunction(). The activation function plays an important role in your model. The symmetric sigmoid (hyperbolic tangent) f(x)=\beta(1-e^{-\alpha x})/(1+e^{-\alpha x})=\beta\tanh(\alpha x/2) is different from the standard sigmoid (logistic function) f(x)=1/(1+e^{-\alpha x}): with OpenCV’s default parameters the symmetric sigmoid ranges over [-1.7159, 1.7159], while the logistic function ranges over [0, 1]. The symmetric sigmoid is by far the better choice for back-propagation optimization; read the “Efficient BackProp” paper by LeCun et al. for more information. The weights will be learned during the training phase. The two arguments param1 and param2 correspond to \alpha and \beta in the activation function. By skipping them we accept their default value of zero, in which case OpenCV substitutes the recommended parameters and the function becomes f(x)=1.7159\tanh(2x/3).
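
As a small illustration, the same call with the parameters written out explicitly looks like this; the two trailing arguments are \alpha and \beta, and passing zero means “use the recommended defaults”:

    // Equivalent to the call in the listing: 0, 0 selects the default
    // parameters, i.e. f(x) = 1.7159 * tanh(2x/3)
    net->setActivationFunction(ANN_MLP::SIGMOID_SYM, 0, 0);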

For all iterative algorithms in OpenCV we need to set a termination condition; setTermCriteria() is the method we use to declare the condition under which the algorithm should stop. The condition can be a maximum number of iterations, or a very small value (epsilon) giving the minimum improvement in error required to proceed to the next step: if the error change in a step becomes smaller than this value, the iteration is terminated. Here we use both conditions, meaning the iteration stops as soon as either of them occurs. setTrainMethod() sets the training method. The two methods available in this version of OpenCV are BACKPROP (the famous one) and RPROP (the efficient one). In most cases BACKPROP produces better results, while RPROP converges much faster.
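
For reference, the back-propagation trainer also exposes a learning rate and a momentum term that you can experiment with; the values below are only examples and are not used in the listing:

    // Optional BACKPROP hyper-parameters (example values)
    net->setBackpropWeightScale(0.1);    // learning rate
    net->setBackpropMomentumScale(0.1);  // momentum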

The cv::ml::TrainData::create() call creates the training data. As you can see, I used only the first six rows of samples to build the training data; rowRange(0, 6) includes row 0 and excludes row 6, and I keep the last two rows for testing. In the next tutorial we will use an API provided by OpenCV for separating training data from test data. The second argument indicates that our samples are stored one per row rather than one per column. The next step is to call the network’s train() method, which is the most time- and resource-consuming function of the program. In this phase the weights are learned by the network through an optimization process until one of the termination conditions is met.

We use the predict() method to see how our model performs. All of the samples are predicted, including those used for training and those that were not. As you can see in the results, the network has learned the training samples almost perfectly and does not perform badly on those we did not use for training (justification: the last sample should not be zero, and its prediction is further from zero than that of the other test sample, which should be zero). Now you can change the parameters yourself to see how they affect the results. In the next tutorial we will review another, more practical example.
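
As one more illustrative sketch (not in the original listing), you can also feed the trained network a single new sample, or save the model to disk and load it back later:

    // Predict a single training sample; (0, 0, 1) should come out close to 1
    Mat sample = (Mat_<float>(1, 3) << 0.f, 0.f, 1.f);
    Mat response;
    net->predict(sample, response);
    cout << response << endl;

    // Persist the trained weights and reload them
    net->save("simplemlp.yml");
    Ptr<ANN_MLP> loaded = Algorithm::load<ANN_MLP>("simplemlp.yml");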

Source code of this tutorial on GitHub

Cite this article as: Amir Mehrafsa, "OpenCV & ML (Deep Learning) 02 – Developing a simple Multi-layer perceptron (MLP)," in MEXUAZ, September 11, 2017, http://mexuaz.com/opencv-machine-learning-02/.
