Welcome to the first tutorial. In this tutorial, you will start with a fundamental task: building your dataset.

What is a dataset?

A dataset is a set of images that you give as input to your training algorithm. Each input image in your dataset is given a label (e.g. face, background, bicycle, ball, etc.). This helps the algorithm give a name to each type of entity that it learns.

Building your first dataset in EBLearn

To build your dataset:

  • First, download this set of images of handwritten digits:

mnist.zip

  • Extract the zip file to some folder, for example /home/rex/
  • You should now have two folders, /home/rex/mnist/train and /home/rex/mnist/test, each containing a set of subfolders named 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9.
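Before going further, it can help to confirm that the extraction produced the expected layout. The following is a minimal sketch, not part of EBLearn; the helper name missing_label_folders is hypothetical, and /home/rex/mnist is simply the example location used above.

```python
from pathlib import Path

# Labels expected by this tutorial: one subfolder per digit.
EXPECTED_LABELS = [str(d) for d in range(10)]


def missing_label_folders(split_dir, labels=EXPECTED_LABELS):
    """Return the label folders that are absent under split_dir."""
    root = Path(split_dir)
    return [name for name in labels if not (root / name).is_dir()]


if __name__ == "__main__":
    # /home/rex/mnist is the example extraction folder from this tutorial.
    for split in ("train", "test"):
        missing = missing_label_folders(Path("/home/rex/mnist") / split)
        print(split, "OK" if not missing else f"missing: {missing}")
```

An empty result for both train and test means every digit folder is in place.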

To build your dataset in EBLearn, we shall use a tool called dscompile. This tool compiles your dataset from images into a format understood by EBLearn. Your images have to be separated into folders, with each folder name being the label given to the images it contains.

For example, when you are building a digit recognition system, there are 10 digits (0, 1, 2, …, 9). Hence, you need a folder structure like the one shown in Figure 1. Another example would be a face detection system, where you have images of faces and background images in two separate folders.

Figure 1: An example of a folder structure that should be given as input to dscompile. Here, each digit is assigned a label corresponding to its value.
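The folder-name-as-label convention can be illustrated with a short sketch. This is not how dscompile is implemented (the tutorial does not describe its internals); the function name list_labeled_images is hypothetical and only shows how a label can be read off the parent folder of each image.

```python
from pathlib import Path


def list_labeled_images(dataset_root):
    """Return sorted (image_path, label) pairs; the label is the parent folder name."""
    pairs = []
    for class_dir in sorted(Path(dataset_root).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the label folders
        for image_path in sorted(class_dir.iterdir()):
            if image_path.is_file():
                pairs.append((image_path, class_dir.name))
    return pairs
```

For the digit example, every file under the folder 7 would be paired with the label "7".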

An important point to note is that our classifier has a fixed input size, which means that all input images must be the same size. dscompile takes care of this by automatically resizing all input images to a given dimension. Let us fix the size of all our images at 32x32x1. Once your dataset folder contains your images in the folder structure described above, call dscompile as follows:

dscompile <DATASET_ROOT> -outdir <OUTPUT_PATH> -dname <NAME> -dims [height]x[width]x[channels] \
-kernelsz [preprocessing kernel size] -channels [preprocessing-type]

In the above command, [channels] in the -dims argument refers to the number of color channels in your input images: use 3 for color images and 1 for black-and-white (grayscale) images.
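To make the -dims format concrete, here is a small sketch that splits a value such as 32x32x1 into height, width and channels. The function parse_dims is a hypothetical helper for illustration and is not part of EBLearn.

```python
def parse_dims(dims):
    """Split a 'HEIGHTxWIDTHxCHANNELS' string into three positive integers."""
    parts = dims.lower().split("x")
    if len(parts) != 3:
        raise ValueError(f"expected HEIGHTxWIDTHxCHANNELS, got {dims!r}")
    height, width, channels = (int(p) for p in parts)
    if min(height, width, channels) < 1:
        raise ValueError(f"all dimensions must be positive, got {dims!r}")
    return height, width, channels


print(parse_dims("32x32x1"))  # (32, 32, 1)
```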

In general, you can choose whichever preprocessing you find appropriate. Some of the available preprocessings are showcased in an image at the bottom of this tutorial. An example invocation:

dscompile /home/rex/mnist/train -outdir /home/rex/eb_dataset \
-dname mnist_train -dims 32x32x1 -kernelsz 7x7
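If you prefer to drive dscompile from a script, the command line can be assembled programmatically. The sketch below only builds and prints the argument list; it uses only the flags shown in this tutorial, and the function name dscompile_command is hypothetical.

```python
import shlex


def dscompile_command(dataset_root, outdir, name, dims, kernelsz=None):
    """Assemble a dscompile argument list using only the flags shown in this tutorial."""
    args = ["dscompile", dataset_root, "-outdir", outdir, "-dname", name, "-dims", dims]
    if kernelsz is not None:
        args += ["-kernelsz", kernelsz]
    return args


cmd = dscompile_command("/home/rex/mnist/train", "/home/rex/eb_dataset",
                        "mnist_train", "32x32x1", kernelsz="7x7")
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

The list could then be passed to subprocess.run(cmd, check=True), assuming dscompile is installed and on your PATH.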

This should produce three files, named

mnist_train_data.mat, mnist_train_labels.mat, mnist_train_classes.mat
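A quick way to confirm that all three files were written is sketched below. It assumes only the naming pattern listed above (<NAME>_data.mat, <NAME>_labels.mat, <NAME>_classes.mat); the helper names are hypothetical.

```python
from pathlib import Path

# Suffixes taken from the three file names listed in this tutorial.
SUFFIXES = ("data", "labels", "classes")


def expected_outputs(outdir, name):
    """Return the three file paths dscompile is expected to write for a dataset name."""
    return [Path(outdir) / f"{name}_{suffix}.mat" for suffix in SUFFIXES]


def missing_outputs(outdir, name):
    """Return the expected output files that do not exist yet."""
    return [p for p in expected_outputs(outdir, name) if not p.is_file()]
```

For the example above, missing_outputs("/home/rex/eb_dataset", "mnist_train") should return an empty list once compilation has succeeded.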

You can verify that your dataset was compiled correctly by using the command

dsdisplay <NAME>

In our example, with the dataset name prefixed by its output folder, this would be

dsdisplay /home/rex/eb_dataset/mnist_train

You should see your resized input images, looking something like this:

A screenshot of dsdisplay on an example dataset

To test our classifier's performance, we shall also build a test set, which will not be used for training. We can build our test set in a similar fashion by using the command

dscompile /home/rex/mnist/test -outdir /home/rex/eb_dataset \
-dname mnist_test -dims 32x32x1 -kernelsz 7x7
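A test set is only informative if its images are kept out of the training set. As an optional sanity check, the sketch below looks for byte-identical files shared between two folders. It is not an EBLearn tool, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map the SHA-256 digest of each file under root to its path."""
    digests = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digests[hashlib.sha256(path.read_bytes()).hexdigest()] = path
    return digests


def shared_files(train_root, test_root):
    """Return (train_path, test_path) pairs whose contents are byte-identical."""
    train = file_digests(train_root)
    test = file_digests(test_root)
    return [(train[d], test[d]) for d in sorted(train.keys() & test.keys())]
```

Calling shared_files("/home/rex/mnist/train", "/home/rex/mnist/test") and getting an empty list means no image file appears in both folders. Note that this only detects exact copies, not resized or re-encoded duplicates.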

Congratulations! You have finished compiling your training and test datasets into EBLearn's format. Now comes the exciting part, where you do some cool machine learning.

Next: Tutorial 2: Creating and training a simple network

Preprocessing Showcase

Different preprocessing parameters for roadsign examples
Image operations testing (from tester tool)
beginner_tutorial1_dscompile.txt · Last modified: 2012/10/05 23:11 by soumith