Datasets used for training are stored as single files containing all the data (single-file datasets are easier and safer to handle than scattered files). A dataset named 'mydata' is typically composed of the following files:
Depending on the data, some of these files may not be produced. These files are produced using the following tools:
Several calls to the dataset tools may be necessary, for example to create training, validation and testing sets with both positive and negative examples. To illustrate, such calls are grouped in the following script for face detection data extraction: dsprepare.py.
When using dscompile in 'regular' mode (the default), the object is assumed to fill the entire image. If objects of interest are located in a sub-region of the image, one can specify bounding boxes around them. For this, the 'pascal' mode (-type pascal) reads the XML annotation file associated with each image. The XML follows the format of the PASCAL Object Recognition challenge.
Here is an XML example for image 'image0.png', located in folder '/data/images/', of size 480×640 (height × width), cropped from the top-left corner at 10×10 to the bottom-right corner at 400×400, with 2 objects, 'person' and 'car'. Each object has a 'visible' region, as opposed to the 'bndbox' region, which contains the entire object, potentially including occluded parts:
<annotations>
  <folder>/data/images/</folder>
  <filename>image0.png</filename>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>3</depth>
  </size>
  <crop>
    <xmin>10</xmin>
    <ymin>10</ymin>
    <xmax>400</xmax>
    <ymax>400</ymax>
  </crop>
  <object>
    <name>person</name>
    <bndbox>
      <xmin>467.593577</xmin>
      <ymin>156.004541</ymin>
      <xmax>485.583166</xmax>
      <ymax>188.118130</ymax>
    </bndbox>
    <visible>
      <xmin>467.593577</xmin>
      <ymin>156.004541</ymin>
      <xmax>485.583166</xmax>
      <ymax>188.118130</ymax>
    </visible>
  </object>
  <object>
    <name>car</name>
    <bndbox>
      <xmin>566.166985</xmin>
      <ymin>157.585411</ymin>
      <xmax>608.988303</xmax>
      <ymax>208.727668</ymax>
    </bndbox>
    <visible>
      <xmin>566.166985</xmin>
      <ymin>157.585411</ymin>
      <xmax>608.988303</xmax>
      <ymax>208.727668</ymax>
    </visible>
  </object>
</annotations>
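To check an annotation file before passing it to dscompile, it can help to read it back and inspect the boxes. The following is a minimal Python sketch, not part of the dataset tools: the function names (parse_annotation, _box) and the returned dictionary layout are hypothetical, and it assumes the element names shown in the XML example above ('size', 'crop', 'object', 'bndbox', 'visible'), treating 'crop' and 'visible' as optional.

```python
import xml.etree.ElementTree as ET


def _box(elem):
    """Read the xmin/ymin/xmax/ymax children of elem as a tuple of floats."""
    return tuple(float(elem.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))


def parse_annotation(xml_text):
    """Parse one annotation document (hypothetical helper, assumed layout)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    crop = root.find("crop")  # assumed optional
    objects = []
    for obj in root.findall("object"):
        visible = obj.find("visible")  # assumed optional
        objects.append({
            "name": obj.findtext("name"),
            "bndbox": _box(obj.find("bndbox")),
            "visible": _box(visible) if visible is not None else None,
        })
    return {
        "folder": root.findtext("folder"),
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "crop": _box(crop) if crop is not None else None,
        "objects": objects,
    }


# Shortened sample in the same layout as the example above.
EXAMPLE = """
<annotations>
  <folder>/data/images/</folder>
  <filename>image0.png</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <crop><xmin>10</xmin><ymin>10</ymin><xmax>400</xmax><ymax>400</ymax></crop>
  <object>
    <name>person</name>
    <bndbox>
      <xmin>467.593577</xmin><ymin>156.004541</ymin>
      <xmax>485.583166</xmax><ymax>188.118130</ymax>
    </bndbox>
  </object>
</annotations>
"""

if __name__ == "__main__":
    annotation = parse_annotation(EXAMPLE)
    for obj in annotation["objects"]:
        print(obj["name"], obj["bndbox"], obj["visible"])
```

Coordinates are kept as floats because the example stores fractional box corners; rounding, if needed, is left to the caller.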