dscompile

by Pierre Sermanet (January 10th, 2011)



dscompile assembles preprocessed samples for training and testing purposes. It accepts a variety of input formats, from a simple directory structure to XML files describing object bounding boxes (e.g. PASCAL VOC format). See an example of sample extraction in this video.

Calls

  • ./dscompile <images root> [options]

Outputs

Output files (datasets/images) are saved by default in the calling directory or in the directory specified via the '-outdir' option. The outputs can be saved either as single-file datasets (default) or as individual files for each sample (set via the '-save' option).

When saving a dataset in single-file mode, the following files are created (with a dataset named 'ds'):

  • ds_data.mat: a matrix of dimensions (nsamples x nchannels x height x width), with precision determined by the '-precision' option (default is float).
  • ds_labels.mat: a matrix of dimensions (nsamples) containing the class index of each sample.
  • ds_classes.mat: a matrix of dimensions (nclasses x max class name length) containing the names of each class.
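
As a rough illustration of the dimensions listed above, the following sketch computes the expected matrix shapes for a dataset named 'ds'. It does not read or write any file. The helper name expected_shapes is hypothetical, and it assumes that the '-dims' default of 96x96x3 is ordered height x width x channels and that the class-name matrix is exactly as wide as the longest name; neither assumption is confirmed by this page.

```python
def expected_shapes(nsamples, class_names, dims=(96, 96, 3)):
    """Return the matrix dimensions described above for a dataset named 'ds'.

    Hypothetical helper for illustration only; it is not part of dscompile.
    """
    # Assumption: dims is ordered (height, width, channels).
    height, width, nchannels = dims
    # Assumption: the width of ds_classes is the longest class name length.
    max_name_len = max(len(name) for name in class_names)
    return {
        "ds_data.mat": (nsamples, nchannels, height, width),
        "ds_labels.mat": (nsamples,),
        "ds_classes.mat": (len(class_names), max_name_len),
    }


shapes = expected_shapes(1000, ["bg", "person"])
for filename, shape in shapes.items():
    print(filename, shape)
```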

Examples

  • Jitter (all kinds): to add jitter to a dataset, i.e. original images deformed with translation, scaling or rotations, use the 'jitter' option. E.g. to add 7 jittered examples for each original input, randomly chosen among all possible jitters, with height translation range of 1, width translation range of 2, 3 scale steps in a +/- .2 scale factor range, 5 rotation steps within a +/- 30 degree range around original orientation:
    ./dscompile <root> -jitter 1,2,3,.4,5,60,7
  • Jitter (rotations): to add rotation jitter to a dataset, i.e. original images and rotations of those, use the 'jitter' option with rotation variables only. E.g. to add 7 rotated examples for each original input, randomly chosen among 30 possible angles in a 60 degree range (step of 2 degrees), i.e. 30 rotation steps within a +/- 30 degree range around original orientation:
    ./dscompile <root> -jitter 0,0,0,0,30,60,7
  • Complete example: see the INRIA pedestrian scripts.
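
To make the arithmetic in the jitter examples above concrete, here is a minimal sketch that splits a '-jitter' argument string into its seven fields and derives the +/- ranges and the rotation step. The names Jitter, parse_jitter and describe are hypothetical and are not part of dscompile. The step formula (rotation range divided by number of rotations) is inferred from the "30 angles in a 60 degree range, step of 2 degrees" example and is an assumption about how the tool spaces its angles.

```python
from collections import namedtuple

# Field order follows the documented syntax:
# <h>,<w>,<nscales>,<scale range>,<nrotations>,<rotation range>,<n>
Jitter = namedtuple(
    "Jitter", "h w nscales scale_range nrotations rotation_range n"
)


def parse_jitter(arg):
    """Split a '-jitter' argument string into named fields."""
    parts = [p.strip() for p in arg.split(",")]
    if len(parts) != 7:
        raise ValueError("expected 7 comma-separated values, got %d" % len(parts))
    return Jitter(
        h=int(parts[0]),
        w=int(parts[1]),
        nscales=int(parts[2]),
        scale_range=float(parts[3]),
        nrotations=int(parts[4]),
        rotation_range=float(parts[5]),
        n=int(parts[6]),
    )


def describe(j):
    """Derive the +/- ranges used in the prose examples above."""
    # Assumption: angles are spaced evenly, rotation_range / nrotations apart.
    step = j.rotation_range / j.nrotations if j.nrotations else 0.0
    return {
        "scale_offsets": (-j.scale_range / 2, j.scale_range / 2),
        "rotation_offsets": (-j.rotation_range / 2, j.rotation_range / 2),
        "rotation_step": step,
        "samples_per_input": j.n,
    }


print(describe(parse_jitter("1,2,3,.4,5,60,7")))
print(describe(parse_jitter("0,0,0,0,30,60,7")))
```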

Options

  • -type <regular(default)|patch|pascal|pascalbg|pascalclear|pascalfull|grid>
    • regular: compile images labeled by their top folder name
    • patch: extract random (position & scale) patches from images
    • pascal: compile images labeled by xml files (PASCAL challenge)
    • pascalbg: compile background images of PASCAL challenge
    • pascalclear: clear objects from original images of PASCAL challenge
    • pascalfull: copy full original PASCAL images into outdir (this allows excluding some classes before calling the regular compiler)
    • grid: extract non-overlapping cells from each image. Cell sizes are determined by the -gridsz option
  • -precision <float(default)|double|ubyte>
  • -annotations <directory>
  • -image_pattern <pattern>
    default: .*[.](png|jpg|jpeg|PNG|JPG|JPEG|bmp|BMP|ppm|PPM|pnm|PNM|pgm|PGM|gif|GIF|mat|MAT)
  • -channels <channel>
    channels are: RGB (default), YpUV, HSV, Yp (Yp only in YpUV)
  • -disp
    Display extraction
  • -nopp
    No preprocessing, i.e. no resizing or conversion.
  • -sleep <delay in ms>
    Sleep between frame display.
  • -shuffle
  • -usepose
    Separate classes with pose if available.
  • -stereo
  • -stereo_lpattern <pattern>
  • -stereo_rpattern <pattern>
  • -outdir <directory (default=images_root)>
  • -load <dataset name>
    This loads the dataset instead of compiling it from images found in root.
  • -save <dataset(default)|mat|ppm|png|jpg|...>
  • -dname <name>
  • -maxperclass <integer>
  • -maxdata <integer>
  • -kernelsz <integer>
  • -mexican_hat_size <integer>
  • -deformations <integer>
  • -dims <dimensions (default: 96x96x3)>
  • -mindims <dimensions (default: 1x1)>
    Exclude inputs for which one dimension is less than specified.
  • -scales <scales (e.g: 1.5,2,4)>
  • -bboxfact <float factor>
    Multiply bounding boxes by a factor.
  • -bboxhfact <float factor>
    Multiply the height of bounding boxes by a factor.
  • -bboxwfact <float factor>
    Multiply the width of bounding boxes by a factor.
  • -bbox_woverh <float factor>
    Force the bounding box width to be its height times this factor.
  • -resize <mean(default)|gaussian|bilinear>
  • -exclude <class name>
    Include all but the excluded classes. This option can be given multiple times.
  • -include <class name>
    Exclude all but the included classes. This option can be given multiple times.
  • -useparts
    Also extract object parts, e.g. person->(head,hand,foot).
  • -partsonly
    Only extract object parts, e.g. person->(head,hand,foot).
  • -ignore_difficult
    Ignore sample if "difficult" flag is on.
  • -ignore_truncated
    Ignore sample if "truncated" flag is on.
  • -ignore_occluded
    Ignore sample if "occluded" flag is on.
  • -nopadded
    Ignore padded images that are too small for the target size.
  • -jitter <h>,<w>,<nscales>,<scale range>,<nrotations>,<rotation range (in degrees)>,<n>
    Add n samples randomly jittered around the original location/scale: within a spatial neighborhood of h x w, with nscales scales within the scale range and nrotations rotations within the rotation range.
  • -wmirror
    Add mirrored sample using vertical-axis symmetry.
  • -forcelabel <label name>
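
The '-include' and '-exclude' descriptions above amount to a simple class filter, which can be sketched as follows. The function keep_class is hypothetical and not part of dscompile. How the tool behaves when both options are given together is not stated on this page, so the combined case here (include applied first, then exclude) is an assumption.

```python
def keep_class(name, include=(), exclude=()):
    """Decide whether a class survives the include/exclude filters.

    Hypothetical sketch of the documented semantics, not dscompile's code.
    """
    # -include: exclude all but the included classes (if any were given).
    if include and name not in include:
        return False
    # -exclude: include all but the excluded classes.
    return name not in exclude


classes = ["person", "car", "dog", "bg"]

# Equivalent in spirit to: -exclude bg -exclude dog
kept_after_exclude = [c for c in classes if keep_class(c, exclude=("bg", "dog"))]

# Equivalent in spirit to: -include person
kept_after_include = [c for c in classes if keep_class(c, include=("person",))]

print(kept_after_exclude)
print(kept_after_include)
```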