libidx tutorial

The idx library provides support for tensor description and manipulation. It is self-contained and the foundation for the eblearn learning library. In addition to generic tensor operations, it provides a set of operations specific to images, which are described as 3-dimensional tensors.



Overview

The idx library (libidx) provides tensor (multi-dimensional arrays) descriptors and manipulators, basis to the eblearn library. This core library along with libeblearn is self-contained and templated, allowing compilation for any type and on any platform. Headers (.h) and templates (.hpp) files are located in core/libidx/include, source files (.cpp) in core/libidx/src.

There are five main components to the library:

  1. Tensor descriptors and iterators: they constitute the shallow and inexpensive tensor operations, declared in idx.h.
  2. Tensor content operators: these are generic deep operators that modify the actually data pointed to by the descriptor, declared in idxops.h.
  3. Image operators: these apply image-specific and deep operations such as rotations or edge extraction, declared in image.h.
  4. Tensor I/O: the loading and saving operations for tensors and images, declared in idxIO.h and imageIO.h.
  5. Tensor display: draw tensors on a 2-dimensional GUI, implemented as a separate library called libidxgui in tools/libidxgui/ in order to keep core libraries self-contained and indepedent of GUIs.



Tensors

Descriptor

The idx class can represent tensors up to 8 dimensions. It is templated in order to be compiled to work with any data type. For a floating point 8 by 4 matrix can be created as follow:

idx<float> m(8, 4);              // create a 2-dimensional float matrix

idx objects only describe tensors, they do not actually contain data. They can therefore be copied at low computation cost. Multiple tensors with different orders and dimensions can point to the same data, e.g.:

idx<float> m2 = m;               // same size and pointing to the same data as m
idx<float> m3 = m.select(0,0);   // a 1D slice of m pointing to the same data

In the above examples, m, m2 and m3 all point to the same data storage. A storage keeps a reference counter so that even if m, the idx which created it is deleted, it stays allocated until all refering idx (m2 and m3) are deleted.

Accessors

One can assign or retrieve the value of a tensor element via set() and get() methods:

m.set(34.7, 7, 1);        // set value of element (7,1) to 34.7
float x = m.get(7, 1);    // get value of element (7, 1)
Shallow manipulators

Shallow idx operations allow for example to slice or resize an existing idx, only modifying its dimensions or offsets. They are shallow in the sense that the pointed data is never modified and remain therefore efficient to compute. They merely derive new idx objects from existing ones to access the data in new ways.

  • select: return a new idx corresponding to a slice of the current idx with slice i in dimension d. In other words, if m is an idx of order 2 of size (10,4), m.select(0,3) will return an idx of order 1 containing the 4-th row of m.
idx<float> m(10, 4);
idx<float> slice = m.select(dimension, index);

For example: Suppose a rgb image, I of dimension 75×100. Effectively, the dimension of matrix containing color information for image is 75x100x3, where first two dimension represent index of pixel and three is for r, g, & b for each pixel. Now, to only select information about green color for every pixel, you can select the 2nd slice in last dimension, resulting in a 75×100 matrix:

idx<ubyte> I(75, 100, 3); // color image
idx<ubyte> slice_green = I.select(2, 1) // slice[1] of last dimension[2]
  • narrow: returns a clone of idx m where the size in the n-th dimension is reduced to s elements, offset by o. This can be used to select bands or rectangular areas in matrices.

So, the user can narrow a dimension d to size s starting at position p to create a 3-dimensional subset of the 3D tensor t:

idx <double> subset3d = t.narrow(dimension, size, offset);
idx<ubyte> I(75, 100, 3); // color image
idx<ubyte> bottom_half = I.narrow(0, 37, 38); // bottom half of image
  • transpose: It is merely changing the order of dimension of tensor i.e to change the way elements of the tensor are accessed via the get and set methods. It is inexpensive operation as no data is actually moved around, this merely manipulates the idx structure itself.

For example, to move the channels dimensions of an image to the first dimension, one can call:

int p[] = {2, 0, 1}
idx<ubyte> image_transpose = I.transpose(p);

Or simply:

idx<ubyte> A = I.shift_dim(2, 0);
  • unfold: returns an idx with an added dimension at the end of m obtained by ”unfolding” n -th dimension. More explicitely , successive rows in the last dimension will contain successive, possibly overlapping , subsets of size ksize of consecutive elements in the n -th dimension. The separation between successive subsets is step . Here is an example:

Suppose V is vector the [0 1 2 3 4 5 6 7],

V.unfold(0, 3, 1);  // unfold(n, size, step)

will produce the following “unfolded” matrix:

    	      [[ 0.00 1.00 2.00 ]
    	      [ 1.00 2.00 3.00 ]
    	      [ 2.00 3.00 4.00 ]
    	      [ 3.00 4.00 5.00 ]
    	      [ 4.00 5.00 6.00 ]
  	      [ 5.00 6.00 7.00 ]]

No data is copied or replicated in the process , i.e. the three occurences of the number 2 in the above example actually come from a single memory location. The size of the new dimension is ksize . If dimn is the size of the n -th dimension of m , the size of the returned idx in the n -th dimension is (dimn-ksize)/step + 1 . The values of dimn, ksize, and step must be such that the new size is an integer.

This essentially manipulates the mod array to make convolutions look like matrix-vector multiplies.The unfold function essentially allows to reduce a discrete convolution to a matrix-vector product. Here is an example:

    	      vector M = [ 1 1 0 2 3 4 2 0 ] and 
    	      vector kernel = [-1 2 -1]
    	      
    	      idx U = M.unfold(0 3 1);
    		    idx_m2dotm1( U kernel output);
    		
    		So, output = [ 1.00 -3.00 1.00 0.00 3.00 0.00 ]
    		
    		A subsampled convolution can be implemented by unfolding with a step larger than 1:
    
		Let m = [0 1 2 3 4 5 6 7 8] & kernel =  [1 2 1];
    		idx<double> z = m.unfold(0, 3, 2);
    		= [[ 0.00 1.00 2.00 ]
    		    [ 2.00 3.00 4.00 ]
    		[ 4.00 5.00 6.00 ]
    		[ 6.00 7.00 8.00 ]]
    		    idx-m2dotm1( z, kernel, output);
  		Finally, output  = [ 4.00 12.00 20.00 28.00 ]
  	  

Naturally , there is no real need for most programmers to use the unfold construct because the idx library contains efficient 1D and 2D convolutions.

	  idx<float> m(8, 4);       // create a 2-dimensional float matrix
	  m.set(34.7, 7, 1);        // set value of element (7,1) to 34.7
	  float x = m.get(7, 1);    // get value of element (7, 1)

All of these functions to create matrices take 0 to 8 integer argument that are the sizes in each dimension. LIBIDX provides variety of operation from basic advance to cover all commonly used linear algebra operations to operate on matrix. It also provides large variety of iterators to loop over for slices/elements of matrix.

An idx is merely an access structure that points to the data. Several idx can point to the same data. Underlying implementation used for implementing tensor really composed of two separate entities:

1.  storage or srg: which contains the following fields: a pointer to the actual data , an element type identifier (and the size thereof).
2.  index or idx points: to a srg and contains the following fields: the number of dimensions of the tensor , the size each dimension , the address increment from one element to the next in each dimension (called modulo ) , and the offset of the first tensor element in the storage . This structure allows multiple idx to point to the same srg thereby allowingthe data to be accessed in multiple ways without copying.

For example metadata stored for matrice M(10, 8, 4) in memory is:

	  idx: at address -1234116112
	  storage=-1243605552 (size=320)
	  idxspec -1234116104
	  ndim=3 offset=0
	  dim=[ 10, 8, 4] mod=[ 32, 4, 1]
	  footprint= 320 contiguous= yes

This is basic information which is maintained/updated for every idx(tensor) at any point of time.

Basic information which is maintained/updated for every idx(tensor) at any point of time.

  • Storage(Srg):We have starting address of storage at any point of time and number of elements stored inside that storage. Srg is a container for arrays of data. It contains a pointer to a dynamically allocated chunk of data, and a reference counter. Reference counter maintains the no of active idx currently pointing to storage. Storage is always allocated with new operator i.e. it is always maintained on heap. Because storage is always on heap it help in various size manipulation function of matrices such as resize etc. which is very commonly needed.
  • Index(or idxspec):It contains all the characteristics of an idx, except the storage. It includes the order (number of dimensions) the offset, and the dim and mod arrays. having an idxspec class separate from idx allows us to write generic manipulation functions that do not depend on the type of the storage. Noting two idx can point to same storage but still can have very different characteristics.

Important charatrerstics/values are:

  1. OFFSET(offset):It represent the numeric distance between first element of storage and index of first element in storage pointed by that matrix
  2. object.
  3. No. of Dimension(ndim):No. of Dimension
  4. No. of elements: Product of element in each Dimension
  5. Modulo:the address increment from one element to the next in each dimension (called modulo )
  6. FOOTPRINT:Index one past of last element in storage of matrix can be calculated as: mod[0] * dim[0] + offset; For footprint of ith dimension: offset + mod[i]*dim[i]
  7. Contigous: Is memory contigously allocated or not. We will observe that not all matrices used by us are contigously allocated eg. resultant matrice produce as result of select() function on matrice.

Working with Storages

srg and idx can be created independently of each other. When directly accessing storage elements , storages behave like 1D idx (vectors).

For example:

 	  //creates a storage for double elements with size = 8 & refcount = 0;
	  Srg<double> *srg_m = new Srg(8); 

Creating an idx on a particular storage s can be done with:

	  //Constructor for idx: idx(Srg<T> *srg, intg o, intg n1, intg n2) where o is the offset.
	  idx<double> mat(srg_m, 0, 4, 3); 

This resizes the storage to 12 elements (if it has less than that), and creates an idx of size (4, 3) on this storage. Creating other indxes on the same storage allows to access the same piece of data in multiple ways.

Tensor Manipulation

Several functions are provided to create and manipulate the idx structure and its various fields.

IDX(Constructors):

LIBIDX has large number of constructors provides the flexibility to create completely new idx or idx using characteristics of existing idx such as storage, dim, offset, mod etc;

  • generic constructor
  	      //generic constructor with dims and mods creates the storage and set offset to zero
  	      idx(int n, intg *dims, intg *mods);
  • Specific constructors for each number of dimensions

creates an idx of any order using from 0 to 8 using For example:

    	      //idx(intg s0, intg s1, intg s2, intg s3, intg s4=-1, intg s5=-1, intg s6=-1,intg s7=-1);
    	      idx<int> M(3, 4 , 5, 6, 7) //creates five dimensional tensor of integers.

Observe intg is used for array indexing, hence should be defined as long if you want very large arrays on 64 bit machines(where intg is basically othername for type long: typedef long intg;)

  • Constructors from existing Srg and offset

For example:

   //idx(Srg<T> *srg, intg o, const idxdim &d);
    	      //where srg is existing storage, o is offset
    	      //Third parameter is idxdim, array of dimensions
    	      idxdim d(M); //Create an idxdim based on the information found in an idx M
    	      idx<int> N(srg, 0, d); //Remeber storage is resized if needed
  • Constructors initialized with an array

For example:

     	      //creates an idx2 of size (s0, s1) and fills it with mat, expected to be of size (s0, s1).
    	      //idx(const T *mat, intg s0, intg s1);
    	      float f_arr = new float(10); //array of float of 10
    	      idx M_arr(f_arr, 5, 2); //note: nelements of tensor must be equal to size of array
  • Copy constructor
  idx<float> New(&Old);
Field access functions

Idx contains number of methods to enquire about the state of idx at any - point and data you need to access from existing structure:

For example:

Let idx<ubyte> M(3,4,5); //Creation of three dimensional tensor of type ubyte

      - Srg<ubyte> *s = M.getstorage(); //return pointer to storage
      - M.dim(2) // return size of idx in d-th dimension
      - M.dims() // return an const pointer to array of dims of idx
      - M.mod(1) // return number of elements that separate two successive elements in the dimension
      - M.mods() //return pointer containing array of strides(modulo) of each dimension
      - M.order() //return an order of idx
      - M.offset() //return index of offset of M in current storage
      - intg ne = M.nelements() //return total number of elements
      - intg f = M.footprint() //return total footprint in the storage (index after last cell occupied in the storage)
      - bool flag = M.contiguousp() // return true if elements of idx are contiguous in memory.
      - M.getidxdim(idxdim& d) //copies order and dimensions of this idx into the passed idxdim d
      - bool flag = M.same_dim(idxdim &d) //return true if this idx has same order and dimensions as idxdim d. i.e. if all their dimensions are equal (regardless of strides).
      - bool flag = M.same_dim(3, 2, 5) // return true if this idx has same order and dimensions s0 .. s7
Data access functions

These are methods which are used to get pointer or value to actual data available to idx. For example:

Let idx<double> M(3, 4, 5); //M is three dimensional idx access structure
double *pdata = M.idx_ptr(); //return pointer on data chunk (on first element) i.e after taking care of offset
double *pdElement = M.ptr(1, 2, 3); //return pointer to element with passed value of index
double value = M.get(1, 2, 3); //return the value stored for index taking care of mod & dim itself
M.set(13.5, 1, 2, 3); //set the value at particular index

Tensor Iterator

While tensor elements can be accessed individually via set and get methods, one will mostly need to loop over entire dimensions or entire tensors. For that effect, iterator classes can be used for each tensor to iterate on. However looping macros make it even easier allowing to loop over multiple tensors of any type but of the same size at the same time.

For instance, the idx aloop2 macro loops over all elements of 2 tensors, the idx bloop3 macro loops over the first dimensions of 3 tensors while the idx eloop1 loops over the last dimension of 1 tensor. For each tensor to be iterated, one must specify in order the name of the new lower-ordertensor, the name of the original tensor, and its type.

Thus to compute the sum of multiple tensors we could write:

	  idx<double> td3d(32, 32, 3);
	  idx<int>  ti2d(32, 32);
	  int total = 0;

	  //sum of multiple tensors using bloop Macro
	  idx_bloop2(td2d, td3d, double, ti1d, ti2d, int) {
	  idx_bloop2(td1d, td2d, double, ti0d, ti1d, int) {
	  total += ti0d.get();
	  idx_bloop1(td0d, td1d, double)
	  total += (int) td0d.get(); }}

	  Or simply:

	  //same sum can be calculated using aloop macro as:
	  idx_aloop1(td0d, td3d, double) total += td0d.get();
	  idx_aloop1(ti0d, ti2d, int) total += ti0d.get();

Two kinds of iterator facility provided by LIBIDX:

ScalarIterator

allows to iterate over all elements of an idx.

For Example:

    	      //This can be use to traverse all the elements of idx
    	      idx<double> src0(3, 4, 5);
    	      ScalarIter itr0(src0);

    		//Iterator Declaration
    		for ( ; itr0.notdone(); ++itr0) // For loop declaration		
    		{
    		printf("%g ",*itr0); //body of loop
    		}
    	  

iterator such as itr0 above is very useful to traverse over all the elements of source idx without being taken care is contiguous or non-contiguous. It provides support for very useful operators such as:

  • ++(both post or pre-increment ) operator to get over to the next element
  • - - (both post & pre-decrement operator)
  • * (Dereference operator to get the value of current element pointed by iterator)
  • - > (Pointer dereference operator to get pointer to current element pointed by iterator)

Alternatively, aloopX macro can be used which loops over all elements of an idx. These macros uses other scalar iterator class implemented as subclass inside idx<T> itself and instantiated with dummy iterator. This provides us to all the idx access & manipulation functions.

    	      #define idx_aloop1(itr0,src0,type0)		\
    	      idxiter itr0;				\
    		idx_aloop1_on(itr0,src0)
    	  

Remember: It is advisable to encase every loop in its own scope to prevent name clashes if the same loop variable is use multiple times in a scope.

{ idx_aloop1(itr0, src0, double) // aloopX macro combines declaration of iterator and for loop {printf("%g ",*itr0);}//body of loop }

Other utility of aloopX macros comes when you want to traverse over 2 or more tensors simultaneously. For Example:

    	      //when you want to apply some function on each element of one tensor and store it in other tensor
    	      { idx_aloop2(itr0, src0, double, itr1, src1, double)
    	      { *itr1 = 2 * (*itr0) }
    	      }

    	      //when you want to make sum of elements of two tensors and add to third
    	      { idx_aloop3(itr0, src0, double, itr1, src1, double, itr2, src2, double)
    	      { *itr2 = *itr0 + * itr1; }
    	      }
    	  

For aloopX, X depend on number of matrix you want to pass together or loop together and matrices pass together does not matter, it is really the no. of elements /*At any point during the loop, the indices of the element being worked on is stored in idx.d[k] for k=0 to idx.order()-1. Two things to take care when using Macros are:

  1. Note 1: Remember to include #include <cppunit/extensions/HelperMacros.h>
  2. Note 2: It's advisable to encase every loop in its own scope to prevent name clashes if the same loop variable is used multiple times in a scope.
DimensionIterator

allows to iterate over one of the dimension rather than every element. After each iteration it points to next element in that one particular dimension. It derives from class idx&l;T>, so uses all available manipulation/access functions. It returns sub-tensor after each iteration which is ith slice along the given dimension. For example:

    	      //This is to show the properties of sub-tensor produced as result each iteration
    	      idx<double> src(3, 4);
    	      //It takes two argument: src idx<T> and the index of the dimension being iterated over
    	      DimIter<double> dim_itr(src,1); //Iterator Declaration

    	      for ( ; dim_itr.notdone(); ++dim_itr) // For loop declaration	
    	      {dim_itr.pretty(pFile);} //body of loop using pretty function on produced dimension iterator
    	      // sub tensor is idx in itself and can use all available idx support	
    	      **************************************************************************************************************
    	      Properties of source idx:	 	     Properties of produced Dimension iterator:

    	      idx: at address -1235713720 		idx: at address -1235714276
    	      storage=141662608 (size=12) 	 storage=141662608 (size=12)
    	      idxspec -1235713712			 idxspec -1235714268
    	      ndim=2//source is two dimensional    ndim=1 //sub tensor is always one dimension less than source
    	      offset=0				  offset=0
    	      dim=[ 3, 4]				  dim=[ 3]
    	      mod=[ 4, 1]				  mod=[ 4]
    	      footprint= 12				  footprint= 9
    	      contiguous= yes			  contiguous= no  //mostly it is non-contigous
    	      *************************************************************************************************************************
    	  

Similar to scalar iterator it support all the operator as (post & pre)++ and - -, *(Dereference), - > (Pointer Dereference). Additionally, as it is inherited from idx<T> class itself and being a sub-tensor you can use all available support for idx<T> on dimension iterator as well.

Alternatively, bloopX macro can be used in place of dimension iterator. This is equivalent to explicitly declaring a dimension iterator to loop over dimension 0 i.e first dimension. Similar to aloopX, it can take one or more argument together for simultaneous iteration on more than one idx, given that they have same number of elements in iterating dimension i.e 0 in this case.

For example:

    	      idx_bloop2(dst0,src0,type0,dst1,src1,type1) is equivalent to

    	      idx_checkdim2_all(src0, src1, 0); 	// Check both idx has same no. of elements in first dimension	
    	      DimIter dst0(src0,0);	//Dim-Iter for first			
    		DimIter dst1(src1,0);	//Dim-Iter for second	

    		  for ( ; dst0.notdone(); ++dst0, ++dst1)  //For loop to iterate over each element of first dimension
    	  

A similar function idx-eloop iterates on the last dimension of a tensor instead of the first dimension.

For example , the matrix product operation C = A*B can be written as follows:

    	      //For C=A*B idx-m2timesm2 (A B C), although it is already provided as library function
    	      {
    	      idx_eloop2(Bj, B, float,Cj, C, float)
    	      { idx_m2dotm1(A, Bj, Cj); }
    	      }
    	  

Graphic User Interface

GUI descriptors Eblearn also provides a support to display idx on GUI window or white board which pport idx based I/O. Libidxgui library provides set of classes which work together QT based graphics & libidx to provide the multithreaded gui support. It provides the global variable:

 ebl::idxgui gui; // used like std::cout 

This can be conveniently used as gui stream object and can be used just like std::cout to display images & text. You can simply display RGB & grayscale images with help of as many windows as possible. For example to display a image stored in file to window:

    #include "libidxgui.h" //include the libidxgui library
    //Replace a normal main with macro provided in library
    //Basically, QT takes over the main thread and runs your code in a thread.
    
    #ifdef __GUI__
    MAIN_QTHREAD(int, argc, char **, argv) { // macro to enable multithreaded gui
    #else
    int main(int argc, char **argv) { // regular main without gui
    #endif

    //Declaring ubyte image matrix to read values & specifying directory for image
    idx im(1, 1, 1);
    string imgfile = "/home/user/eblearn/data";
    imgfile += "/pnm/hat_P6.ppm";
    //Reading pnm image bytes from file into 3D idx tensor
    pnm_fread_into_rgbx(imgfile.c_str(), im);
    //Global pointer to gui, allows to call for example  gui.draw_matrix() from anywhere in the code.
    //call for new window, it returns a window id, you can create more than one window unsigned
    int wid = gui.new_window("GUI_test");
    //draw image matrix on window with draw_matrix function gui

debugging Using gdb

  call  ./debug.sh 

One can inspect variables and call most functions in an interative manner. For example, call ./debug.sh at the root of the eblearn directory. This will start a debugging environment in emacs and gdb.

Gdb is configured not to stop all threads when a tread stops (see .gdbinit), so you need to switch to the thread that stopped, e.g.:

     thread 2

To get information about an idx 'a' (order, dimensions, etc), you can call:

    p a.pretty()

If you want to display an idx, call draw_matrix, e.g.:

    p draw_matrix_float(a, 0, 0, 1, 1, 0, 255)

If you need to manipulate the idx, for example if you want to display only 1 layer of the RGB image 'a', you can use global debug idx variables:

    set gdb_idx_float1 = a.select(2, 0) 
    p draw_matrix_float(gdb_idx_float1, 0, 0, 1, 1, 0, 255)

Note: gdb does not accept calls to template functions, one needs to declare instantiations in the code first (e.g. draw_matrix_float instead of draw_matrix). Some of those instantiations are available in the libeblearntools, link with this library to use them.

     p new_window(0, 42,42);
     p clear_window()

Note: if this does not work, the debug symbols might not be available, try a make clean and recompile in debug mode (make debug). Note: if you are outside of the ebl scope, add “ebl::” in front of your function calls. Note: gdb requires all parameters to be specified, even default ones. Note: draw_matrix_float and gdb_idx_float1 are located in the libeblearntools library, you need to link with this library to gain access to those.

libidx.txt · Last modified: 2012/06/15 16:29 by qianli