The idx library provides support for tensor description and manipulation. It is self-contained and the foundation for the eblearn learning library. In addition to generic tensor operations, it provides a set of operations specific to images, which are described as 3-dimensional tensors.
The idx library (libidx) provides tensor (multi-dimensional array) descriptors and manipulators and is the basis of the eblearn library. This core library, along with libeblearn, is self-contained and templated, allowing compilation for any type and on any platform. Header (.h) and template (.hpp) files are located in core/libidx/include, source files (.cpp) in core/libidx/src.
There are five main components to the library:
The idx class can represent tensors of up to 8 dimensions. It is templated so that it can be compiled to work with any data type. For example, a floating-point 8 by 4 matrix can be created as follows:
idx<float> m(8, 4); // create a 2-dimensional float matrix
idx objects only describe tensors; they do not actually contain data. They can therefore be copied at low computational cost. Multiple tensors with different orders and dimensions can point to the same data, e.g.:
idx<float> m2 = m;             // same size and pointing to the same data as m
idx<float> m3 = m.select(0,0); // a 1D slice of m pointing to the same data
In the above examples, m, m2 and m3 all point to the same data storage. A storage keeps a reference counter, so that even if m, the idx which created it, is deleted, the storage stays allocated until all referring idx (m2 and m3) are deleted.
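As a minimal sketch of this behaviour (a hypothetical usage example, not taken from the library documentation), the storage created by an idx outlives the idx that created it as long as other views still exist:

// hypothetical sketch of storage reference counting
idx<float> *m = new idx<float>(8, 4);  // storage created, referenced by *m
idx<float> m2 = *m;                    // second reference to the same storage
idx<float> m3 = m->select(0, 0);       // third reference (a 1D slice)
delete m;                              // the storage stays allocated, because m2 and m3
                                       // still refer to it; it is freed once they are destroyed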
One can assign or retrieve the value of a tensor element via set() and get() methods:
m.set(34.7, 7, 1);     // set value of element (7,1) to 34.7
float x = m.get(7, 1); // get value of element (7,1)
Shallow idx operations allow, for example, slicing or resizing an existing idx by modifying only its dimensions or offsets. They are shallow in the sense that the pointed-to data is never touched, so they are cheap to compute: they merely derive new idx objects from existing ones to access the data in new ways.
idx<float> m(10, 4);
idx<float> slice = m.select(dimension, index);
For example, suppose an RGB image I of dimension 75x100. The matrix containing the color information of the image actually has dimensions 75x100x3, where the first two dimensions index the pixel and the third holds the r, g and b values of each pixel. To select only the green information of every pixel, you can select slice 1 of the last dimension, resulting in a 75x100 matrix:
idx<ubyte> I(75, 100, 3);                // color image
idx<ubyte> slice_green = I.select(2, 1); // slice[1] of last dimension[2]
Similarly, the user can narrow a dimension d to size s starting at position p, for instance to create a 3-dimensional subset of a 3D tensor t:
idx<double> subset3d = t.narrow(dimension, size, offset);
idx<ubyte> I(75, 100, 3);                     // color image
idx<ubyte> bottom_half = I.narrow(0, 37, 38); // bottom half of image
For example, to move the channel dimension of an image to the first dimension, one can call:
int p[] = {2, 0, 1};
idx<ubyte> image_transpose = I.transpose(p);
Or simply:
idx<ubyte> A = I.shift_dim(2, 0);
Suppose V is the vector [0 1 2 3 4 5 6 7]. Then the call
V.unfold(0, 3, 1); // unfold(n, size, step)
will produce the following “unfolded” matrix:
[[ 0.00  1.00  2.00 ]
 [ 1.00  2.00  3.00 ]
 [ 2.00  3.00  4.00 ]
 [ 3.00  4.00  5.00 ]
 [ 4.00  5.00  6.00 ]
 [ 5.00  6.00  7.00 ]]
No data is copied or replicated in the process, i.e. the three occurrences of the number 2 in the above example actually come from a single memory location. The size of the new dimension is ksize. If dimn is the size of the n-th dimension of m, the size of the returned idx in the n-th dimension is (dimn - ksize)/step + 1 (in the example above, (8 - 3)/1 + 1 = 6 rows). The values of dimn, ksize, and step must be such that the new size is an integer.
This essentially manipulates the mod array to make convolutions look like matrix-vector multiplies: the unfold function reduces a discrete convolution to a matrix-vector product. Here is an example:
Let vector M = [ 1 1 0 2 3 4 2 0 ] and kernel = [ -1 2 -1 ]. Then

idx<double> U = M.unfold(0, 3, 1);
idx_m2dotm1(U, kernel, output);

gives output = [ 1.00 -3.00 1.00 0.00 3.00 0.00 ].

A subsampled convolution can be implemented by unfolding with a step larger than 1. Let m = [0 1 2 3 4 5 6 7 8] and kernel = [1 2 1]:

idx<double> z = m.unfold(0, 3, 2);
// z = [[ 0.00 1.00 2.00 ]
//      [ 2.00 3.00 4.00 ]
//      [ 4.00 5.00 6.00 ]
//      [ 6.00 7.00 8.00 ]]
idx_m2dotm1(z, kernel, output);
Finally, output = [ 4.00 12.00 20.00 28.00 ]
Naturally, there is no real need for most programmers to use the unfold construct, because the idx library contains efficient 1D and 2D convolutions.
idx<float> m(8, 4);    // create a 2-dimensional float matrix
m.set(34.7, 7, 1);     // set value of element (7,1) to 34.7
float x = m.get(7, 1); // get value of element (7, 1)
All of the functions that create matrices take 0 to 8 integer arguments giving the size in each dimension. libidx provides a variety of operations, from basic to advanced, covering the commonly used linear algebra operations on matrices. It also provides a large variety of iterators to loop over slices or elements of a matrix.
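For instance, one of the linear algebra routines used later in this tutorial, the matrix-vector product idx_m2dotm1, can be called directly (a minimal sketch; the code that fills the tensors is omitted):

idx<double> A(8, 4), x(4), y(8);
// ... fill A and x ...
idx_m2dotm1(A, x, y); // y = A * x: product of the 2D matrix A with the 1D vector x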
An idx is merely an access structure that points to the data, and several idx can point to the same data. The underlying implementation of a tensor is really composed of two separate entities:
1. The storage or srg, which contains the following fields: a pointer to the actual data and an element type identifier (and the size thereof).
2. The index or idx, which points to an srg and contains the following fields: the number of dimensions of the tensor, the size of each dimension, the address increment from one element to the next in each dimension (called modulo), and the offset of the first tensor element in the storage.
This structure allows multiple idx to point to the same srg, thereby allowing the data to be accessed in multiple ways without copying.
For example, the metadata stored in memory for a matrix M(10, 8, 4) is:
idx: at address -1234116112
  storage=-1243605552 (size=320)
  idxspec -1234116104
  ndim=3
  offset=0
  dim=[ 10, 8, 4]
  mod=[ 32, 4, 1]
  footprint= 320
  contiguous= yes
This basic information is maintained and updated for every idx (tensor) at any point in time.
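This metadata can be printed at any time with the pretty method (a small sketch, assuming the FILE* variant of pretty used later in this tutorial):

idx<float> M(10, 8, 4);
M.pretty(stdout); // prints the storage, ndim, offset, dim, mod, footprint and contiguity shown above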
The important characteristics/values are the offset, dimensions (dim), strides (mod), footprint and contiguity shown above.
srg and idx can be created independently of each other. When directly accessing storage elements, storages behave like 1D idx (vectors).
For example:
// creates a storage for double elements with size = 8 and refcount = 0
Srg<double> *srg_m = new Srg<double>(8);
Creating an idx on a particular storage s can be done with:
// Constructor for idx: idx(Srg<T> *srg, intg o, intg n1, intg n2), where o is the offset
idx<double> mat(srg_m, 0, 4, 3);
This resizes the storage to 12 elements (if it has fewer than that) and creates an idx of size (4, 3) on this storage. Creating other idx on the same storage allows accessing the same piece of data in multiple ways.
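As a hypothetical illustration (assuming the 1-dimensional variant idx(Srg<T> *srg, intg o, intg s0) of the storage constructor above), a second idx on the same storage sees the same values:

idx<double> vec(srg_m, 0, 12); // the 12 elements of mat(4, 3) viewed as a 1D vector
vec.set(3.14, 5);              // element 5 of vec ...
double v = mat.get(1, 2);      // ... is element (1, 2) of mat, so v == 3.14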
Several functions are provided to create and manipulate the idx structure and its various fields.
libidx has a large number of constructors, providing the flexibility to create a completely new idx or to build one from the characteristics of an existing idx, such as its storage, dimensions, offset, modulos, etc.
// generic constructor with dims and mods; creates the storage and sets the offset to zero
idx(int n, intg *dims, intg *mods);
Another constructor creates an idx of any order from 0 to 8 directly from its dimensions. For example:
// idx(intg s0, intg s1, intg s2, intg s3, intg s4=-1, intg s5=-1, intg s6=-1, intg s7=-1);
idx<int> M(3, 4, 5, 6, 7); // creates a five-dimensional tensor of integers
Note that intg is used for array indexing and should therefore be defined as long if you want very large arrays on 64-bit machines (intg is simply another name for type long: typedef long intg;).
For example:
// idx(Srg<T> *srg, intg o, const idxdim &d);
// where srg is an existing storage, o is the offset,
// and the third parameter is an idxdim, an array of dimensions
idxdim d(M);           // create an idxdim based on the information found in an idx M
idx<int> N(srg, 0, d); // remember: the storage is resized if needed
For example:
// creates an idx2 of size (s0, s1) and fills it with mat, expected to be of size (s0, s1)
// idx(const T *mat, intg s0, intg s1);
float *f_arr = new float[10];  // array of 10 floats
idx<float> M_arr(f_arr, 5, 2); // note: the number of tensor elements must equal the size of the array
idx<float> New(&Old);
idx contains a number of methods to enquire about the state of an idx at any point, and about the data you need to access from an existing structure:
For example:
idx<ubyte> M(3, 4, 5);             // creation of a three-dimensional tensor of type ubyte
Srg<ubyte> *s = M.getstorage();    // return a pointer to the storage
M.dim(2);                          // return the size of the idx in the d-th dimension
M.dims();                          // return a const pointer to the array of dims of the idx
M.mod(1);                          // return the number of elements separating two successive elements in that dimension
M.mods();                          // return a pointer to the array of strides (modulos) of each dimension
M.order();                         // return the order of the idx
M.offset();                        // return the offset of M in the current storage
intg ne = M.nelements();           // return the total number of elements
intg f = M.footprint();            // return the total footprint in the storage (index after the last cell occupied in the storage)
bool flag = M.contiguousp();       // return true if the elements of the idx are contiguous in memory
M.getidxdim(idxdim& d);            // copies the order and dimensions of this idx into the passed idxdim d
bool flag = M.same_dim(idxdim& d); // return true if this idx has the same order and dimensions as idxdim d, i.e. if all their dimensions are equal (regardless of strides)
bool flag = M.same_dim(3, 2, 5);   // return true if this idx has the same order and dimensions s0 .. s7
These methods are used to get a pointer to, or the value of, the actual data accessible through an idx. For example:
idx<double> M(3, 4, 5);             // M is a three-dimensional idx access structure
double *pdata = M.idx_ptr();        // return a pointer to the data chunk (to the first element), i.e. after taking care of the offset
double *pdElement = M.ptr(1, 2, 3); // return a pointer to the element at the given indices
double value = M.get(1, 2, 3);      // return the value stored at the given indices, taking care of mod & dim itself
M.set(13.5, 1, 2, 3);               // set the value at a particular index
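When the idx is contiguous in memory (contiguousp() returns true), a fast traversal can be sketched directly on the raw data pointer, using only the methods listed above and the idx M from the previous example:

if (M.contiguousp()) {
  double *p = M.idx_ptr();                 // pointer to the first element
  for (intg i = 0; i < M.nelements(); ++i) // nelements() = 3*4*5 = 60
    p[i] = 0.0;                            // e.g. zero out the whole tensor
}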
While tensor elements can be accessed individually via the set and get methods, one will mostly need to loop over entire dimensions or entire tensors. For that purpose, iterator classes can be used for each tensor to iterate on. Looping macros make it even easier, allowing to loop over multiple tensors of any types, but of the same size, at the same time.
For instance, the idx_aloop2 macro loops over all elements of 2 tensors, the idx_bloop3 macro loops over the first dimensions of 3 tensors, while idx_eloop1 loops over the last dimension of 1 tensor. For each tensor to be iterated, one must specify, in order, the name of the new lower-order tensor, the name of the original tensor, and its type.
Thus to compute the sum of multiple tensors we could write:
idx<double> td3d(32, 32, 3);
idx<int> ti2d(32, 32);
int total = 0;
// sum of the elements of multiple tensors using the bloop macros
idx_bloop2(td2d, td3d, double, ti1d, ti2d, int) {
  idx_bloop2(td1d, td2d, double, ti0d, ti1d, int) {
    total += ti0d.get();
    idx_bloop1(td0d, td1d, double)
      total += (int) td0d.get();
  }
}
Or simply:
// the same sum can be computed with the aloop macros (note that the iterators are dereferenced):
idx_aloop1(td0d, td3d, double)
  total += (int) *td0d;
idx_aloop1(ti0d, ti2d, int)
  total += *ti0d;
libidx provides two kinds of iterator facilities:
The scalar iterator (ScalarIter) allows iterating over all the elements of an idx.
For example:
// This can be used to traverse all the elements of an idx
idx<double> src0(3, 4, 5);
ScalarIter<double> itr0(src0);  // iterator declaration
for ( ; itr0.notdone(); ++itr0) // for loop declaration
  { printf("%g ", *itr0); }     // body of loop
An iterator such as itr0 above is very useful for traversing all the elements of the source idx without having to care whether it is contiguous or not. It supports useful operators such as pre- and post-increment and decrement (++ and --), dereference (*) and pointer dereference (->).
Alternatively, the aloopX macros can be used, which loop over all the elements of an idx. These macros use another scalar iterator class, implemented as a subclass inside idx<T> itself and instantiated as a dummy iterator, which gives access to all the idx access & manipulation functions.
#define idx_aloop1(itr0,src0,type0) \
  idxiter<type0> itr0;              \
  idx_aloop1_on(itr0,src0)
Remember: it is advisable to encase every loop in its own scope to prevent name clashes if the same loop variable is used multiple times in a scope.
{ idx_aloop1(itr0, src0, double) // the aloopX macros combine the declaration of the iterator and the for loop
    { printf("%g ", *itr0); }    // body of loop
}
Another use of the aloopX macros is traversing 2 or more tensors simultaneously. For example:
// apply a function to each element of one tensor and store the result in another tensor
{ idx_aloop2(itr0, src0, double, itr1, src1, double) {
    *itr1 = 2 * (*itr0);
  }
}
// sum the elements of two tensors into a third
{ idx_aloop3(itr0, src0, double, itr1, src1, double, itr2, src2, double) {
    *itr2 = *itr0 + *itr1;
  }
}
For aloopX, X is the number of tensors being looped over together; the shapes of the tensors passed together do not matter, only their numbers of elements, which must be equal. At any point during the loop, the indices of the element currently being worked on are stored in d[k], for k = 0 to order()-1.
The dimension iterator (DimIter) allows iterating over one particular dimension rather than over every element. After each iteration it points to the next element in that dimension. It derives from the class idx<T>, so all available manipulation/access functions can be used on it. At each iteration it is the sub-tensor corresponding to the i-th slice along the given dimension. For example:
// This shows the properties of the sub-tensor produced at each iteration
idx<double> src(3, 4);
// DimIter takes two arguments: the source idx<T> and the index of the dimension being iterated over
DimIter<double> dim_itr(src, 1);      // iterator declaration
for ( ; dim_itr.notdone(); ++dim_itr) // for loop declaration
  { dim_itr.pretty(pFile); }          // body of loop: pretty-print the current sub-tensor
// the sub-tensor is an idx in itself and can use all available idx support

Properties of the source idx:
idx: at address -1235713720
  storage=141662608 (size=12)
  idxspec -1235713712
  ndim=2          // the source is two-dimensional
  offset=0
  dim=[ 3, 4]
  mod=[ 4, 1]
  footprint= 12
  contiguous= yes

Properties of the produced dimension iterator:
idx: at address -1235714276
  storage=141662608 (size=12)
  idxspec -1235714268
  ndim=1          // the sub-tensor is always one dimension less than the source
  offset=0
  dim=[ 3]
  mod=[ 4]
  footprint= 9
  contiguous= no  // it is mostly non-contiguous
Similar to the scalar iterator, it supports all the operators: pre- and post-increment and decrement (++ and --), dereference (*) and pointer dereference (->). Additionally, since it inherits from the idx<T> class itself and is a sub-tensor, all available idx<T> support can be used on a dimension iterator as well.
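A small sketch of this (hypothetical usage, combining DimIter with the idx methods described earlier):

idx<double> src(3, 4);
DimIter<double> it(src, 0); // iterate over the first dimension: 3 rows of size 4
it.set(1.0, 2);             // the iterator is itself an idx: set element 2 of the current row
++it;                       // advance to the next row
intg n = it.nelements();    // any idx query also works on it (n == 4)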
Alternatively, the bloopX macros can be used in place of a dimension iterator. They are equivalent to explicitly declaring a dimension iterator looping over dimension 0, i.e. the first dimension. Like aloopX, they can take one or more arguments for simultaneous iteration over more than one idx, provided they have the same number of elements in the iterated dimension (0 in this case).
For example:
idx_bloop2(dst0, src0, type0, dst1, src1, type1)
is equivalent to:
idx_checkdim2_all(src0, src1, 0);       // check that both idx have the same number of elements in the first dimension
DimIter<type0> dst0(src0, 0);           // dimension iterator for the first idx
DimIter<type1> dst1(src1, 0);           // dimension iterator for the second idx
for ( ; dst0.notdone(); ++dst0, ++dst1) // loop over each slice of the first dimension
A similar macro, idx_eloop, iterates over the last dimension of a tensor instead of the first.
For example, the matrix product operation C = A*B can be written as follows:
// For C = A*B, equivalent to idx_m2timesm2(A, B, C), which is already provided as a library function
{ idx_eloop2(Bj, B, float, Cj, C, float) {
    idx_m2dotm1(A, Bj, Cj);
  }
}
GUI descriptors
Eblearn also provides support for displaying idx on a GUI window or whiteboard, with idx-based I/O. The libidxgui library provides a set of classes that tie QT-based graphics and libidx together to provide multithreaded GUI support. It provides the global variable:
ebl::idxgui gui; // used like std::cout
This can be conveniently used as a gui stream object, just like std::cout, to display images and text. You can display RGB and grayscale images in as many windows as needed. For example, to display an image stored in a file in a window:
#include "libidxgui.h" //include the libidxgui library //Replace a normal main with macro provided in library //Basically, QT takes over the main thread and runs your code in a thread. #ifdef __GUI__ MAIN_QTHREAD(int, argc, char **, argv) { // macro to enable multithreaded gui #else int main(int argc, char **argv) { // regular main without gui #endif //Declaring ubyte image matrix to read values & specifying directory for image idx im(1, 1, 1); string imgfile = "/home/user/eblearn/data"; imgfile += "/pnm/hat_P6.ppm"; //Reading pnm image bytes from file into 3D idx tensor pnm_fread_into_rgbx(imgfile.c_str(), im); //Global pointer to gui, allows to call for example gui.draw_matrix() from anywhere in the code. //call for new window, it returns a window id, you can create more than one window unsigned int wid = gui.new_window("GUI_test"); //draw image matrix on window with draw_matrix function gui
One can inspect variables and call most functions in an interactive manner. For example, call ./debug.sh at the root of the eblearn directory. This will start a debugging environment in emacs and gdb.
Gdb is configured not to stop all threads when a thread stops (see .gdbinit), so you need to switch to the thread that stopped, e.g.:
thread 2
To get information about an idx 'a' (order, dimensions, etc), you can call:
p a.pretty()
If you want to display an idx, call draw_matrix, e.g.:
p draw_matrix_float(a, 0, 0, 1, 1, 0, 255)
If you need to manipulate the idx, for example if you want to display only 1 layer of the RGB image 'a', you can use global debug idx variables:
set gdb_idx_float1 = a.select(2, 0)
p draw_matrix_float(gdb_idx_float1, 0, 0, 1, 1, 0, 255)
Note: gdb does not accept calls to template functions; one needs to declare instantiations in the code first (e.g. draw_matrix_float instead of draw_matrix). Some of those instantiations are available in libeblearntools; link with this library to use them.
You can also create a new window or clear the current one from gdb:
p new_window(0, 42, 42);
p clear_window()
Note: if this does not work, the debug symbols might not be available; try a make clean and recompile in debug mode (make debug).
Note: if you are outside of the ebl scope, add "ebl::" in front of your function calls.
Note: gdb requires all parameters to be specified, even default ones.
Note: draw_matrix_float and gdb_idx_float1 are located in the libeblearntools library; you need to link with this library to gain access to them.