HyperRGBD: a framework for aggregation of RGB-D datasets

Although a few relatively large RGB-D datasets are available nowadays, they are still far smaller than state-of-the-art RGB datasets. To facilitate experimentation with larger and more diverse datasets, the HyperRGBD software framework, introduced in [1], allows researchers to create new data with the desired traits and peculiarities by freely mixing images drawn from different RGB-D datasets. HyperRGBD supports the main existing RGB-D datasets for category and instance recognition and provides a standardized open interface for integrating upcoming ones.


HyperRGBD is a C++ software framework devised to let researchers and practitioners build new datasets with little effort by aggregating images from different existing RGB-D datasets. For example, one might wish to experiment with datasets larger than the existing ones, which can be achieved by using HyperRGBD to aggregate the images of existing datasets into a larger data corpus. Furthermore, should a dataset be biased towards a few abundant categories while others feature only a few samples, a more balanced dataset can be built just as easily by using HyperRGBD to draw samples for the rare categories from other datasets. Another example is changing the granularity of categories, e.g. aggregating "chair", "table" and "couch" into a broader "furniture" category, or splitting "fruit" into more specific categories such as "apple", "orange" and "banana".
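The balancing scenario above can be illustrated with a small, self-contained sketch. It does not use HyperRGBD itself: the LabeledImage type and the topUpRareCategories function are hypothetical names introduced only for this example, and the sketch assumes that the donor dataset already uses the same category names as the primary one.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for illustration only (not a HyperRGBD type).
struct LabeledImage {
    std::string path;
    std::string category;
};

// Returns a copy of `primary` in which every category holding fewer than
// `minPerCategory` images is topped up with images of the same category
// taken from `donor`, in donor order, until the minimum is reached or the
// donor runs out. Donor categories that do not occur in `primary` are ignored.
std::vector<LabeledImage> topUpRareCategories(
        const std::vector<LabeledImage>& primary,
        const std::vector<LabeledImage>& donor,
        std::size_t minPerCategory) {
    // Count how many images each category has in the primary dataset.
    std::map<std::string, std::size_t> counts;
    for (const auto& img : primary) {
        ++counts[img.category];
    }

    std::vector<LabeledImage> balanced = primary;
    for (const auto& img : donor) {
        auto it = counts.find(img.category);
        if (it == counts.end()) continue;            // unknown category: skip
        if (it->second >= minPerCategory) continue;  // already large enough
        balanced.push_back(img);
        ++it->second;
    }
    return balanced;
}
```

With three "apple" images and one "banana" image in the primary dataset and a minimum of three per category, the function would take up to two "banana" images from the donor and leave "apple" untouched.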

At present, we have integrated into the framework the main existing RGB-D datasets for object recognition, i.e. Washington [2], CIN 2D+3D [3], BigBIRD [4] and MV-RED [5]. The framework is not confined to RGB-D data, however: it can be used to merge any datasets dealing with image recognition.

As described in [1], we used HyperRGBD to build two new RGB-D datasets, which were used in our experiments alongside the main existing ones.

How it works

An existing dataset is integrated into the framework by instantiating a software component that exposes a standard interface, referred to as IDataset. This requires implementing only a few functions.
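To give an idea of what a dataset interface of this kind might look like, here is a minimal sketch. It is not HyperRGBD's actual IDataset: the class names (IDatasetSketch, ToyDataset), the Sample record and all member functions are assumptions made up for this illustration.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sample record: paths to the color image and depth map,
// plus the category label. Not a HyperRGBD type.
struct Sample {
    std::string rgbPath;
    std::string depthPath;
    std::string category;
};

// Hypothetical interface; the real IDataset may declare different functions.
class IDatasetSketch {
public:
    virtual ~IDatasetSketch() = default;
    virtual std::string name() const = 0;
    virtual std::vector<std::string> categories() const = 0;
    virtual std::vector<Sample> samples(const std::string& category) const = 0;
};

// Toy in-memory implementation, standing in for a wrapper around a real
// dataset stored on disk.
class ToyDataset : public IDatasetSketch {
public:
    ToyDataset(std::string name, std::vector<Sample> samples)
        : name_(std::move(name)), samples_(std::move(samples)) {}

    std::string name() const override { return name_; }

    // Distinct category names, in alphabetical order.
    std::vector<std::string> categories() const override {
        std::set<std::string> unique;
        for (const auto& s : samples_) unique.insert(s.category);
        return std::vector<std::string>(unique.begin(), unique.end());
    }

    // All samples belonging to the given category.
    std::vector<Sample> samples(const std::string& category) const override {
        std::vector<Sample> out;
        for (const auto& s : samples_) {
            if (s.category == category) out.push_back(s);
        }
        return out;
    }

private:
    std::string name_;
    std::vector<Sample> samples_;
};
```

Code written against the abstract interface does not need to know which concrete dataset it is reading, which is what allows an aggregated dataset to be treated like any other.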

The aggregation of datasets into new ones is enabled by the HyperDataset component, which in turn implements the IDataset interface, so that any newly created dataset is handled exactly like any other already integrated one. Since every dataset partitions its images into categories, merging different datasets requires a mapping between the categories of the existing datasets and those of the aggregated one. For example, the categories "coffee_mug", "Cup" and "cup" of the Washington, CIN 2D+3D and MV-RED datasets could be mapped to the category "cups" of the aggregated dataset. HyperRGBD provides a standard methodology, together with the associated software tools, to define and realize such mappings. Once the required mappings are established for the existing datasets through the mapCatAssociations map, HyperDataset automatically aggregates the datasets listed in vector<IDataset>. Finally, through suitable components, the user can define different criteria for splitting the resulting dataset into training and test sets.
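The category mapping and the train/test split described above can be sketched as follows. This is a standalone illustration rather than HyperRGBD code: only the name mapCatAssociations comes from the framework, while its type here (a map keyed by dataset name and source category), the Image and SourceDataset structs, and the aggregate and splitEveryNth functions are assumptions. Skipping unmapped categories and holding out every n-th image per category are likewise just example policies.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
struct Image {
    std::string path;
    std::string category;
};

struct SourceDataset {
    std::string name;
    std::vector<Image> images;
};

// Assumed layout: (dataset name, source category) -> aggregated category.
using CatAssociations =
    std::map<std::pair<std::string, std::string>, std::string>;

// Merges all datasets, relabeling each image with its aggregated category.
// Images whose category has no association are skipped (an assumption).
std::vector<Image> aggregate(const std::vector<SourceDataset>& datasets,
                             const CatAssociations& mapCatAssociations) {
    std::vector<Image> merged;
    for (const auto& ds : datasets) {
        for (const auto& img : ds.images) {
            auto it = mapCatAssociations.find(
                std::make_pair(ds.name, img.category));
            if (it == mapCatAssociations.end()) continue;
            merged.push_back(Image{img.path, it->second});
        }
    }
    return merged;
}

struct Split {
    std::vector<Image> train;
    std::vector<Image> test;
};

// Example split criterion: within each category, every n-th image goes to
// the test set and the rest to the training set. With n == 0 nothing is
// held out.
Split splitEveryNth(const std::vector<Image>& images, std::size_t n) {
    Split split;
    std::map<std::string, std::size_t> seen;  // images seen so far per category
    for (const auto& img : images) {
        const std::size_t index = seen[img.category]++;
        if (n > 0 && index % n == n - 1) {
            split.test.push_back(img);
        } else {
            split.train.push_back(img);
        }
    }
    return split;
}
```

Mapping ("Washington", "coffee_mug"), ("CIN 2D+3D", "Cup") and ("MV-RED", "cup") to "cups" would place the mugs and cups of all three datasets in a single aggregated category, which splitEveryNth could then divide into training and test sets.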


The source code can be downloaded from here. The framework requires the OpenCV library for handling images. Moreover, the depth maps and calibration data of the BigBIRD dataset are stored in HDF5 format.


[1] A. Petrelli, L. Di Stefano, "Learning to Weight Color and Depth for RGB-D Image Search" Submitted to International Conference on Image Analysis and Processing (2017).

[2] K. Lai, L. Bo, X. Ren, D. Fox, "A large-scale hierarchical multi-view RGB-D object dataset" International Conference on Robotics and Automation 1817–1824 (2011).

[3] B. Browatzki, J. Fischer, "Going into depth: Evaluating 2D and 3D cues for object classification on a new, large-scale object dataset" International Conference on Computer Vision Workshops (2011).

[4] A. Singh, J. Sha, K.S. Narayan, T. Achim, P. Abbeel, "BigBIRD: A large-scale 3D database of object instances" International Conference on Robotics and Automation 509–516 (2014).

[5] A. Liu, Z. Wang, W. Nie, Y. Su, "Graph-based characteristic view set extraction and matching for 3D model retrieval" Information Sciences 320, 429–442 (2015).