bbow

closure

bbow is a fast index for similarity retrieval of Binary Bag-Of-Word descriptors.

A useful capability for robots (e.g., for SLAM) is to recognize places using visual data. A common approach to this problem is to use invariant visual descriptors such as SIFT or SURF. However, these have the drawback of having relatively large CPU and storage requirements. It has been shown that simpler binary descriptors, such as BRIEF may provide competitive discriminativity despite being faster to match and having a smaller storage footprint.

Nonetheless, for large-scale visual localization and mapping problems exhaustive descriptor matching is infeasible, even with the comparative efficiency of binary descriptors. Therefore a data structure to enable fast search of similar bag-of-words of binary descriptors is necessary.

A popular approach to solve this problem when using traditional real-valued descriptors such as SIFT is Nister's vocabulary tree, which combines hierarchical K-means quantization and an inverted index scheme. However, K-means with an Euclidean distance metric is not adequate for binary descriptors, as it assumes the data points live in a vector space.

bbow is a simple solution to this problem. It implements hierarchical vocabulary tree index with a representation better suited for binary bag of words descriptors: quantization is performed with K-median clustering using the bitwise Hamming distance between descriptors. The algorithm is implemented in C++ for efficiency.

bbow has been used for large-scale VSLAM research at MIT's Marine Robotics Lab.