Abstract:
Advances in machine learning and image feature representations have led to great progress in pattern recognition approaches that recognise up to 1,000 visual object categories. The human brain, however, solves this problem effortlessly: it can recognise roughly 10,000 to 100,000 objects from only a small number of examples. In recent years the bag-of-features approach has been shown to yield state-of-the-art performance in large-scale evaluations. In such systems the visual codebook plays a crucial role, and researchers typically construct it from a large set of training images. This raises the issue of scalability: a large volume of training data becomes difficult to process, while the resulting high-dimensional image representations can make many machine learning algorithms inefficient or cause them to break down entirely. In this work we investigate whether the dominant bag-of-features approach to object recognition continues to improve significantly as the training image set grows. We validate a one-pass clustering algorithm for constructing visual codebooks for object classification on the PASCAL VOC Challenge image set. Our results show that adding more training images does not significantly improve classification performance, but it does increase overall model complexity in terms of storage requirements and computational time. This study therefore suggests an alternative view for the community working on patch-based object recognition: retain more discriminative descriptors rather than follow the big-data hypothesis of simply accumulating ever more training examples.
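The abstract's pipeline can be illustrated with a minimal sketch. The paper does not specify its one-pass clustering algorithm, so the version below assumes a simple leader-style variant: each descriptor either joins its nearest codeword (if within a distance threshold, updated as a running mean) or seeds a new one, in a single scan of the data. The function names and the `radius` parameter are illustrative, not the authors' method.

```python
import numpy as np

def one_pass_codebook(descriptors, radius):
    """One-pass (single-scan) codebook construction, leader-style sketch:
    a descriptor joins the nearest codeword if within `radius`,
    otherwise it seeds a new codeword. Codewords are running means."""
    centroids = []  # codeword vectors
    counts = []     # descriptors absorbed by each codeword
    for d in descriptors:
        if centroids:
            dists = np.linalg.norm(np.asarray(centroids) - d, axis=1)
            j = int(np.argmin(dists))
            if dists[j] <= radius:
                counts[j] += 1
                # incremental mean update of the matched codeword
                centroids[j] += (d - centroids[j]) / counts[j]
                continue
        centroids.append(d.astype(float).copy())
        counts.append(1)
    return np.asarray(centroids)

def encode(descriptors, codebook):
    """Bag-of-features representation: a histogram of
    nearest-codeword assignments over an image's descriptors."""
    hist = np.zeros(len(codebook), dtype=int)
    for d in descriptors:
        hist[int(np.argmin(np.linalg.norm(codebook - d, axis=1)))] += 1
    return hist
```

Because each descriptor is examined exactly once, memory and time grow with the codebook size rather than with repeated scans of the full training set, which is the scalability property the abstract contrasts with conventional multi-pass clustering such as k-means.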