Word clustering is an effective way to reduce the dimensionality of high-dimensional features in the bag-of-words model. In recent years, the bag-of-words model has been successfully introduced into visual recognition and developed significantly. To adequately model complex and diverse visual patterns, a large number of visual words are often used, especially in state-of-the-art visual recognition methods. As a result, existing word clustering algorithms are no longer computationally efficient enough. They can considerably prolong processes such as model updating and parameter tuning, in which word clustering must be applied repeatedly. In this paper, we focus on divisive information-theoretic clustering, one of the most efficient word clustering algorithms in text analysis, and accelerate it to better handle a large number of visual words. We discuss the properties of its cluster membership evaluation function, the KL-divergence, in both the binary and multi-class classification cases, and develop accelerated versions in two different ways. Theoretical analysis shows that the proposed accelerated divisive information-theoretic clustering algorithm handles a large number of visual words much more efficiently. As demonstrated on benchmark datasets in visual recognition, it achieves a speed-up of hundreds of times while closely preserving the clustering performance of the original algorithm.
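For orientation, the following is a minimal sketch of the baseline (non-accelerated) clustering loop. It illustrates how the KL-divergence serves as the cluster membership evaluation function. It is not the accelerated algorithm proposed here. It assumes the usual formulation in which each word carries a class-conditional distribution p(C | w) and a prior p(w), and each cluster is summarized by the prior-weighted mean of its members' distributions. The names `kl_divergence`, `ditc`, `word_class_probs`, `word_priors`, and `init_assign` are hypothetical and chosen for illustration only.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    # Terms with p_i == 0 contribute nothing; q_i is floored to avoid log(0).
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def ditc(word_class_probs, word_priors, init_assign, n_clusters, n_iters=10):
    """Toy divisive information-theoretic clustering loop (illustrative only).

    word_class_probs[t] is p(C | w_t), word_priors[t] is p(w_t), and
    init_assign[t] is the starting cluster index of word t (all assumed inputs).
    """
    assign = list(init_assign)
    n_classes = len(word_class_probs[0])
    for _ in range(n_iters):
        # Cluster distribution: prior-weighted mean of the member distributions.
        centroids = []
        for j in range(n_clusters):
            members = [t for t, a in enumerate(assign) if a == j]
            mass = sum(word_priors[t] for t in members)
            if mass == 0:
                # Empty cluster: fall back to a uniform distribution (assumption).
                centroids.append([1.0 / n_classes] * n_classes)
                continue
            centroids.append([
                sum(word_priors[t] * word_class_probs[t][c] for t in members) / mass
                for c in range(n_classes)
            ])
        # Membership evaluation: move each word to the cluster of smallest KL.
        new_assign = [
            min(range(n_clusters), key=lambda j: kl_divergence(p, centroids[j]))
            for p in word_class_probs
        ]
        if new_assign == assign:
            break  # assignments are stable
        assign = new_assign
    return assign


# Four toy "visual words" over two classes, with equal priors.
probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
priors = [0.25, 0.25, 0.25, 0.25]
clusters = ditc(probs, priors, init_assign=[0, 1, 1, 1], n_clusters=2)
```

In this sketch, every iteration evaluates the KL-divergence between each word and each cluster, so the cost grows with the product of the number of words, clusters, and classes. This repeated evaluation is the step that becomes expensive when the visual vocabulary is large.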