Visual place recognition is a critical and challenging problem in both the robotics and computer vision communities. In this paper, we focus on place recognition for visual Simultaneous Localization and Mapping (vSLAM) systems. These systems have long been limited to paradigms based on handcrafted features, which typically use local visual information of images and are not sufficiently robust to image variations. In this work, we address place recognition with features learned automatically from data. First, we propose a graph-based visual place recognition method. The graph is constructed by combining visual features extracted from convolutional neural networks (CNNs) with the temporal information of the images in a sequence. Second, we propose employing a diffusion process to enhance data association in the graph and thereby achieve more accurate recognition. Finally, we evaluate the proposed method on four commonly used datasets. Experimental results indicate that the proposed method performs significantly better (e.g., 95.37% recall at 100% precision) than FAB-MAP (47.16% recall at 100% precision), a commonly used place recognition method based on handcrafted features, especially on challenging datasets.
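
To make the two proposed steps concrete, the following is a minimal, dependency-free sketch of one way a graph could combine visual similarity with temporal adjacency and then be smoothed by a diffusion process. It is illustrative only and is not the method evaluated in this paper. The function names (`cosine`, `build_graph`, `diffuse`), the parameters (`k`, `temporal_weight`, `alpha`, `iterations`), the nearest-neighbour edge rule, and the specific diffusion update are all assumptions, and the toy three-dimensional vectors merely stand in for CNN descriptors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def build_graph(features, k=1, temporal_weight=1.0):
    """Affinity matrix from visual k-nearest neighbours plus temporal edges.

    `features` holds one descriptor per frame, in sequence order.
    The edge rule is an assumption made for this sketch.
    """
    n = len(features)
    sim = [[max(cosine(features[i], features[j]), 0.0) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    W = [[0.0] * n for _ in range(n)]
    # Visual edges: connect each frame to its k most similar frames.
    for i in range(n):
        others = (j for j in range(n) if j != i)
        for j in sorted(others, key=lambda j: -sim[i][j])[:k]:
            W[i][j] = W[j][i] = sim[i][j]
    # Temporal edges: consecutive frames are always connected.
    for i in range(n - 1):
        W[i][i + 1] = W[i + 1][i] = max(W[i][i + 1], temporal_weight)
    return W

def diffuse(W, alpha=0.8, iterations=50):
    """Iterate A <- alpha * S * A + (1 - alpha) * I, with S = D^-1/2 W D^-1/2.

    This is a generic graph diffusion chosen for illustration.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j])
          if deg[i] > 0 and deg[j] > 0 else 0.0
          for j in range(n)] for i in range(n)]
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        A = [[alpha * sum(S[i][m] * A[m][j] for m in range(n))
              + ((1 - alpha) if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return A

# Toy sequence of five frames; the last frame nearly repeats the first.
features = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1], [1, 0, 0.1]]
W = build_graph(features, k=1)
A = diffuse(W, alpha=0.8)
```

In this sketch, `A[i][j]` can be read as a diffused affinity between frames `i` and `j`. Thresholding such scores would be one possible way to propose place matches, although the threshold and the decision rule are left open here.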