Three-dimensional (3D) shape recognition is a fundamental task in computer vision, and view-based methods have recently achieved state-of-the-art performance on it. However, capturing and exploiting the geometric correspondences between different views remains a key challenge, even though this information is crucial for accurate shape representation. Existing methods often do not model these structured correlations explicitly, which limits their ability to exploit discriminative shape information. To address this limitation, we propose the View-based Graph Convolution and Sampling Fusion Network (View-GFN). View-GFN employs a hierarchical architecture that progressively coarsens a view-graph to learn multi-scale features. Views are treated as graph nodes, and a predefined-value strategy initializes the adjacency matrix (AM) that encodes the initial node correlations. For graph coarsening, we develop a view down-sampling method based on a cluster assignment matrix. Furthermore, a Graph Convolution and Sampling Fusion (CSF) module integrates deep feature embeddings with the topological information derived from view down-sampling. Extensive experiments on benchmark datasets, including ModelNet40 and RGB-D, show that View-GFN achieves a recognition accuracy of 97.8%, surpassing previous methods while reducing the number of model parameters by nearly 50%. These results indicate that the hierarchical fusion strategy captures multi-view geometric information both effectively and efficiently.
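
To make the coarsening step concrete, the following is a minimal sketch of view down-sampling with a cluster assignment matrix. It is an illustration under stated assumptions, not the View-GFN implementation: it assumes a soft assignment matrix S (row-wise softmax over hypothetical logits), pooled features S^T X, and a pooled adjacency S^T A S. The helper names (`init_adjacency`, `coarsen`) and the constant off-diagonal value are hypothetical stand-ins for the predefined-value strategy, whose exact values are not given here.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Row-wise softmax, so each view's cluster weights sum to 1."""
    out = []
    for row in m:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def init_adjacency(num_views, off_diag=0.5, diag=1.0):
    """Hypothetical predefined-value initialization (an assumption):
    a fixed weight between every pair of views and a fixed self-weight."""
    return [[diag if i == j else off_diag for j in range(num_views)]
            for i in range(num_views)]

def coarsen(features, adjacency, assign_logits):
    """Down-sample n view nodes to k cluster nodes.

    features:      n x d view embeddings (X)
    adjacency:     n x n view-graph adjacency (A)
    assign_logits: n x k unnormalized cluster scores (assumed to be learned)
    Returns (S^T X, S^T A S, S).
    """
    s = softmax_rows(assign_logits)                        # n x k
    s_t = transpose(s)                                     # k x n
    pooled_features = matmul(s_t, features)                # k x d
    pooled_adjacency = matmul(matmul(s_t, adjacency), s)   # k x k
    return pooled_features, pooled_adjacency, s

# Usage: coarsen a 4-view graph to 2 cluster nodes.
view_features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [1.0, 3.0]]
view_adjacency = init_adjacency(4)
logits = [[2.0, 0.0], [1.5, 0.1], [0.0, 2.0], [0.2, 1.0]]
coarse_features, coarse_adjacency, assignment = coarsen(
    view_features, view_adjacency, logits)
```

Because each row of S sums to 1, the total feature mass and the total edge weight are preserved across the coarsening step in this sketch. Applying `coarsen` repeatedly with fewer clusters at each level would give one possible hierarchical, multi-scale pipeline of the kind described above.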