3D Gaussian Splatting (3DGS) degrades severely in sparse-view scenarios, often collapsing into artifacts because the optimization is under-constrained. Monocular depth priors provide dense supervision, but their inherent multi-view inconsistency frequently distorts geometry. To address this, we propose GeoTrack-GS, a geometry-first framework that refines noisy depth priors using reliable self-supervised constraints. Specifically, we leverage sparse feature tracks to enforce macro-level reprojection consistency and introduce a micro-level anisotropic regularizer based on K-NN PCA to suppress rank collapse. On this corrected geometry, we design GT-DCA, a geometry-guided deformable cross-attention module that captures view-dependent appearance without compromising structure. A Decoupled Constraint Stabilization strategy further balances these heterogeneous signals during training. Experiments on LLFF and DTU with 3 to 9 input views, and on Mip-NeRF 360 with 12 input views, demonstrate that GeoTrack-GS achieves state-of-the-art geometric fidelity while maintaining rendering quality competitive with existing baselines, effectively reducing floaters and "waxy" surfaces.
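As a rough illustration of the K-NN PCA step named above, the following sketch computes, for each 3D point (for example a Gaussian center), the eigenvalue spectrum of the covariance of its K nearest neighbors. It is a minimal NumPy sketch under stated assumptions, not the GeoTrack-GS implementation: the function names `knn_pca_eigenvalues` and `flatness_ratio`, the brute-force neighbor search, and the default `k=8` are all hypothetical choices made for illustration. The abstract does not give the form of the regularizer, so the sketch stops at the local spectrum and a simple degeneracy indicator.

```python
import numpy as np


def knn_pca_eigenvalues(points, k=8):
    """Eigenvalues (descending) of the local covariance around each point.

    points: (n, 3) array of 3D positions.
    k:      number of nearest neighbours per point (hypothetical default).
    Returns an (n, 3) array of eigenvalues, largest first.
    """
    points = np.asarray(points, dtype=np.float64)
    n = points.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Brute-force pairwise squared distances, O(n^2) memory; fine for a sketch.
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour

    # Indices of the k nearest neighbours of every point (unordered).
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    neigh = points[idx]  # shape (n, k, 3)

    # PCA of each neighbourhood: eigenvalues of its 3x3 covariance matrix.
    centered = neigh - neigh.mean(axis=1, keepdims=True)
    cov = np.einsum("nki,nkj->nij", centered, centered) / k
    evals = np.linalg.eigvalsh(cov)  # ascending order per matrix
    return evals[:, ::-1]


def flatness_ratio(evals, eps=1e-12):
    """Smallest over largest eigenvalue; values near 0 mean the
    neighbourhood is (nearly) planar or linear."""
    return evals[:, 2] / (evals[:, 0] + eps)
```

How such a spectrum would enter a loss is an assumption left open here; one plausible use is to compare each Gaussian's own scale ratios against the local ratios so that its covariance does not collapse to a lower rank than its neighborhood supports.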