To address strong noise, high asynchrony, and pronounced subjectivity of risk labels in multi-source risk signals within trading environments, as well as insufficient model stability under extreme market conditions, a low-noise investment risk prediction method based on multimodal sensing signals and self-supervised representation learning is proposed. Market quotations, order books, terminal interactions, network transmission, device status, and news sentiment are uniformly modeled as risk perception signals. Three components are designed to learn stable, transferable, and interpretable risk features: a temporal masking-based risk structure modeling module, a risk-oriented contrastive learning constraint on the representations, and a strategy that aligns the risk representation with the downstream prediction task. Experimental results show that the proposed method achieves the best performance in investment risk prediction, with mean squared error (MSE), mean absolute error (MAE), and root mean square error (RMSE) of 0.0164, 0.0851, and 0.1281, respectively, outperforming baseline models including generalized autoregressive conditional heteroskedasticity (GARCH), multi-layer perceptron (MLP), long short-term memory (LSTM), temporal convolutional network (TCN), and Transformer. The information coefficient (IC), rank information coefficient (RankIC), and area under the receiver operating characteristic curve (AUC) reach 0.496, 0.462, and 0.817, respectively, indicating stronger risk ranking capability and improved discrimination between high-risk and low-risk states. In the classification task, the proposed method also achieves superior accuracy, precision, recall, and F1-score, indicating that potential high-risk assets can be identified more accurately. Ablation experiments verify the effectiveness of the multimodal fusion, temporal masking, self-supervised contrastive constraint, and task alignment modules.
Robustness experiments further show that lower prediction errors and higher AUC can still be maintained in high-volatility and extreme-shock markets, demonstrating strong noise resistance, stability, and practical application potential in complex sensing scenarios.
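For readers unfamiliar with the reported metrics, the following is a minimal sketch of how the error and ranking metrics could be computed. It assumes IC is the Pearson correlation between predicted and realized risk and RankIC is the Spearman rank correlation, which is the common convention; the function names `rank` and `risk_metrics` are hypothetical, and this is not the authors' evaluation code.

```python
import numpy as np

def rank(x):
    # Ordinal ranks 0..n-1; assumes no tied values (simplification).
    return np.argsort(np.argsort(x)).astype(float)

def risk_metrics(y_true, y_pred):
    # y_true: realized risk values, y_pred: predicted risk values.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    return {
        "MSE": mse,
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(mse)),  # RMSE is the square root of MSE
        # Assumed definition: IC = Pearson correlation.
        "IC": float(np.corrcoef(y_true, y_pred)[0, 1]),
        # Assumed definition: RankIC = Spearman correlation (Pearson on ranks).
        "RankIC": float(np.corrcoef(rank(y_true), rank(y_pred))[0, 1]),
    }
```

Because RMSE is the square root of MSE, the reported values are mutually consistent: the square root of 0.0164 is approximately 0.1281.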