Submitted:
24 September 2025
Posted:
25 September 2025
Abstract
Keywords:
1. Introduction
2. Background and Literature Review
2.1. Gaia DR3 and Astrometric Jitter
| Feature | Description | Relevance to SMBH Detection |
|---|---|---|
| Astrometric Precision | Sub-milliarcsecond positional accuracy, reaching tens of microarcseconds for bright sources | Enables detection of subtle jitter from binary SMBHs |
| Time-Series Data | Multi-epoch observations across mission duration | Captures periodic or irregular jitter signatures |
| Sample Size | Over 1.8 billion sources | Expands search for rare binary SMBH candidates |
2.2. Black Hole Quantum Hair and Graviton Echoes
3. Machine Learning for Binary SMBH Detection in Gaia DR3
3.1. Data Preprocessing
- Outliers and missing values, which can bias clustering.
- Systematic calibration errors, particularly in dense stellar regions.
- Noise filtering, using statistical smoothing or wavelet transforms to highlight genuine jitter signatures.
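The preprocessing steps listed above can be illustrated with a minimal sketch. It assumes a one-dimensional series of astrometric residuals in which missing epochs are stored as `None`; the function names, the 5-MAD threshold, the three-point window, and the sample values are all hypothetical choices for illustration, not values taken from this work. A rolling median stands in for the "statistical smoothing" option; a wavelet-based filter would replace `rolling_median`.

```python
import statistics

def mad_clip(values, threshold=5.0):
    """Mask robust outliers (and keep missing values masked) as None.

    A point is flagged when it lies more than `threshold` scaled MADs from
    the median; the factor 1.4826 makes the MAD comparable to a Gaussian sigma.
    """
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    mad = statistics.median([abs(v - med) for v in present])
    scale = 1.4826 * mad if mad > 0 else float("inf")  # nothing flagged if MAD is 0
    return [v if v is not None and abs(v - med) <= threshold * scale else None
            for v in values]

def rolling_median(values, window=3):
    """Smooth with a centred rolling median, ignoring masked (None) points."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = [v for v in values[max(0, i - half): i + half + 1] if v is not None]
        smoothed.append(statistics.median(chunk) if chunk else None)
    return smoothed

# Made-up residuals (mas): one gross outlier and one missing epoch.
residuals = [0.1, -0.2, 0.0, 9.5, 0.2, None, -0.1, 0.1]
clean = mad_clip(residuals)
smooth = rolling_median(clean)
```

Median-based statistics are used here because a single large excursion would inflate a mean and standard deviation and could hide itself from a sigma-clipping cut.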
3.2. Unsupervised Clustering Methods
1. DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Effective for identifying clusters of stars with correlated astrometric deviations while filtering noise.
2. k-means clustering: Simple partitioning approach, though less effective for irregular cluster shapes.
3. Hierarchical clustering: Provides multi-scale insights, useful for distinguishing between single-star variability and binary-induced jitter.
| ML Method | Strengths | Limitations | Application |
|---|---|---|---|
| DBSCAN | Handles noise; finds odd shapes | Needs careful parameter tuning | Detecting rare candidates |
| k-means | Fast and simple | Struggles with irregular data | Large-scale filtering |
| Hierarchical | Multi-level grouping; visual maps | Slow with very large data | Sub-group analysis |
| Autoencoders | Finds hidden patterns in data | Needs lots of computing power | Feature reduction |
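To make the DBSCAN row of the comparison concrete, the following is a minimal pure-Python sketch of the algorithm, written for readability rather than speed. The two-dimensional feature vectors, `eps`, and `min_samples` are hypothetical placeholders and do not come from Gaia DR3; for catalogue-scale data an optimized implementation such as scikit-learn's `sklearn.cluster.DBSCAN` would normally be used instead.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:  # j is a core point: keep expanding
                queue.extend(j_neighbours)
    return labels

# Hypothetical 2-D features (e.g. two scaled jitter statistics per source).
features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # dense group A
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),              # dense group B
    (10.0, 0.0),                                     # isolated source
]
labels = dbscan(features, eps=0.3, min_samples=3)
```

In this toy input the isolated source receives the noise label `-1`, which is the behaviour summarized as "Handles noise" in the table; the need to choose `eps` and `min_samples` by hand is the "careful parameter tuning" limitation.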
3.4. Case Studies: Potential Candidates in Gaia DR3

4. Machine Learning for Graviton Echo Identification
4.1. Gravitational Wave Datasets
- High-dimensional (time and frequency domains).
- Noisy, affected by seismic, thermal, and instrumental disturbances.
- Sparse in true events, with relatively few confirmed black hole mergers compared to total observation time.
4.2. Machine Learning Approaches
- Supervised Classification: Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can be trained on simulated echo templates, enabling the classification of signals into “echo” vs. “no-echo” categories.
- Unsupervised Anomaly Detection: When echo templates are uncertain, unsupervised models such as autoencoders or clustering methods help identify unusual patterns not explained by standard ringdown waveforms.
- Hybrid Approaches: Combining template matching with ML allows partial reliance on theoretical models while preserving flexibility for unknown signal morphologies.

Table 3. Machine Learning Methods for Gravitational Wave Echo Detection
4.3. Benchmarking and Challenges
- Cross-validation with simulated injections of echo signals into real data.
- Blind testing across multiple detectors.
- Interpretability tools (e.g., saliency maps in CNNs) to verify what features drive classifications.
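The injection-based validation listed above can be sketched as follows. This is a toy setup under stated assumptions: the "echo" is a single damped sinusoid, the background is simulated white Gaussian noise rather than real detector data, and the injection is deliberately loud. The template shape, all parameter values, and the function names are hypothetical. The recovery step is a plain matched filter; in a benchmark of a learned classifier, that classifier would take the place of `matched_filter`.

```python
import math
import random

def echo_template(n=32, freq=0.2, decay=0.1):
    """Toy damped sinusoid standing in for a single echo pulse."""
    return [math.exp(-decay * t) * math.sin(2 * math.pi * freq * t) for t in range(n)]

def inject(noise, template, offset, amplitude):
    """Return a copy of the noise with a scaled template added at `offset`."""
    data = list(noise)
    for k, h in enumerate(template):
        data[offset + k] += amplitude * h
    return data

def matched_filter(data, template):
    """Sliding inner product of the data with the unit-norm template."""
    norm = math.sqrt(sum(h * h for h in template))
    unit = [h / norm for h in template]
    return [sum(data[i + k] * u for k, u in enumerate(unit))
            for i in range(len(data) - len(unit) + 1)]

rng = random.Random(0)  # fixed seed so the experiment is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(512)]
template = echo_template()
data = inject(noise, template, offset=200, amplitude=20.0)

stat = matched_filter(data, template)
recovered = max(range(len(stat)), key=lambda i: abs(stat[i]))  # loudest offset
```

Repeating the experiment over many offsets and amplitudes gives a detection-efficiency curve. Real strain data are not white, so an actual pipeline would need to whiten the data with an estimated noise spectrum before filtering.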

5. Integrating Astrometric and Gravitational Wave Data
5.1. Motivation for Data Fusion
- Complementarity of signals: Astrometric jitter reflects orbital motions at kiloparsec scales, while gravitational waves capture dynamical evolution near merger events.
- Cross-validation: Simultaneous evidence of a candidate from both datasets strengthens confidence in detection.
- Extended parameter space: Integrating datasets allows exploration of SMBH properties such as spin, mass ratios, and potential signatures of quantum hair that cannot be constrained by a single observation method.
5.2. Methodological Approaches
- A. Time-series synchronization: Aligning Gaia light curves with gravitational wave strain signals to search for correlated fluctuations.
- B. Multi-modal machine learning models: Using deep neural networks capable of learning from both astrometric and gravitational datasets.
- C. Joint likelihood frameworks: Statistical approaches combining posterior distributions from Gaia and LIGO/Virgo analyses to improve parameter estimation.
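The simplest instance of a joint likelihood framework is the combination of two independent Gaussian constraints on one shared parameter. The sketch below assumes independence, Gaussian shapes, and a flat prior, so that multiplying the two constraints amounts to inverse-variance weighting; if both inputs are posteriors built with the same informative prior, that prior must be divided out once before combining. The parameter (a log mass) and the numbers are hypothetical.

```python
def combine_gaussian(mean_a, sigma_a, mean_b, sigma_b):
    """Product of two independent Gaussian constraints on one parameter.

    Returns the mean and standard deviation of the combined Gaussian,
    i.e. the inverse-variance weighted average and its uncertainty.
    """
    w_a, w_b = 1.0 / sigma_a**2, 1.0 / sigma_b**2
    mean = (w_a * mean_a + w_b * mean_b) / (w_a + w_b)
    sigma = (w_a + w_b) ** -0.5
    return mean, sigma

# Hypothetical constraints on log10(M / M_sun) from two independent analyses.
mean, sigma = combine_gaussian(8.4, 0.4, 8.0, 0.2)
```

The combined uncertainty is always smaller than either input, which is the sense in which a joint analysis can "improve parameter estimation". Non-Gaussian or correlated posteriors would need a sample-based or full-likelihood treatment.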
6. Case Studies and Applications
6.1. Gaia Binary SMBH Candidates
6.2. Graviton Echo Searches in LIGO/Virgo Data
6.3. Cross-Validation Across Domains

7. Conclusion
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
