Submitted: 24 June 2025
Posted: 10 July 2025
Abstract
Keywords:
1. Introduction
2. Dimensionality Reduction
2.1. Principal Component Analysis (PCA) and Kernel PCA
2.2. Multi-dimensional Scaling and Isometric Feature Mapping
2.3. Locally Linear Embedding
2.4. t-distributed Stochastic Neighbor Embedding
2.5. Examples of Applications of Different Dimensionality Reduction Algorithms
3. Clustering
3.1. Gaussian Mixture Model
3.2. K-means
3.3. Hierarchical Clustering (HC)
3.4. Density-Based Spatial Clustering of Applications with Noise
3.5. Hierarchical Density-Based Spatial Clustering of Applications with Noise
3.6. Fuzzy C-means Clustering
3.7. Examples of Applications of Different Clustering Algorithms
4. Neural Network
4.1. Self-Organizing Map
4.2. Auto-Encoder and Variational Auto-Encoder
5. Other Applications of Unsupervised Machine Learning
5.1. Anomaly Detection
5.2. Symbolic Regression
6. Conclusion
Author Contributions
Funding
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| ML | Machine learning |
| PCA | Principal component analysis |
| SVD | Singular value decomposition |
| MDS | Multi-dimensional scaling |
| Isomap | Isometric feature mapping |
| LLE | Locally linear embedding |
| t-SNE | t-distributed stochastic neighbor embedding |
| GMM | Gaussian mixture model |
| EM | Expectation-maximization |
| BIC | Bayesian information criterion |
| VBGMM | Variational Bayesian Gaussian mixture model |
| DPGMM | Dirichlet process Gaussian mixture model |
| HC | Hierarchical clustering |
| DBSCAN | Density-based spatial clustering of applications with noise |
| HDBSCAN | Hierarchical density-based spatial clustering of applications with noise |
| FCC | Fuzzy C-means clustering |
| SOM | Self-organizing map |
| AE | Auto-encoder |
| GPU | Graphics processing unit |
| VAE | Variational auto-encoder |
| iForest | Isolation Forest |
| SR | Symbolic regression |

| Data type | PCA | MDS | Isomap | LLE | t-SNE |
|---|---|---|---|---|---|
| Spectral data | 251 | 1 | 2 | 5 | 18 |
| Image | 106 | 0 | 1 | 0 | 7 |
| Catalogs | 35 | 0 | 1 | 5 | 14 |
| Photometry data | 46 | 0 | 0 | 2 | 7 |
| Light curves | 31 | 0 | 0 | 3 | 4 |
| Polarimetric data | 9 | 0 | 0 | 0 | 0 |
| Latent space | 3 | 0 | 1 | 0 | 0 |

| Data type | GMM | K-means | HC | DBSCAN | HDBSCAN | FCC |
|---|---|---|---|---|---|---|
| Spectral data | 43 | 76 | 19 | 18 | 11 | 1 |
| Catalogs | 48 | 27 | 22 | 31 | 22 | 1 |
| Image | 36 | 70 | 13 | 13 | 2 | 3 |
| Photometry data | 51 | 14 | 8 | 14 | 26 | 1 |
| Bivariate data | 11 | 11 | 12 | 1 | 1 | 0 |
| Light curves | 10 | 9 | 3 | 3 | 3 | 0 |
| Polarimetric data | 2 | 7 | 0 | 0 | 0 | 0 |
| Latent space | 3 | 0 | 0 | 0 | 0 | 0 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).