Submitted:
23 December 2025
Posted:
24 December 2025
Abstract
Keywords:
Methods
Method Overview
Inputs
- Radial distance r, measured in kiloparsecs
- Baryonic velocity Vb, measured in kilometers per second
Fixed Numerical Constants
- Constant A. Value: 1.0. Units: dimensionless
- Constant B. Value: 0.35. Units: dimensionless
- Constant C. Value: 8.0. Units: kiloparsecs
- Constant D. Value: −1.0. Units: dimensionless exponent
No additional constants are introduced or implied.
Numerical Evaluation Procedure
- T = 1.0 + r / 8.0
- U = T^(−1.0)
- W = 0.35 × U
- Z = 1.0 + W
- S = √Z
- V_DE5 = Vb × S
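The six steps above can be sketched as a small Python function. This is a minimal illustration rather than the authors' code: the function name `de5_velocity` and its argument names are hypothetical, and only the numerical constants 1.0, 0.35, 8.0, and −1.0 come from the text.

```python
import math


def de5_velocity(r_kpc: float, vb_kms: float) -> float:
    """Return V_DE5 in km/s for radius r (kpc) and baryonic velocity Vb (km/s)."""
    t = 1.0 + r_kpc / 8.0   # T = 1.0 + r / 8.0
    u = t ** -1.0           # U = T^(-1.0)
    w = 0.35 * u            # W = 0.35 * U
    z = 1.0 + w             # Z = 1.0 + W
    s = math.sqrt(z)        # S = sqrt(Z)
    return vb_kms * s       # V_DE5 = Vb * S


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the paper's data.
    print(de5_velocity(8.0, 100.0))
```

At r = 0 the factor S equals √1.35 ≈ 1.162, and it decreases toward 1 as r grows, so the prediction approaches Vb at large radii.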
Output
- Predicted circular velocity V_DE5, measured in kilometers per second
No additional transformations, adjustments, or evaluations are performed within this method.
Results
Full sample:
- Mean RMSE: 1.827524
- Median RMSE: 0.923729
- Mean Pearson: 0.983859
- Median Pearson: 0.998854
- Mean Lin’s CCC: 0.982975
- Median Lin’s CCC: 0.998853
After outlier removal:
- Mean RMSE: 1.635571
- Median RMSE: 0.911535
- RMSE outliers removed: 5
- Mean Pearson: 0.997288
- Median Pearson: 0.999120
- Pearson outliers removed: 29
- Mean Lin’s CCC: 0.997275
- Median Lin’s CCC: 0.999120
- CCC outliers removed: 29
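The text does not state which rule was used to flag outliers. As a hedged sketch of how full and cleaned means and medians of a per-galaxy metric could be compared, the example below assumes Tukey's 1.5 × IQR fences. The function names `tukey_clean` and `summarize`, the fence multiplier, and the sample values are all illustrative assumptions.

```python
import statistics


def tukey_clean(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lo <= v <= hi]


def summarize(values):
    """Return full and cleaned mean/median plus the number of removed values."""
    cleaned = tukey_clean(values)
    return {
        "full_mean": statistics.mean(values),
        "full_median": statistics.median(values),
        "cleaned_mean": statistics.mean(cleaned),
        "cleaned_median": statistics.median(cleaned),
        "outliers_removed": len(values) - len(cleaned),
    }


if __name__ == "__main__":
    # Made-up per-galaxy RMSE values with one obvious outlier.
    rmse_per_galaxy = [0.8, 0.9, 1.0, 1.1, 1.2, 9.0]
    print(summarize(rmse_per_galaxy))
```

A different rule (for example a z-score threshold) would remove a different number of galaxies, so the counts reported above can only be reproduced with the rule actually used in the notebook.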
Conclusions
Funding
Supplementary Materials Statement
Data Availability Statement
Data Source
- Radial distance from galactic center
- Observed rotation velocity
- Baryonic rotation velocity derived from luminous matter
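The sketch below shows one way these three quantities might be read from a whitespace-delimited rotation-model table. It rests on several assumptions that are not stated in the text: the column order in `COLUMNS` should be checked against the header of the actual files, the quadrature combination of gas, disk, and bulge terms is a common convention rather than a confirmed description of how Vb was derived here, and `ml_disk` and `ml_bulge` are hypothetical scaling parameters.

```python
import math

# Assumed column order of a rotation-model table; verify against the file header.
COLUMNS = ("rad", "vobs", "errv", "vgas", "vdisk", "vbul", "sbdisk", "sbbul")


def parse_rotmod(text):
    """Parse whitespace-delimited rows, skipping blank lines and '#' comments."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        values = [float(tok) for tok in line.split()]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def baryonic_velocity(row, ml_disk=1.0, ml_bulge=1.0):
    """Combine gas, disk, and bulge terms in quadrature (assumed recipe).

    The signed square v*|v| keeps the sign of each component's contribution;
    a negative total is clamped to zero before the square root.
    """
    total = (row["vgas"] * abs(row["vgas"])
             + ml_disk * row["vdisk"] * abs(row["vdisk"])
             + ml_bulge * row["vbul"] * abs(row["vbul"]))
    return math.sqrt(max(total, 0.0))


if __name__ == "__main__":
    # Made-up two-row sample, not real SPARC values.
    sample = """# Rad Vobs errV Vgas Vdisk Vbul SBdisk SBbul
1.0 50.0 2.0 3.0 4.0 0.0 10.0 0.0
2.0 60.0 2.0 6.0 8.0 0.0 5.0 0.0
"""
    for row in parse_rotmod(sample):
        print(row["rad"], row["vobs"], baryonic_velocity(row))
```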
Ethical Statement
Acknowledgments
Appendixes A–D. Intentionally Omitted
Appendix E. DE5 Velocity Definition and Numerical Evaluation
Appendix E.1. Purpose
Appendix E.2. Inputs
- Radial distance r[i], measured in kiloparsecs
- Baryonic velocity Vb[i], measured in kilometers per second
Appendix E.3. Fixed Numerical Constants
Appendix E.4. DE5 Velocity Evaluation
- Divide r[i] by 8.0
- Add 1.0 to the result
- T[i] = 1.0 + r[i] / 8.0
- U[i] = T[i] raised to the power −1.0
- W[i] = 0.35 × U[i]
- Z[i] = 1.0 + W[i]
- S[i] = square root of Z[i]
- V_DE5[i] = Vb[i] × S[i]
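For checking a hand calculation, it can help to keep every intermediate quantity per index. The following sketch does that for two equal-length lists. The function name `de5_steps`, the dictionary layout, and the input checks (equal lengths, non-negative radius) are additions for illustration and are not part of the method as stated.

```python
import math


def de5_steps(r, vb):
    """Evaluate the per-index steps and keep T, U, W, Z, S, and V_DE5."""
    if len(r) != len(vb):
        raise ValueError("r and vb must have the same length")
    out = {"T": [], "U": [], "W": [], "Z": [], "S": [], "V_DE5": []}
    for r_i, vb_i in zip(r, vb):
        if r_i < 0.0:
            # Safeguard added for this sketch; not part of the stated method.
            raise ValueError("radial distance must be non-negative")
        t_i = 1.0 + r_i / 8.0      # T[i] = 1.0 + r[i] / 8.0
        u_i = t_i ** -1.0          # U[i] = T[i]^(-1.0)
        w_i = 0.35 * u_i           # W[i] = 0.35 * U[i]
        z_i = 1.0 + w_i            # Z[i] = 1.0 + W[i]
        s_i = math.sqrt(z_i)       # S[i] = sqrt(Z[i])
        # Dict keys iterate in insertion order, matching the tuple below.
        for key, value in zip(out, (t_i, u_i, w_i, z_i, s_i, vb_i * s_i)):
            out[key].append(value)
    return out


if __name__ == "__main__":
    # Illustrative inputs only.
    table = de5_steps([0.0, 8.0, 16.0], [50.0, 100.0, 120.0])
    for name, column in table.items():
        print(name, column)
```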
Appendix E.5. Output
- Predicted circular velocity V_DE5[i], measured in kilometers per second
Appendix E.6. Explicit Formula (Single-Line Form)
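Composing the steps of Appendix E.4 gives the single-line expression below. It is reconstructed here from those steps, not copied from the original typesetting; the second line uses the identity (1 + r/8)^(−1) = 8/(8 + r) with 0.35 × 8.0 = 2.8.

```latex
V_{\mathrm{DE5}}[i] = V_b[i]\,\sqrt{1.0 + 0.35\left(1.0 + \frac{r[i]}{8.0}\right)^{-1.0}}
                    = V_b[i]\,\sqrt{1 + \frac{2.8}{8.0 + r[i]}}
```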
Appendix F. Numerical Implementation (Executable Python)
Appendix F.1. Purpose
- Runs immediately using internally generated synthetic data
- Implements exactly the same numerical operations used in the paper
- Produces valid outputs without any external files
- Gives explicit instructions for reproducing the reported numbers using real SPARC data
Appendix F.2. Executable Python Code (Synthetic Data Included)
| Metric | Full Mean | Full Median | Cleaned Mean | Cleaned Median | Outliers Removed |
| RMSE (DE5) | 1.827524 | 0.923729 | 1.635571 | 0.911535 | 5 |
| Pearson Correlation (DE5) | 0.983859 | 0.998854 | 0.997288 | 0.999120 | 29 |
| Lin's CCC (DE5) | 0.982975 | 0.998853 | 0.997275 | 0.999120 | 29 |
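The notebook code itself is not reproduced in this text. As a stand-in, the following minimal sketch shows how the three metrics in the table could be computed for one synthetic curve. Everything here is an assumption made for illustration: the function names, the toy baryonic curve, the noise level, and the use of population (1/n) moments in Lin's CCC. It will not reproduce the numbers above, which come from SPARC data.

```python
import math
import random


def rmse(obs, pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def _moments(x, y):
    """Return means, population variances, and population covariance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def pearson(x, y):
    """Pearson correlation coefficient."""
    _, _, sxx, syy, sxy = _moments(x, y)
    return sxy / math.sqrt(sxx * syy)


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    r = [0.5 * i for i in range(1, 41)]                    # synthetic radii, kpc
    vb = [120.0 * (1.0 - math.exp(-x / 3.0)) for x in r]   # toy baryonic curve
    pred = [v * math.sqrt(1.0 + 0.35 * (1.0 + x / 8.0) ** -1.0)
            for x, v in zip(r, vb)]
    obs = [p + rng.gauss(0.0, 2.0) for p in pred]          # toy "observed" curve
    print(rmse(obs, pred), pearson(obs, pred), lin_ccc(obs, pred))
```

Unlike Pearson's coefficient, Lin's CCC penalizes a constant offset between the two curves, which is why the two are reported separately.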
Appendix G. Guide to Using This Colab Notebook
Appendix G.1. Instructions
Appendix G.1.1. Getting Started: Google Colab Setup
- Open the Notebook: Ensure you are in a Google Colab environment. If you received this notebook as a file, upload it to Colab by going to File > Upload notebook.
- Runtime: It's recommended to use a standard Python 3 runtime. No special GPU/TPU is needed.
Appendix G.1.2. Data Acquisition: SPARC Data (Rotmod_LTG.zip)
- Obtain this file from a reliable astronomical data source. SPARC data are usually linked from the academic papers that use them or hosted in astronomical data repositories.
- Search online for "SPARC data Rotmod_LTG.zip" or similar terms to find a download link.
- Once you have the Rotmod_LTG.zip file downloaded to your local machine, return to this Google Colab notebook.
- Click the folder icon (Files) on the left sidebar to open the File Browser.
- Click the Upload to session storage icon (it looks like a file with an up arrow) in the File Browser pane.
- Navigate to where you saved Rotmod_LTG.zip on your local machine and select it.
- Crucially, ensure the file is uploaded directly into the root /content/ directory. Do not place it in a subfolder.
- Verify Upload: After uploading, you can run !ls -lh /content/ in a new code cell to confirm Rotmod_LTG.zip is listed and has a reasonable file size (typically around 100-700 KB). If the file is not there, or is very small, the upload may have failed or the file is corrupted.
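The same check can be done from Python instead of the shell. The sketch below is only an illustration: the function name `check_upload` and the `min_bytes` threshold are hypothetical, and the notebook's own verification logic may differ.

```python
import os
import zipfile


def check_upload(path="/content/Rotmod_LTG.zip", min_bytes=1024):
    """Return (ok, message) describing whether the uploaded archive looks usable.

    min_bytes is a hypothetical lower bound chosen for illustration.
    """
    if not os.path.isfile(path):
        return False, f"{path} not found"
    size = os.path.getsize(path)
    if size < min_bytes:
        return False, f"{path} is only {size} bytes"
    if not zipfile.is_zipfile(path):
        return False, f"{path} is not a valid zip archive"
    return True, f"{path} looks fine ({size / 1024:.0f} KB)"


if __name__ == "__main__":
    ok, message = check_upload()
    print(ok, message)
```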
Appendix G.1.3. Running the Notebook
- Click Runtime > Run all.
- Or run each cell individually by clicking the "Play" button (▶) next to the cell.
- Setup and Data Loading: Cells related to setting up directories, defining DE5 functions, and loading Rotmod_LTG.zip (e.g., the large cell containing import numpy as np, import pandas as pd, etc.). This cell will perform many critical steps:
  - Verifying and unzipping Rotmod_LTG.zip.
  - Loading galaxy data.
  - Generating DE5 predictions (original, new).
  - Applying vertical shifts to all velocity curves.
  - Applying linear corrections to the shifted DE5 model.
  - Parsing hardcoded galaxy classification data.
- Individual Outputs and Master Excel Creation: The cell that generates individual CSVs, PNGs, and the final galaxy_analysis_summary.xlsx file. This cell will save many files to the /content/individual_galaxy_outputs_final/ directory.
- Outlier Analysis (Optional): Cells that perform outlier detection and recalculate metrics after outlier removal. This provides a more robust understanding of model performance.
- Summary of Metrics: The cell presenting the comparison of synthetic vs. real data metrics, including outlier analysis results.
Appendix G.1.4. Understanding the Outputs
- {galaxy_name}_velocities.csv: For each galaxy, a CSV containing its radial distances (r), observed shifted velocity (Vb_observed), and the final DE5 model prediction (DE5).
- {galaxy_name}_velocity_plot.png: For each galaxy, a PNG image showing the observed velocity curve compared to the DE5 model prediction.
- master_galaxy_performance_summary_corrected.csv: A CSV file summarizing key performance metrics (RMSE, Pearson Correlation, Lin's CCC) for the DE5 model for all galaxies.
- galaxy_analysis_summary.xlsx: This is the main consolidated output. It's an Excel workbook structured as follows:
  - Overall_Summary sheet: Contains the df_master_summary table (performance metrics for all galaxies).
  - Individual Galaxy Sheets: For each galaxy, a dedicated sheet (named after the galaxy) includes:
    - Its numerical velocity data table (radial distance, observed velocity, DE5 model predictions).
    - An embedded image of its corresponding velocity plot.
  - Global Metrics: Initial RMSE, Pearson, Lin CCC, and the global a and b coefficients for the entire SPARC dataset.
  - Outlier Analysis Summary: A table showing mean/median of RMSE, Pearson, and Lin's CCC before and after outlier removal, along with the count of outliers. This provides insight into the robustness of the metrics.
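As an illustration of the per-galaxy CSV described in the list above, the sketch below writes one file with the column names r, Vb_observed, and DE5. The function name `write_velocity_csv`, the galaxy name, and the sample values are hypothetical, and the notebook itself may write these files differently (for example with pandas).

```python
import csv
import os


def write_velocity_csv(out_dir, galaxy_name, r, v_observed, v_de5):
    """Write one {galaxy_name}_velocities.csv file and return its path."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{galaxy_name}_velocities.csv")
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["r", "Vb_observed", "DE5"])   # header row
        writer.writerows(zip(r, v_observed, v_de5))    # one row per radius
    return path


if __name__ == "__main__":
    # Hypothetical galaxy name and made-up values, written to a local folder.
    print(write_velocity_csv("example_outputs", "ExampleGalaxy",
                             [1.0, 2.0], [50.0, 60.0], [51.0, 59.5]))
```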
Appendix G.1.5. Downloading the Master Excel Workbook
- Click the folder icon (Files) on the left sidebar.
- Navigate to /content/individual_galaxy_outputs_final/.
- Right-click on the file named galaxy_analysis_summary.xlsx.
- Select Download from the context menu.
Appendix G.1.6. Troubleshooting
- ModuleNotFoundError: If you encounter this, ensure the necessary pip install commands at the beginning of the notebook (if any) have been run successfully.
- RuntimeError: Manually uploaded Rotmod_LTG.zip is missing...: This means the Rotmod_LTG.zip file was not found or was too small/corrupted after upload. Re-check your upload to /content/ as described in Appendix G.1.2.
- Plots/Excel warnings about missing files: If the Excel sheet shows WARNING: Plot file not found..., it indicates that the individual PNG plots were not generated or were saved in a different location. Ensure the cells generating individual plots (plt.savefig) were executed correctly.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
