Submitted: 30 May 2025
Posted: 02 June 2025
Abstract
Keywords:
1. Introduction
2. Theoretical Framework
2.1. Nonrandom DNA Walker
- Z is the strictly positive normalization function (partition sum) guaranteeing that,
- β and A are co-determined by the following equation:
- Generally, q can be understood as the degree of nonrandomness of the ordering process under scrutiny since for q → 1 the Boltzmann-Gibbs entropy functional is obtained:
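The q → 1 limit mentioned above can be checked numerically. The following is a minimal sketch, assuming the standard discrete Tsallis functional S_q = (1 − Σ p_i^q)/(q − 1) with the constant k set to 1. The function names and the example distribution are illustrative assumptions and are not taken from the paper.

```python
import math


def boltzmann_gibbs_entropy(probs):
    """Boltzmann-Gibbs (Shannon) entropy in nats, with k = 1."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def tsallis_entropy(probs, q):
    """Standard discrete Tsallis entropy with k = 1 (assumed form).

    Falls back to the Boltzmann-Gibbs expression at q = 1, where the
    general formula is a 0/0 limit.
    """
    if abs(q - 1.0) < 1e-12:
        return boltzmann_gibbs_entropy(probs)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


# Hypothetical nucleotide frequencies (A, C, G, T), for illustration only.
example_probs = [0.3, 0.2, 0.2, 0.3]

# As q approaches 1, the Tsallis value approaches the Boltzmann-Gibbs value.
for q in (1.5, 1.1, 1.01, 1.001):
    print(q, tsallis_entropy(example_probs, q))
print("BG", boltzmann_gibbs_entropy(example_probs))
```

In this sketch the printed values should move toward the Boltzmann-Gibbs entropy as q decreases toward 1, so the distance of q from 1 acts as a measure of departure from the Boltzmann-Gibbs case.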
2.2. Data Acquisition
2.2.1. HERV Database
2.2.2. Construction of Data
4. Results
4.1. Hurst Exponent

4.2. q Stationary (Tsallis)
4.3. Complexity Factor (COFA)
4.4. K-Means Clustering
5. Discussion
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Data and Code Availability
Conflicts of Interest
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
