Preprint
Article

This version is not peer-reviewed.

Geometrodynamics of Nipah Virus Evolution

Submitted: 06 April 2026

Posted: 08 April 2026


Abstract
In this paper, 32 complete genomic sequences of human-origin Nipah viruses registered in GenBank® from 1999 to 2025 are analyzed by tracking the distributions of ATG triplets with a search method we developed recently. The trajectories of these triplets are constructed, and their divergence parameter is calculated. Its increased value is typical of highly mutated viruses that can form separate families with distinct properties. This conclusion agrees with the large deviation observed in the fractal dimension of the distribution of inter-ATG-triplet distances in the genomic sequences. The simulation results are compared with earlier data obtained for SARS-CoV-2, MERS-CoV, dengue, and Ebola viruses.

1. Introduction

The Nipah virus (Henipavirus nipahense) was first identified in Malaysia in 1998 and subsequently spread among humans, pigs, and dogs in several South and Southeast Asian countries, including Bangladesh, India, Indonesia, the Philippines, and Singapore. The most recent infection in India was registered at the beginning of 2026 [1].
This zoonotic virus originated in bats and spread among humans and animals through contaminated fruits. The disease is also transmitted between people through biological fluids and respiratory droplets in crowded hospital settings. The mortality rate ranges from 40% to 80% [2,3,4,5,6,7,8,9,10], reaching 91% during the 2018-2019 outbreak in India [2,10]. The virus mainly affects the brain (encephalitis) and the lungs. Currently, no effective treatments or vaccines are available, although many approaches are under research and development [11,12].
The virus varies because of random mutations; for example, the genome of the Malaysian strain (NiV-M) evolves at a rate of $4.64 \times 10^{-4}$ substitutions per site per year [7]. Additionally, the potential formation of new clusters was shown by studying the genomes of bat and human viruses during the Kerala (India) 2018-2019 outbreak [10]. Therefore, monitoring genomic variation is essential because of this virus's high mutation rate and the need to develop adaptive treatments and vaccines for this severe infection.
Genomic studies start with instrumental sequencing of isolated viral RNAs, followed by numerical analysis of the resulting sequences to identify genes and build phylogenetic trees of mutated viral samples by comparing them symbol by symbol and calculating neighborhood distances. This analysis provides criteria for dividing sequences into distinct clades that may have different pathogenic effects. From a mathematical standpoint, the process involves problems of exponential complexity, so the computational resources required grow quickly with the number and length of the sequences. Besides this approach, genomic sequences can also be compared using non-alignment techniques that do not require symbol-to-symbol comparisons. Some of these involve creating images of viral sequences and analyzing them qualitatively and quantitatively (see review [13]).
The goal of this contribution is to study NiV genomic sequences using a recently developed non-alignment approach [13,18,19]. It traces and images the positions of ATG triplets in genomic sequences as analytically simple trajectories. The divergence of these trajectories and the fractal parameters of the ATG distributions along the analyzed genomes are then calculated.
Although the method tracks any repeating patterns, including individual nucleotides, the ATG observations focus on severe mutations and on the variation in the number of ATG triplets and codons in the studied sequences.
The paper is organized as follows. Section 2 briefly reviews existing sequencing instrumentation and software tools and outlines the ATG-walk method. The studied human-origin complete genomic sequences are listed in Appendix A. Simulation results are presented in Section 3 and summarized in Section 4.

2. Materials and Methods

Because the virus constantly mutates, monitoring its genomes and biological traits is crucial. Genomic sequences are obtained through various instruments and software tools, with accuracy improving over time [20,21,22]. Much of the virus data collected worldwide is stored in GenBank® [23]. Unfortunately, only a limited number of human-origin Nipah virus sequences are available, and most of these samples were analyzed in this study. The list of these genomes and their calculated parameters is provided in Table A1 of Appendix A.
So far, all Nipah sequences are classified into several main groups: NiV-M (Malaysia) and NiV-BD1 and NiV-BD2 (Bangladesh). Another group mentioned in the literature is NiV-I (India), identified after the 2018-2019 outbreak in Kerala, India [10]. This classification was based on analyses of phylogenetic trees generated using various mathematical methods and software tools [2,3,4,5,6,7,8,9,10,11,12], as well as observations of disease symptoms, death rates, and other factors.
Beyond purely numerical phylogenetic tree structures derived from time-consuming alignment methods, viral genomes can also be examined by constructing geometric representations or genomic walks of different types (reviewed in Ref. [13]). Typically, the positions of nucleotides along a sequence form a set of points that can be connected by lines to create a curve known as a DNA walk [14].
In some cases, depending on the visualization techniques used [13,14,15,16,17], these sets of points are easy to analyze visually or can be analyzed with statistical and signal-processing methods, without needing algorithms of exponential complexity.
In Ref. [18], a new algorithm of this kind was proposed and examined. It identifies the positions of individual nucleotides or of their repeated combinations in genomic sequences. These points are connected by imaginary stretches and arranged along a simple curve that winds around a straight line and is easy to analyze. Depending on the desired level of detail, separate trajectories for each nucleotide or for any repeated nucleotide pattern can be plotted in a single figure. One of these visualization techniques involves tracing ATG triplets, which either start coding regions or occur inside them.
Using these ATG walks, the evolution of SARS-CoV-2, MERS-CoV, dengue, and Ebola viruses was studied in 2021 [18]. Compared to the last two viruses, SARS-CoV-2 was predicted to have a relatively stable ATG structure, meaning it is less affected by significant variations in the lengths and counts of coding regions in viral sequences. Despite ongoing mutations, these properties remain valid today [13,19].
The techniques discussed above are applied here to analyze Nipah virus human-origin sequences available in GenBank® [23].
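The idea of tracing ATG triplets can be illustrated with a minimal sketch. It is not the metric-based algorithm of Ref. [18]: it assumes, for illustration only, that a walk is the set of points (x, y) where x is the position of the k-th ATG triplet in the sequence and y = k is its ordinal number. The function names and the toy sequence are hypothetical.

```python
from typing import List, Tuple


def find_triplet_positions(sequence: str, triplet: str = "ATG") -> List[int]:
    """Return the 1-based start positions of every occurrence of `triplet`."""
    # Accept RNA-style input (U instead of T) and lowercase letters.
    seq = sequence.upper().replace("U", "T")
    positions = []
    start = seq.find(triplet)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = seq.find(triplet, start + 1)
    return positions


def atg_walk(sequence: str) -> List[Tuple[int, int]]:
    """Assumed walk: x = position of the k-th ATG, y = k (cumulative count)."""
    return [(x, k) for k, x in enumerate(find_triplet_positions(sequence), start=1)]


# Toy sequence (not a real genome) with ATG at positions 3, 8, and 14.
toy = "CCATGAAATGTTTATGC"
print(atg_walk(toy))
```

Under this assumption, a sequence with shorter inter-ATG intervals reaches a given ordinal number earlier, so its curve separates from the others as the position grows.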

3. Results

The ATG walks were derived from 32 complete viral genome samples of NiV-M, NiV-BD, and NiV-I. The simulation results are shown in Figure 1a,b and in Table A1 (Appendix A). Most traces (NiV-M and NiV-BD) are very similar, although the cumulative differences between them increase toward the end of the sequences.
Among these samples, four genomes collected in Kerala in 2018 produce the curves labeled 22-25 in the enlarged Figure 1b. These curves lie relatively far from the other 28, especially beyond the centers of the sequences. This indicates that, in this region, the inter-ATG intervals are shorter than those in the NiV-M and NiV-BD samples.
All sequences can be quite similar in length, as microbiologists previously mentioned [2], but RNAs carry spatial information, and the differences between sequences lie in how the nucleotides are distributed in them.
In addition to visual analysis, the method enables quantitative estimates. For this purpose, the relative width of a set of 32 trajectories is used.
For comparison, the normalized cumulative difference of the ATG curves is calculated at two points, $x = 12620$ and $x = 17930$. It gives, correspondingly, $\delta y_{1,32} = 2\,(y_{32} - y_1)/(y_1 + y_{32}) \approx 6.2\%$ and $3.4\%$ over the roughly 26 years of observed evolution (1999-2025, Table A1, Appendix A).
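A short sketch of this estimate follows. It assumes that the height y of a curve at position x is the number of ATG triplets that start at or before x, and that the normalized difference is 2(y_b - y_a)/(y_a + y_b) expressed in percent. The helper names and the two position lists are hypothetical toy data, not values from the studied genomes.

```python
from bisect import bisect_right
from typing import Sequence


def walk_height(atg_positions: Sequence[int], x: int) -> int:
    """Number of ATG triplets starting at or before position x.

    `atg_positions` must be sorted in ascending order.
    """
    return bisect_right(atg_positions, x)


def normalized_difference(y_a: float, y_b: float) -> float:
    """Normalized difference 2 (y_b - y_a) / (y_a + y_b), in percent."""
    return 200.0 * (y_b - y_a) / (y_a + y_b)


# Toy ATG start positions for two hypothetical sequences.
positions_a = [3, 40, 95, 150, 210]
positions_b = [3, 38, 80, 120, 160, 205]

x = 200
y_a = walk_height(positions_a, x)  # four triplets start at or before x
y_b = walk_height(positions_b, x)  # five triplets start at or before x
print(f"delta y at x = {x}: {normalized_difference(y_a, y_b):.1f} %")
```

Evaluating the same quantity at several positions shows how the relative width of a set of curves changes along the genome.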
To compare, the divergence ATG data for SARS-CoV-2, MERS-CoV, dengue, Ebola [19], and Nipah viruses are provided in Table 1 (row 4).
In analyzing these data, it should be noted that each virus has its own rate of evolution, determined by its nature and by the speed of its mutation and spread among host organisms. The record in this regard belongs to the highly pathogenic SARS-CoV-2 virus, which has now developed hundreds of lineages and clades and has killed more than seven million people worldwide [1]. Despite this, the studies conducted in Refs. [13,18,19] for the 108 most widespread lineages did not show significant divergence in the SARS-CoV-2 ATG curves, similar to what was observed with MERS-CoV, and these viruses were provisionally recognized as relatively stable species, although random mutations may change this assessment in the future.
Other viruses listed in Table 1 have longer evolutionary histories and have separated into different clades and families, even within closely related geographic regions. Some of these clusters can be observed by building ATG curves.
Besides estimates of divergence for ATG walks, a comprehensive statistical parameter—the fractal regularization dimension of inter-ATG distances—was computed for all 32 genomes. These distances are measured in nucleotide counts, and their distributions along the genomes were analyzed using the software tool FracLab 2.2 [26], as in Refs. [13,18,19]. Figure 2 shows the calculated fractal dimensions for all 32 genomic sequences.
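The input signal for this analysis can be sketched as follows. The sketch assumes that an inter-ATG distance is the number of nucleotides between the start positions of two consecutive ATG triplets; the text states only that the distances are measured in nucleotide counts. The fractal regularization dimension itself is computed with FracLab and is not reproduced here. The function name and the toy sequence are hypothetical.

```python
from typing import List


def inter_atg_distances(sequence: str) -> List[int]:
    """Start-to-start distances between consecutive ATG triplets."""
    # Accept RNA-style input (U instead of T) and lowercase letters.
    seq = sequence.upper().replace("U", "T")
    positions = [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]
    # Pair each position with its successor and take the difference.
    return [b - a for a, b in zip(positions, positions[1:])]


# Toy sequence (not a real genome) with ATG at 0-based indices 2, 7, and 13.
toy = "CCATGAAATGTTTATGC"
print(inter_atg_distances(toy))
```

A genome with N triplets yields N - 1 distances, so each genome in Table A1 would give a series of a few hundred points under this definition.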
As in Ref. [19], the standard deviation (std) was calculated for these points and compared with the data for the other four viruses from Refs. [18] and [19].
It is observed that the relatively stable viruses SARS-CoV-2 and MERS-CoV have considerably smaller standard deviations in their fractal dimension parameters compared to dengue, Ebola, and Nipah viruses, which have evolved toward clustering (Table 1, row 5).
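The Nipah value in row 5 of Table 1 can be cross-checked against the fractal dimensions listed in Table A1. The sketch below assumes the sample standard deviation (denominator n - 1); the text does not state which estimator was used, but this choice gives a value that rounds to the 0.0529 reported in Table 1.

```python
from statistics import mean, stdev

# Fractal dimensions D_F of the 32 genomes, copied from Table A1 (rows 1-32).
fractal_dimensions = [
    2.28, 2.28, 2.28, 2.28, 2.28, 2.29, 2.29, 2.28,
    2.39, 2.39, 2.29, 2.38, 2.29, 2.29, 2.38, 2.37,
    2.34, 2.39, 2.34, 2.29, 2.39, 2.37, 2.43, 2.42,
    2.43, 2.45, 2.34, 2.33, 2.33, 2.33, 2.33, 2.33,
]

d_mean = mean(fractal_dimensions)
d_std = stdev(fractal_dimensions)  # sample estimator (n - 1), an assumption

print(f"mean D_F = {d_mean:.2f}")
print(f"std  D_F = {d_std:.4f}")
```

The population estimator (`statistics.pstdev`) would give a slightly smaller value, so the choice of estimator matters at the fourth decimal place when values for different viruses are compared.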

4. Conclusions

This study applied a recently proposed ATG-walk method [18] to Nipah genomes to identify and analyze significant mutations, as well as variations in codon length and their frequencies, across 32 human-origin genomes deposited in GenBank® between 1999 and 2025 by various contributors.
By analyzing available genomic Nipah samples and previously studied sequences of SARS-CoV-2, MERS-CoV, dengue, and Ebola, we can generalize certain properties to many viruses. Among them:
  • Continuous mutations, along with variations in the number of ATG triplets, codons, and non-coding elements, increase the divergence of ATG trajectories, which can be measured quantitatively.
  • In some cases, specific ATG trajectories form distinct clusters that are visible in the plots.
  • The viruses studied so far that undergo strong mutations also show an increased standard deviation of the fractal dimension of their ATG walks.
The viral landscape is highly unpredictable, and ongoing surveillance of viral genomes is strongly advised to prevent pandemics [3,22], as the SARS-CoV-2 experience demonstrated. The ATG walk method is one of the monitoring tools [13,14,15,16,17,18,19] for detecting severe viral mutations and can complement existing viral research methods.

Acknowledgments

The author thanks all researchers who have deposited the viral genomic sequences in the GenBank® and GISAID databases. The author thanks the Faculty of Information Technology and Electroengineering at the Norwegian University of Science and Technology, NTNU, for a sabbatical research grant in 2026 and Mach-3dP Inc. (Burlington, Canada) for hosting during the sabbatical.

Conflicts of Interest

The author declares that they have no competing interests.

Appendix A

Table A1. Henipavirus nipahense (Nipah virus) genomic sequence data.
No.  GenBank accession  Country     Collection year  Nucleotides  ATG triplets  Fractal dimension D_F
1    AJ564623.1         Malaysia    1999             18246        384           2.28
2    MK673562.1         Malaysia    1999             18231        383           2.28
3    AF212302.2         Malaysia    2001             18246        384           2.28
4    AY029767.1         Malaysia    2001             18246        384           2.28
5    AY029768.1         Malaysia    2001             18246        384           2.28
6    MK673565.1         Malaysia    2004             18235        384           2.29
7    MK633567.1         Bangladesh  2004             18245        383           2.29
8    AY988601.1         Bangladesh  2005             18252        384           2.28
9    FJ513078.1         India       2007             18252        378           2.39
10   FJ513078.1         India       2007             18252        378           2.39
11   MK673568.1         Malaysia    2008             18237        386           2.29
12   JN808863.1         Bangladesh  2008             18242        372           2.38
13   MK673571.1         Bangladesh  2011             18240        382           2.29
14   MK673573.1         Bangladesh  2011             18236        382           2.29
15   MK673578.1         Bangladesh  2011             18235        381           2.38
16   MK673581.1         Bangladesh  2012             18235        381           2.37
17   MK673591.1         Bangladesh  2014             18157        380           2.34
18   MK673592.1         Bangladesh  2014             18234        380           2.39
19   MK673590.1         Bangladesh  2014             18208        383           2.34
20   MK673584.1         Bangladesh  2015             18229        385           2.29
21   MK673585.1         Bangladesh  2015             18245        381           2.39
22   MH523642.1         India       2018             18242        390           2.37
23   MH396625.1         India       2018             18210        389           2.43
24   MH523640.1         India       2018             18132        390           2.42
25   MH523641.1         India       2018             18027        388           2.43
26   PP981674.1         Bangladesh  2022             18077        381           2.45
27   PP981675.1         Bangladesh  2022             18096        381           2.34
28   PP981676.1         Bangladesh  2023             18092        378           2.33
29   PP981683.1         Bangladesh  2023             18090        382           2.33
30   PQ368168.1         Bangladesh  2023             18090        381           2.33
31   PV132707.1         Bangladesh  2024             18128        381           2.33
32   PX130166.1         Bangladesh  2025             18128        381           2.33

References

  1. Nipah virus fact sheet. World Health Organization (WHO). Available online: https://www.who.int/news-room/fact-sheets/detail/nipah-virus (accessed on 2 April 2026).
  2. Madhukalya, D.; et al. Nipah virus: pathogenesis, genome, diagnosis, and treatment. Appl Microbiol Biotechnol. 2025, 109, 158.
  3. Asokan, S.; et al. Nipah virus as a pandemic threat: Current knowledge, diagnostic gaps, and future research priorities. Diagnostic Microbiol Infect Dis. 2025, 114, 117–141.
  4. Branda, F.; et al. Nipah virus: A zoonotic threat re-emerging in the wake of global public health challenges. Microorganisms 2025, 13, 124.
  5. Tan, F.H.; et al. A systematic review on Nipah virus: global molecular epidemiology and medical development countermeasures. Virus Evol. 2024, 10, veae048.
  6. Whitmer, S.; et al. Inference of Nipah virus evolution, 1999–2015. Virus Evol. 2021, 7, veaa062.
  7. Rahman, Md.; et al. Development of a culture-independent whole-genome sequencing of Nipah virus using the MinION Oxford Nanopore platform. Microbiol Spectrum 2025, 13, e02492-24.
  8. Cortes-Azuero, O.; et al. The genetic diversity of Nipah virus across spatial scales. J Infect Dis. 2024, 230, e1235.
  9. Houser, N.; et al. Evolution of Nipah virus infection: past, present, and future considerations. Trop. Med. Infect. Dis. 2021, 6, 24.
  10. Sudeep, A.; et al. Detection of Nipah virus in Pteropus medius in 2019 outbreak from Ernakulam district, Kerala, India. BMC Infect Dis. 2021, 21, 162.
  11. Lv, C.; et al. Vaccines and animal models of Nipah virus: current situation and future prospects. Vaccines 2025, 13, 608.
  12. Faus-Cotino, J.; et al. Nipah virus: A multidimensional update. Viruses 2024, 16, 179.
  13. Belinsky, A.; Kouzaev, G.A. DNA walks in virus genomics. JP J Biostatistics 2024, 24, 251–286.
  14. Hamori, E.; Raskin, J. Curves, a novel method of representation of nucleotide series, especially suited for long DNA sequences. J Biol Chem. 1983, 258, 1318–1327.
  15. Nandy, A.; et al. Characterization of the Zika virus genome - a bioinformatics study. Curr Comp Aided Drug Design 2016, 12, 87–97.
  16. Bielińska-Wąż, D.; et al. Applications of 2D and 3D-dynamic representations of DNA/RNA sequences for description of genome sequences of viruses. Comb Chem High Throughput Screen. 2022, 25, 429–438.
  17. Nielsen, C.; et al. Visualizing genomes: techniques and challenges. Nat Methods Suppl. 2010, 7, 1–11. Available online: http://www.nature.com/doifinder/10.1038/nmeth.1422.
  18. Belinsky, A.; Kouzaev, G.A. Visual and quantitative analyses of virus genomic sequences using a metric-based algorithm. WSEAS Trans Circ Syst. 2022, 21, 323–348.
  19. Kouzaev, G.A. ATG walks in virus genomics. Proc. 2nd Int. Conf. Infectious Diseases and Applied Microbiology and Beneficial Microbes, Vienna, Austria, 25-26 Sept., 2025; p. 46. Available online: https://www.researchgate.net/publication/396045850_ATG_Walks_in_Virus_Genomics.
  20. Heather, J.; Chain, B. The sequence of sequencers: The history of sequencing DNA. Genomics 2016, 107, 1–8.
  21. The Sequencing Buyer's Guide, 8th Edition. Available online: https://frontlinegenomics.com/sequencing-buyers-guide-8th (accessed on 2 April 2026).
  22. Garbuglia, A.; et al. Nipah virus: An overview of the current status of diagnostics and their role in preparedness in endemic countries. Viruses 2023, 15, 2062.
  23. GenBank®. Available online: https://www.ncbi.nlm.nih.gov/genbank/ (accessed on 2 April 2026).
  24. Hall, B. Phylogenetic Trees Made Easy: A How-To Manual, 5th ed.; Sinauer Assoc., 2018.
  25. Zou, Y.; et al. Common methods for phylogenetic tree construction and their implementation in R. Bioeng. 2024, 11, 480.
  26. FracLab 2.2. A Fractal Analysis Toolbox for Signal and Image Processing. Available online: https://project.inria.fr/fraclab/ (accessed on 25 March 2026).
Figure 1. ATG walks for 32 NiV sequences (a) and an enlarged plot (b) in which the trajectories of the Kerala (India) outbreak samples are marked (rows 22-25, Table A1, Appendix A).
Figure 2. Fractal dimension $D_F$ of inter-ATG distances calculated for 32 Nipah virus genomes.
Table 1. ATG walk observation results for SARS-CoV-2, MERS-CoV, dengue, Ebola [19], and Nipah viruses.
Row  Parameter                            Values
1    Virus name                           SARS-CoV-2    MERS-CoV   dengue     Ebola      Nipah
2    Observation years                    2020-08.2025  2012-2020  1974-2019  1976-2019  1999-2025
3    Number of studied genomes            108           20         27         15         32
4    Divergence $\delta y_{1,32}$, %      1.5           2          14         9          3.4-6.2
5    Standard deviation of $D_F$          0.0105        0.0216     0.1944     0.1011     0.0529
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2026 MDPI (Basel, Switzerland) unless otherwise stated