Submitted: 10 October 2024
Posted: 14 October 2024
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Creation of Personalized Multigene Panels Based on ROH
- Clean up the ROHMMCLI BED file to contain only chromosome, start and end positions.
- Merge the HM and the cleaned ROHMMCLI BED files using bedtools merge with the -d option set to 1,000,000 bp, the maximum distance allowed between ROHs for them to be merged.
- Use bedtools intersect to find overlaps between the merged BED file and the coding sequence coordinates BED file, producing another BED file with the list of gene coordinates found within ROHs.
- Create a text file with a list of gene Entrez IDs present in the identified ROHs.
- Find the CNV results for the sample under analysis and keep only CNVs that span more than 500,000 bp and are classified as ‘Heterozygous Deletion’, resulting in a BED file with the CNVs’ genomic coordinates.
- Keep only non-empty files, i.e., files that contain CNVs.
- In the shell script, use the bedtools jaccard tool to calculate the Jaccard index for each CNV that intersects an ROH, using the merged ROH and CNV BED files.
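The interval logic behind the steps above can be sketched in plain Python. This is a minimal illustration, not the authors' shell script: it mimics on small in-memory lists what bedtools merge -d, bedtools intersect and bedtools jaccard compute. The function names, tuple layout, gene identifiers and coordinates are hypothetical, and comparing each CNV only with the ROHs it overlaps is an assumption.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), BED-style half-open


def merge_intervals(intervals: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge intervals on the same chromosome whose gap is <= max_gap bp."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two intervals (0 if none)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def genes_in_roh(roh: List[Interval], genes: Dict[str, Interval]) -> List[str]:
    """Identifiers of genes whose coordinates overlap at least one ROH."""
    return sorted(g for g, iv in genes.items()
                  if any(overlap_bp(iv, r) > 0 for r in roh))


def filter_cnvs(cnvs: List[Tuple[str, int, int, str]], min_span: int = 500_000,
                cnv_type: str = "Heterozygous Deletion") -> List[Interval]:
    """Keep CNVs of the requested type that span more than min_span bp."""
    return [(c, s, e) for c, s, e, t in cnvs if t == cnv_type and e - s > min_span]


def jaccard(cnv: Interval, roh: List[Interval]) -> float:
    """Intersection bp / union bp of one CNV and the merged ROHs it overlaps."""
    hits = [r for r in roh if overlap_bp(cnv, r) > 0]
    inter = sum(overlap_bp(cnv, r) for r in hits)
    union = (cnv[2] - cnv[1]) + sum(r[2] - r[1] for r in hits) - inter
    return inter / union if hits else 0.0


# Hypothetical inputs standing in for the HM, ROHMMCLI, gene and CNV files.
hm = [("chr1", 1_000_000, 3_000_000)]
rohmm = [("chr1", 3_500_000, 5_000_000), ("chr2", 10_000_000, 12_000_000)]
merged = merge_intervals(hm + rohmm, max_gap=1_000_000)

genes = {"1111": ("chr1", 4_000_000, 4_050_000),
         "2222": ("chr2", 20_000_000, 20_010_000)}
panel = genes_in_roh(merged, genes)

cnvs = filter_cnvs([
    ("chr1", 4_000_000, 6_000_000, "Heterozygous Deletion"),
    ("chr1", 100, 200, "Heterozygous Deletion"),       # too short
    ("chr2", 10_000_000, 11_000_000, "Duplication"),   # wrong type
])
scores = {cnv: jaccard(cnv, merged) for cnv in cnvs}
```

In this toy input the two chr1 ROHs are 500,000 bp apart, so they are merged into a single region, and only the gene inside that region enters the panel. On real data the bedtools commands named in the steps remain the reference behaviour.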
2.2. Creation of Personalized Multigene Panels Based on HPO Terms
2.3. Creation of Personalized Multigene Panels Based on ROH and HPO Terms
2.4. Django Web Application Development
2.5. Establishing the First Portuguese ROH Characterization on a Genomic Scale
- One containing ROH > 0.5 Mb;
- One containing ROH > 1.5 Mb;
- One containing ROH > 5 Mb.
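As an illustration of how the three ROH length thresholds translate into FROH values, the sketch below assumes the usual definition of FROH: the summed length of ROHs above a minimum size divided by the length of the genome analysed. The genome length constant, the function names and the example ROH lengths are placeholders and are not taken from the study.

```python
from typing import Dict, Sequence

# Placeholder genome length in bp (assumption); replace it with the length of
# the autosomal regions actually covered by the analysis.
GENOME_LENGTH_BP = 2_900_000_000


def froh(roh_lengths_bp: Sequence[int], min_length_bp: int,
         genome_length_bp: int = GENOME_LENGTH_BP) -> float:
    """Fraction of the genome covered by ROHs longer than min_length_bp."""
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    covered = sum(length for length in roh_lengths_bp if length > min_length_bp)
    return covered / genome_length_bp


def froh_by_threshold(roh_lengths_bp: Sequence[int],
                      thresholds_mb: Sequence[float] = (0.5, 1.5, 5.0),
                      genome_length_bp: int = GENOME_LENGTH_BP) -> Dict[float, float]:
    """FROH for each minimum ROH length, with thresholds given in Mb."""
    return {t: froh(roh_lengths_bp, int(t * 1_000_000), genome_length_bp)
            for t in thresholds_mb}


# Hypothetical sample with three ROHs and a toy genome length of 100 Mb.
example = froh_by_threshold([600_000, 2_000_000, 7_000_000],
                            genome_length_bp=100_000_000)
```

Because each threshold is a strict lower bound, a ROH of exactly 0.5 Mb would not count towards FROH > 0.5 Mb in this sketch.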
2.6. Consanguinity Classification Approach
2.6.1. Feature Extraction
- Count_x: the number of ROH in chromosome x.
- Sum_x: the sum of ROH sizes in chromosome x.
- Min_x: the minimum ROH size in chromosome x.
- Max_x: the maximum ROH size in chromosome x.
- Mean_x: the mean ROH size in chromosome x.
- STD_x: the standard deviation of ROH size in chromosome x.
- Tier 0: includes "Count_x" and "Sum_x" features only;
- Tier 1: includes "Count_x," "Sum_x," "Min_x," and "Max_x" features only;
- Tier 2: includes "Count_x," "Sum_x," "Min_x," "Max_x," "Mean_x," and "STD_x" features.
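The feature definitions and tiers above can be sketched as follows. This is an illustrative reading rather than the authors' code: the function names and example ROHs are hypothetical, chromosomes without any ROH are assumed to contribute zeros, and the population standard deviation is assumed for STD_x.

```python
import statistics
from typing import Dict, List, Sequence, Tuple

# Feature names included in each tier, as listed above.
TIERS: Dict[int, Tuple[str, ...]] = {
    0: ("Count", "Sum"),
    1: ("Count", "Sum", "Min", "Max"),
    2: ("Count", "Sum", "Min", "Max", "Mean", "STD"),
}


def chromosome_features(sizes: List[int]) -> Dict[str, float]:
    """Summary statistics of ROH sizes on one chromosome (zeros if no ROH)."""
    if not sizes:
        return {name: 0.0 for name in TIERS[2]}
    return {
        "Count": float(len(sizes)),
        "Sum": float(sum(sizes)),
        "Min": float(min(sizes)),
        "Max": float(max(sizes)),
        "Mean": statistics.fmean(sizes),
        "STD": statistics.pstdev(sizes),  # population SD (assumption)
    }


def extract_features(rohs: Sequence[Tuple[str, int, int]],
                     chromosomes: Sequence[str], tier: int) -> Dict[str, float]:
    """Build the per-sample feature vector for the requested tier."""
    sizes: Dict[str, List[int]] = {c: [] for c in chromosomes}
    for chrom, start, end in rohs:
        if chrom in sizes:
            sizes[chrom].append(end - start)
    features: Dict[str, float] = {}
    for chrom in chromosomes:
        stats = chromosome_features(sizes[chrom])
        for name in TIERS[tier]:
            features[f"{name}_{chrom}"] = stats[name]
    return features


# Hypothetical sample: two ROHs on chromosome 1, one on chromosome 2.
rohs = [("1", 0, 1_000_000), ("1", 5_000_000, 8_000_000), ("2", 0, 2_000_000)]
tier0 = extract_features(rohs, ["1", "2", "3"], tier=0)
tier2 = extract_features(rohs, ["1", "2", "3"], tier=2)
```

Each tier is a superset of the previous one, so the Tier 2 vector contains every Tier 0 and Tier 1 feature plus Mean_x and STD_x.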
2.6.2. Outlier Detection
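As a generic illustration of how a contamination hyperparameter and an F1-score interact in outlier detection, the sketch below flags the highest-scoring fraction of samples as outliers and selects the contamination value that maximizes F1 on labelled validation data. It is not the authors' detector: the anomaly scores, labels, candidate values and function names are all hypothetical.

```python
from typing import List, Sequence


def flag_outliers(scores: Sequence[float], contamination: float) -> List[bool]:
    """Flag the `contamination` fraction of samples with the highest scores."""
    if not 0.0 < contamination < 1.0:
        raise ValueError("contamination must be between 0 and 1")
    n_outliers = round(contamination * len(scores))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_outliers])
    return [i in flagged for i in range(len(scores))]


def f1_score(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Harmonic mean of precision and recall for the outlier class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_contamination(scores: Sequence[float], y_true: Sequence[bool],
                       candidates: Sequence[float]) -> float:
    """Candidate contamination value with the highest validation F1-score."""
    return max(candidates,
               key=lambda c: f1_score(y_true, flag_outliers(scores, c)))


# Hypothetical validation set: samples 2 and 3 are the true outliers.
val_scores = [0.10, 0.20, 0.90, 0.80, 0.30, 0.15, 0.05, 0.25, 0.12, 0.22]
val_labels = [False, False, True, True, False, False, False, False, False, False]
chosen = best_contamination(val_scores, val_labels, [0.1, 0.2, 0.3])
```

A contamination value that is too low misses true outliers (lower recall), while one that is too high flags ordinary samples (lower precision). Tuning it against F1 on a validation split and then reporting F1 on a held-out test split keeps the two effects balanced.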
3. Results
3.1. Personalized Multigene Panels
- The creation of 15 multigene panels based on a single HPO term: HP:0001627 (Abnormal heart morphology), HP:0001047 (Atopic dermatitis), HP:0005584 (Renal cell carcinoma), HP:0001789 (Hydrops fetalis), HP:0011842 (Abnormal skeletal morphology), HP:0000846 (Adrenal insufficiency), HP:0003155 (Elevated circulating alkaline phosphatase concentration), HP:0000548 (Cone/cone-rod dystrophy), HP:0011510 (Drusen), HP:0000365 (Hearing impairment), HP:0000925 (Abnormality of the vertebral column), HP:0001949 (Neoplasm of the gastrointestinal tract), HP:0007373 (Motor neuron atrophy), HP:0006530 (Abnormal pulmonary interstitial morphology), HP:0012211 (Abnormal renal physiology), HP:0001733 (Pancreatitis), HP:0000556 (Retinal dystrophy);
- The creation of 3 multigene panels based on multiple HPO terms: HP:0000077 (Abnormality of the kidney), HP:0100243 (Leiomyosarcoma) and HP:0100522 (Thymoma); HP:0100574 (Biliary tract neoplasm) and HP:0003003 (Colon cancer); and HP:0003198 (Myopathy) and HP:0003473 (Fatigable weakness);
- The creation of 5 personalized multigene panels, each based on a single HPO term for which a panel had previously been manually prepared and curated: HP:0000126 (Hydronephrosis), HP:0001250 (Seizure), HP:0010566 (Hamartoma), HP:0012091 (Abnormality of pancreas physiology) and HP:0012114 (Endometrial carcinoma), and comparison with the obtained results.
3.1.1. Output Obtained for Each Multigene Panel
3.1.2. Application of New Bioinformatic Resources in a Clinical Case
3.2. First Portuguese ROH Characterization on a Genomic Scale
3.2.1. Distribution of ROHs per Length in Portugal
3.2.2. Maps of Portugal and Respective Data for FROH > 0.5, 1.5 and 5 Mb
3.2.3. Comparison with Other Studies
3.3. Consanguinity Classification Results
4. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Magi, A., Tattini, L., Palombo, F., Benelli, M., Gialluisi, A., Giusti, B., Abbate, R., Seri, M., Gensini, G. F., Romeo, G., & Pippucci, T. (2014). H3M2: detection of runs of homozygosity from whole-exome sequencing data. Bioinformatics (Oxford, England), 30(20), 2852–2859. https://doi.org/10.1093/bioinformatics/btu401. [CrossRef]
- Oliveira, J., Pereira, R., Santos, R., & Sousa, M. (2018). Evaluating runs of homozygosity in exome sequencing data - Utility in disease inheritance model selection and variant filtering. Communications in Computer and Information Science, 881, 268–288. https://doi.org/10.1007/978-3-319-94806-5_15. [CrossRef]
- Peripolli, E., Munari, D. P., Silva, M. V. G. B., Lima, A. L. F., Irgang, R., & Baldi, F. (2017). Runs of homozygosity: current knowledge and applications in livestock. In Animal Genetics (Vol. 48, Issue 3, pp. 255–271). Blackwell Publishing Ltd. https://doi.org/10.1111/age.12526. [CrossRef]
- Oniya, O., Neves, K., Ahmed, B., & Konje, J. C. (2019). A review of the reproductive consequences of consanguinity. European Journal of Obstetrics and Gynecology and Reproductive Biology, 232, 87–96. https://doi.org/10.1016/j.ejogrb.2018.10.042. [CrossRef]
- Marchi, N., Mennecier, P., Georges, M., Lafosse, S., Hegay, T., Dorzhu, C., Chichlo, B., Ségurel, L., & Heyer, E. (2018). Close inbreeding and low genetic diversity in Inner Asian human populations despite geographical exogamy. Scientific Reports, 8(1), 1–10. https://doi.org/10.1038/s41598-018-27047-3. [CrossRef]
- Yengo, L., Wray, N. R., & Visscher, P. M. (2019). Extreme inbreeding in a European ancestry sample from the contemporary UK population. Nature Communications, 10(1). https://doi.org/10.1038/s41467-019-11724-6. [CrossRef]
- Slatkin, M. (2004). A Population-Genetic Test of Founder Effects and Implications for Ashkenazi Jewish Diseases. In Am. J. Hum. Genet (Vol. 75). https://doi.org/10.1086/423146. [CrossRef]
- Dong, J.-T. (2001). Chromosomal deletions and tumor suppressor genes in prostate cancer. In Cancer and Metastasis Reviews (Vol. 20). https://doi.org/10.1023/A:1015575125780. [CrossRef]
- Nalls, M. A., Simon-Sanchez, J., Gibbs, J. R., Paisan-Ruiz, C., Bras, J. T., Tanaka, T., Matarin, M., Scholz, S., Weitz, C., Harris, T. B., Ferrucci, L., Hardy, J., & Singleton, A. B. (2009). Measures of autozygosity in decline: Globalization, urbanization, and its implications for medical genetics. PLoS Genetics, 5(3). https://doi.org/10.1371/journal.pgen.1000415. [CrossRef]
- Ceballos, F. C., Hazelhurst, S., & Ramsay, M. (2019). Runs of homozygosity in sub-Saharan African populations provide insights into complex demographic histories. Human Genetics, 138(10), 1123–1142. https://doi.org/10.1007/s00439-019-02045-1. [CrossRef]
- Lemes, R. B., Nunes, K., Carnavalli, J. E. P., Kimura, L., Mingroni-Netto, R. C., Meyer, D., & Otto, P. A. (2018). Inbreeding estimates in human populations: Applying new approaches to an admixed Brazilian isolate. PLoS ONE, 13(4). https://doi.org/10.1371/journal.pone.0196360. [CrossRef]
- Ben Halim, N., Nagara, M., Regnault, B., Hsouna, S., Lasram, K., Kefi, R., Azaiez, H., Khemira, L., Saidane, R., Ammar, S. Ben, Besbes, G., Weil, D., Petit, C., Abdelhak, S., & Romdhane, L. (2015). Estimation of Recent and Ancient Inbreeding in a Small Endogamous Tunisian Community Through Genomic Runs of Homozygosity. Annals of Human Genetics, 79(6), 402–417. https://doi.org/10.1111/ahg.12131. [CrossRef]
- Kang, J. T. L., Goldberg, A., Edge, M. D., Behar, D. M., & Rosenberg, N. A. (2017). Consanguinity Rates Predict Long Runs of Homozygosity in Jewish Populations. Human Heredity, 82(3–4), 87–102. https://doi.org/10.1159/000478897. [CrossRef]
- Pemberton, T. J., Absher, D., Feldman, M. W., Myers, R. M., Rosenberg, N. A., & Li, J. Z. (2012). Genomic patterns of homozygosity in worldwide human populations. American Journal of Human Genetics, 91(2), 275–292. https://doi.org/10.1016/j.ajhg.2012.06.014. [CrossRef]
- Kirin, M., McQuillan, R., Franklin, C. S., Campbell, H., & McKeigue, P. M. (2010). Genomic Runs of Homozygosity Record Population History and Consanguinity. PLoS ONE, 5(11), e13996. https://doi.org/10.1371/journal.pone.0013996. [CrossRef]
- Hunter-Zinck, H., Musharoff, S., Salit, J., Al-Ali, K. A., Chouchane, L., Gohar, A., Matthews, R., Butler, M. W., Fuller, J., Hackett, N. R., Crystal, R. G., & Clark, A. G. (2010). Population genetic structure of the people of Qatar. American Journal of Human Genetics, 87(1), 17–25. https://doi.org/10.1016/j.ajhg.2010.05.018. [CrossRef]
- Mezzavilla, M., Cocca, M., Maisano Delser, P., Badii, R., Abbaszadeh, F., Hadi, K. A., Giorgia, G., & Gasparini, P. (2022). Ancestry-related distribution of Runs of homozygosity and functional variants in Qatari population. BMC Genomic Data, 23(1). https://doi.org/10.1186/s12863-022-01087-1. [CrossRef]
- Scott, E. M., Halees, A., Itan, Y., Spencer, E. G., He, Y., Azab, M. A., Gabriel, S. B., Belkadi, A., Boisson, B., Abel, L., Clark, A. G., Rahim, S. A., Abdel-Hadi, S., Abdel-Salam, G., Abdel-Salam, E., Abdou, M., Abhytankar, A., Adimi, P., Ahmad, J., … Zhang, S. Y. (2016). Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery. Nature Genetics, 48(9), 1071. https://doi.org/10.1038/NG.3592. [CrossRef]
- Yang, X., Al-Bustan, S., Feng, Q., Guo, W., Ma, Z., Marafie, M., Jacob, S., Al-Mulla, F., & Xu, S. (2014). The influence of admixture and consanguinity on population genetic diversity in Middle East. Journal of Human Genetics, 59(11), 615–622. https://doi.org/10.1038/jhg.2014.81. [CrossRef]
- Ceballos, F. C., Gürün, K., Altınışık, N. E., Gemici, H. C., Karamurat, C., Koptekin, D., Vural, K. B., Mapelli, I., Sağlıcan, E., Sürer, E., Erdal, Y. S., Götherström, A., Özer, F., Atakuman, Ç., & Somel, M. (2021). Human inbreeding has decreased in time through the Holocene. Current Biology, 31(17), 3925-3934.e8. https://doi.org/10.1016/j.cub.2021.06.027. [CrossRef]
- Ece Kars, M., Nazlı Bas, A., Emre Onat, O., Bilguvar, K., Choi, J., Itan, Y., Ça, C., Palvadeau, R., Casanova, J.-L., Cooper, D. N., Stenson, P. D., Yavuz, A., Bulus, H., Günel, M., Friedman, J. M., & Özçelik, T. (n.d.). The genetic structure of the Turkish population reveals high levels of variation and admixture. https://doi.org/10.1073/pnas.2026076118. [CrossRef]
- Binzer, S., Imrell, K., Binzer, M., Kyvik, K. O., Hillert, J., & Stenager, E. (2015). High inbreeding in the Faroe Islands does not appear to constitute a risk factor for multiple sclerosis. Multiple Sclerosis, 21(8), 996–1002. https://doi.org/10.1177/1352458514557305. [CrossRef]
- Karafet, T. M., Bulayeva, K. B., Bulayev, O. A., Gurgenova, F., Omarova, J., Yepiskoposyan, L., Savina, O. V., Veeramah, K. R., & Hammer, M. F. (2015). Extensive genome-wide autozygosity in the population isolates of Daghestan. European Journal of Human Genetics, 23(10), 1405–1412. https://doi.org/10.1038/ejhg.2014.299. [CrossRef]
- McLaughlin, R. L., Kenna, K. P., Vajda, A., Heverin, M., Byrne, S., Donaghy, C. G., Cronin, S., Bradley, D. G., & Hardiman, O. (2015). Homozygosity mapping in an Irish ALS case-control cohort describes local demographic phenomena and points towards potential recessive risk loci. Genomics, 105(4), 237–241. https://doi.org/10.1016/j.ygeno.2015.01.002. [CrossRef]
- Alabdullatif, M. A., Al Dhaibani, M. A., Khassawneh, M. Y., & El-Hattab, A. W. (2017). Chromosomal microarray in a highly consanguineous population: diagnostic yield, utility of regions of homozygosity, and novel mutations. Clinical Genetics, 91(4), 616–622. https://doi.org/10.1111/cge.12872. [CrossRef]
- Wang, J. C., Ross, L., Mahon, L. W., Owen, R., Hemmat, M., Wang, B. T., El Naggar, M., Kopita, K. A., Randolph, L. M., Chase, J. M., Aguilera, M. J. M., Siles, J. L., Church, J. A., Hauser, N., Shen, J. J., Jones, M. C., Wierenga, K. J., Jiang, Z., Haddadin, M., … Sahoo, T. (2015). Regions of homozygosity identified by oligonucleotide SNP arrays: Evaluating the incidence and clinical utility. European Journal of Human Genetics, 23(5), 663–671. https://doi.org/10.1038/ejhg.2014.153. [CrossRef]
- Prasad, A., Sdano, M. A., Vanzo, R. J., Mowery-Rushton, P. A., Serrano, M. A., Hensel, C. H., & Wassman, E. R. (2018). Clinical utility of exome sequencing in individuals with large homozygous regions detected by chromosomal microarray analysis. BMC Medical Genetics, 19(1). https://doi.org/10.1186/s12881-018-0555-3. [CrossRef]
- Hengel, H., Buchert, R., Sturm, M., Haack, T. B., Schelling, Y., Mahajnah, M., Sharkia, R., Azem, A., Balousha, G., Ghanem, Z., Falana, M., Balousha, O., Ayesh, S., Keimer, R., Deigendesch, W., Zaidan, J., Marzouqa, H., Bauer, P., & Schöls, L. (2020). First-line exome sequencing in Palestinian and Israeli Arabs with neurological disorders is efficient and facilitates disease gene discovery. European Journal of Human Genetics, 28(8), 1034–1043. https://doi.org/10.1038/s41431-020-0609-9. [CrossRef]
- Palombo, F., Graziano, C., Al Wardy, N., Nouri, N., Marconi, C., Magini, P., Severi, G., La Morgia, C., Cantalupo, G., Cordelli, D. M., Gangarossa, S., Al Kindi, M. N., Al Khabouri, M., Salehi, M., Giorgio, E., Brusco, A., Pisani, F., Romeo, G., Carelli, V., … Seri, M. (2020). Autozygosity-driven genetic diagnosis in consanguineous families from Italy and the Greater Middle East. Human Genetics, 139(11), 1429–1441. https://doi.org/10.1007/s00439-020-02187-7. [CrossRef]
- Knopp, C., Rudnik-Schöneborn, S., Eggermann, T., Bergmann, C., Begemann, M., Schoner, K., Zerres, K., & Ortiz Brüchle, N. (2015). Syndromic ciliopathies: From single gene to multi gene analysis by SNP arrays and next generation sequencing. Molecular and Cellular Probes, 29(5), 299–307. https://doi.org/10.1016/j.mcp.2015.05.008. [CrossRef]
- de Farias, A. A., Nunes, K., Lemes, R. B., Moura, R., Fernandes, G. R., Melo, U. S., Zatz, M., Kok, F., & Santos, S. (2018). Origin and age of the causative mutations in KLC2, IMPA1, MED25 and WNT7A unravelled through Brazilian admixed populations. Scientific Reports, 8(1). https://doi.org/10.1038/s41598-018-35022-1. [CrossRef]
- Wakil, S. M., Ramzan, K., Abuthuraya, R., Hagos, S., Al-Dossari, H., Al-Omar, R., Murad, H., Chedrawi, A., Al-Hassnan, Z. N., Finsterer, J., & Bohlega, S. (2014). Infantile-onset ascending hereditary spastic paraplegia with bulbar involvement due to the novel ALS2 mutation c.2761C>T. Gene, 536(1), 217–220. https://doi.org/10.1016/j.gene.2013.11.043. [CrossRef]
- Lobo-Prada, T., Sticht, H., Bogantes-Ledezma, S., Ekici, A., Uebe, S., Reis, A., & Leal, A. (2017). A homozygous mutation in GPT2 associated with nonsyndromic intellectual disability in a consanguineous family from Costa Rica. In JIMD Reports (Vol. 36, pp. 59–66). Springer. https://doi.org/10.1007/8904_2016_40. [CrossRef]
- Guo, T., Tan, Z. P., Chen, H. M., Zheng, D. Y., Liu, L., Huang, X. G., Chen, P., Luo, H., & Yang, Y. F. (2017). An effective combination of whole-exome sequencing and runs of homozygosity for the diagnosis of primary ciliary dyskinesia in consanguineous families. Scientific Reports, 7(1). https://doi.org/10.1038/s41598-017-08510-z. [CrossRef]
- Costa, P., Zanus, C., Faletra, F., Ventura, G., di Marzio, G. M., Cervesi, C., & Carrozzi, M. (2019). Epileptic encephalopathy with microcephaly in a patient with asparagine synthetase deficiency: a video-EEG report. Epileptic Disorders, 21(5), 466–470. https://doi.org/10.1684/epd.2019.1100. [CrossRef]
- Khan, R., Shabbir, R. M. K., Raza, I., Abdullah, U., Naeem, M. A., Ahmed, A., Malik, S., Hu, Z., & Xia, K. (2020). A founder RDH5 splice site mutation leads to retinitis punctata albescens in two inbred Pakistani kindreds. Ophthalmic Genetics, 41(1), 7–12. https://doi.org/10.1080/13816810.2019.1709124. [CrossRef]
- Yu, W., You, X., Wang, D., Dong, K., Su, J., Li, C., Liu, J., Zhang, Q., You, F., Wang, X., Huang, J., Qiao, B., & Duan, W. (2015). Microarray analysis unmasked two siblings with pure hereditary spastic paraplegia shared a run of homozygosity region on chromosome 3q28-q29. Journal of the Neurological Sciences, 359(1–2), 351–355. https://doi.org/10.1016/j.jns.2015.10.057. [CrossRef]
- Calderón, R., Hernández, C. L., García-Varela, G., Masciarelli, D., & Cuesta, P. (2018). Inbreeding in Southeastern Spain: The Impact of Geography and Demography on Marital Mobility and Marital Distance Patterns (1900–1969). Human Nature, 29(1), 45–64. https://doi.org/10.1007/s12110-017-9305-z. [CrossRef]
- Pippucci, T., Magi, A., Gialluisi, A., & Romeo, G. (2014). Detection of runs of homozygosity from whole exome sequencing data: State of the art and perspectives for clinical, population and epidemiological studies. Human Heredity, 77(1–4), 63–72. https://doi.org/10.1159/000362412. [CrossRef]
- Lander, E. S., & Botstein, D. (1987). Homozygosity Mapping: A Way to Map Human Recessive Traits with the DNA of Inbred Children. Science, 236(4808), 1567–1570. https://doi.org/10.1126/SCIENCE.2884728. [CrossRef]
- Hu, T., Chitnis, N., Monos, D., & Dinh, A. (2021). Next-generation sequencing technologies: An overview. Human Immunology, 82(11), 801–811. https://doi.org/10.1016/j.humimm.2021.02.012. [CrossRef]
- Pereira, R., Oliveira, J., & Sousa, M. (2020). Bioinformatics and computational tools for next-generation sequencing analysis in clinical genetics. In Journal of Clinical Medicine (Vol. 9, Issue 1). MDPI. https://doi.org/10.3390/jcm9010132. [CrossRef]
- Thompson, J. F., & Milos, P. M. (2011). The properties and applications of single-molecule DNA sequencing. Genome Biology, 12(2), 1–10. https://doi.org/10.1186/GB-2011-12-2-217. [CrossRef]
- Rhoads, A., & Au, K. F. (2015). PacBio Sequencing and Its Applications. https://doi.org/10.1016/j.gpb.2015.08.002. [CrossRef]
- Zhang, L., Chen, F. X., Zeng, Z., Xu, M., Sun, F., Yang, L., Bi, X., Lin, Y., Gao, Y. J., Hao, H. X., Yi, W., Li, M., & Xie, Y. (2021). Advances in Metagenomics and Its Application in Environmental Microorganisms. Frontiers in Microbiology, 12, 766364. https://doi.org/10.3389/FMICB.2021.766364. [CrossRef]
- Qin, D. (2019). Next-generation sequencing and its clinical application. Cancer Biology and Medicine, 16(1), 4–10. https://doi.org/10.20892/j.issn.2095-3941.2018.0055. [CrossRef]
- Barbitoff, Y. A., Polev, D. E., Glotov, A. S., Serebryakova, E. A., Shcherbakova, I. V., Kiselev, A. M., Kostareva, A. A., Glotov, O. S., & Predeus, A. V. (2020). Systematic dissection of biases in whole-exome and whole-genome sequencing reveals major determinants of coding sequence coverage. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-59026-y. [CrossRef]
- Choi, M., Scholl, U. I., Ji, W., Liu, T., Tikhonova, I. R., Zumbo, P., Nayir, A., Bakkaloğlu, A., Ozen, S., Sanjad, S., Nelson-Williams, C., Farhi, A., Mane, S., & Lifton, R. P. (2009). Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proceedings of the National Academy of Sciences of the United States of America, 106(45), 19096–19101. https://doi.org/10.1073/pnas.0910672106. [CrossRef]
- Bartha, Á., & Győrffy, B. (2019). Comprehensive outline of whole exome sequencing data analysis tools available in clinical oncology. In Cancers (Vol. 11, Issue 11). MDPI AG. https://doi.org/10.3390/cancers11111725. [CrossRef]
- Warman Chardon, J., Beaulieu, C., Hartley, T., Boycott, K. M., & Dyment, D. A. (2015). Axons to Exons: the Molecular Diagnosis of Rare Neurological Diseases by Next-Generation Sequencing. Current Neurology and Neuroscience Reports, 15(9), 1–8. https://doi.org/10.1007/S11910-015-0584-7. [CrossRef]
- Gargano, M. A., Matentzoglu, N., Coleman, B., Addo-Lartey, E. B., Anagnostopoulos, A. V., Anderton, J., Avillach, P., Bagley, A. M., Bakštein, E., Balhoff, J. P., Baynam, G., Bello, S. M., Berk, M., Bertram, H., Bishop, S., Blau, H., Bodenstein, D. F., Botas, P., Boztug, K., … Robinson, P. N. (2024). The Human Phenotype Ontology in 2024: phenotypes around the world. Nucleic Acids Research, 52(D1), D1333–D1346. https://doi.org/10.1093/nar/gkad1005. [CrossRef]
- Bullich, G., Matalonga, L., Pujadas, M., Papakonstantinou, A., Piscia, D., Tonda, R., Artuch, R., Gallano, P., Garrabou, G., González, J. R., Grinberg, D., Guitart, M., Laurie, S., Lázaro, C., Luengo, C., Martí, R., Milà, M., Ovelleiro, D., Parra, G., … Vendrell, T. (2022). Systematic Collaborative Reanalysis of Genomic Data Improves Diagnostic Yield in Neurologic Rare Diseases. Journal of Molecular Diagnostics, 24(5), 529–542. https://doi.org/10.1016/j.jmoldx.2022.02.003. [CrossRef]
- Matalonga, L., Laurie, S., Papakonstantinou, A., Piscia, D., Mereu, E., Bullich, G., Thompson, R., Horvath, R., Pérez-Jurado, L., Riess, O., Gut, I., van Ommen, G. J., Lochmüller, H., Beltran, S., Renieri, A., Dursun, A., Matilla-Duenas, A., Cormand, B., Rivolta, C., … Sabater, M. (2020). Improved Diagnosis of Rare Disease Patients through Systematic Detection of Runs of Homozygosity. Journal of Molecular Diagnostics, 22(9), 1205–1215. https://doi.org/10.1016/j.jmoldx.2020.06.008. [CrossRef]
- Becker, J., Semler, O., Gilissen, C., Li, Y., Bolz, H. J., Giunta, C., Bergmann, C., Rohrbach, M., Koerber, F., Zimmermann, K., De Vries, P., Wirth, B., Schoenau, E., Wollnik, B., Veltman, J. A., Hoischen, A., & Netzer, C. (2011). Exome sequencing identifies truncating mutations in human SERPINF1 in autosomal-recessive osteogenesis imperfecta. American Journal of Human Genetics, 88(3), 362–371. https://doi.org/10.1016/j.ajhg.2011.01.015. [CrossRef]
- Mezzavilla, M., Vozzi, D., Badii, R., Khalifa Alkowari, M., Abdulhadi, K., Girotto, G., & Gasparini, P. (2015). Increased rate of deleterious variants in long runs of homozygosity of an inbred population from Qatar. Human Heredity, 79(1), 14–19. https://doi.org/10.1159/000371387. [CrossRef]
- Yang, T. L., Guo, Y., Zhang, L. S., Tian, Q., Yan, H., Papasian, C. J., Recker, R. R., & Deng, H. W. (2010). Runs of homozygosity identify a recessive locus 12q21.31 for human adult height. Journal of Clinical Endocrinology and Metabolism, 95(8), 3777–3782. https://doi.org/10.1210/jc.2009-1715. [CrossRef]
- Wang, L. S., Hranilovic, D., Wang, K., Lindquist, I. E., Yurcaba, L., Petkovic, Z. B., Gidaya, N., Jernej, B., Hakonarson, H., & Bucan, M. (2010). Population-based study of genetic variation in individuals with autism spectrum disorders from Croatia. BMC Medical Genetics, 11(1), 134. https://doi.org/10.1186/1471-2350-11-134. [CrossRef]
- Gross, A., Tönjes, A., Kovacs, P., Veeramah, K. R., Ahnert, P., Roshyara, N. R., Gieger, C., Rueckert, I. M., Loeffler, M., Stoneking, M., Wichmann, H. E., Novembre, J., Stumvoll, M., & Scholz, M. (2011). Population-genetic comparison of the Sorbian isolate population in Germany with the German KORA population using genome-wide SNP arrays. BMC Genetics, 12. https://doi.org/10.1186/1471-2156-12-67. [CrossRef]
- Ghani, M., Sato, C., Lee, J. H., Reitz, C., Moreno, D., Mayeux, R., George-Hyslop, P. S., & Rogaeva, E. (2013). Evidence of recessive Alzheimer disease loci in a Caribbean Hispanic data set: genome-wide survey of runs of homozygosity. JAMA Neurology, 70(10), 1261–1267. https://doi.org/10.1001/JAMANEUROL.2013.3545. [CrossRef]
- Yang, T. L., Guo, Y., Zhang, J. G., Xu, C., Tian, Q., & Deng, H. W. (2015). Genome-wide Survey of Runs of Homozygosity Identifies Recessive Loci for Bone Mineral Density in Caucasian and Chinese Populations. Journal of Bone and Mineral Research : The Official Journal of the American Society for Bone and Mineral Research, 30(11), 2119–2126. https://doi.org/10.1002/JBMR.2558. [CrossRef]
- Ghani, M., Reitz, C., Cheng, R., Vardarajan, B. N., Jun, G., Sato, C., Naj, A., Rajbhandary, R., Wang, L. S., Valladares, O., Lin, C. F., Larson, E. B., Graff-Radford, N. R., Evans, D., De Jager, P. L., Crane, P. K., Buxbaum, J. D., Murrell, J. R., Raj, T., … Yu, L. (2015). Association of Long Runs of Homozygosity With Alzheimer Disease Among African American Individuals. JAMA Neurology, 72(11), 1313–1323. https://doi.org/10.1001/JAMANEUROL.2015.1700. [CrossRef]
- Bandrés-Ciga, S., Price, T. R., Barrero, F. J., Escamilla-Sevilla, F., Pelegrina, J., Arepalli, S., Hernández, D., Gutiérrez, B., Cervilla, J., Rivera, M., Rivera, A., Ding, J. H., Vives, F., Nalls, M., Singleton, A., & Durán, R. (2016). Genome-wide assessment of Parkinson’s disease in a Southern Spanish population. Neurobiology of Aging, 45, 213.e3. https://doi.org/10.1016/J.NEUROBIOLAGING.2016.06.001. [CrossRef]
- Barbieri, C., Barquera, R., Arias, L., Sandoval, J. R., Acosta, O., Zurita, C., Aguilar-Campos, A., Tito-Álvarez, A. M., Serrano-Osuna, R., Gray, R. D., Mafessoni, F., Heggarty, P., Shimizu, K. K., Fujita, R., Stoneking, M., Pugach, I., & Fehren-Schmitz, L. (2019). The Current Genomic Landscape of Western South America: Andes, Amazonia, and Pacific Coast. Molecular Biology and Evolution, 36(12), 2698–2713. https://doi.org/10.1093/MOLBEV/MSZ174. [CrossRef]
- Font-Porterias, N., Caro-Consuegra, R., Lucas-Sánchez, M., Lopez, M., Giménez, A., Carballo-Mesa, A., Bosch, E., Calafell, F., Quintana-Murci, L., & Comas, D. (2021). The Counteracting Effects of Demography on Functional Genomic Variation: The Roma Paradigm. Molecular Biology and Evolution, 38(7), 2804–2817. https://doi.org/10.1093/MOLBEV/MSAB070. [CrossRef]
- Cruz, P. R. S. da, Ananina, G., Secolin, R., Gil-Da-Silva-Lopes, V. L., Lima, C. S. P., França, P. H. C. de, Donatti, A., Lourenço, G. J., Araujo, T. K. de, Simioni, M., Lopes-Cendes, I., Costa, F. F., & Melo, M. B. de. (2022). Demographic history differences between Hispanics and Brazilians imprint haplotype features. G3 (Bethesda, Md.), 12(7). https://doi.org/10.1093/G3JOURNAL/JKAC111. [CrossRef]
- Ruan, X., Kocher, J. P. A., Pommier, Y., Liu, H., & Reinhold, W. C. (2012). Mass homozygotes accumulation in the NCI-60 cancer cell lines as compared to HapMap Trios, and relation to fragile site location. PloS One, 7(2). https://doi.org/10.1371/JOURNAL.PONE.0031628. [CrossRef]
- Santoni, F. A., Makrythanasis, P., & Antonarakis, S. E. (2015). CATCHing putative causative variants in consanguineous families. BMC Bioinformatics, 16(1). https://doi.org/10.1186/S12859-015-0727-5. [CrossRef]
- Sonehara, K., & Okada, Y. (2020). Obelisc: an identical-by-descent mapping tool based on SNP streak. Bioinformatics, 36(24), 5567. https://doi.org/10.1093/BIOINFORMATICS/BTAA940. [CrossRef]
- Garone, C., Pippucci, T., Cordelli, D. M., Zuntini, R., Castegnaro, G., Marconi, C., Graziano, C., Marchiani, V., Verrotti, A., Seri, M., & Franzoni, E. (2011). FA2H-related disorders: A novel c.270+3A>T splice-site mutation leads to a complex neurodegenerative phenotype. Developmental Medicine and Child Neurology, 53(10), 958–961. https://doi.org/10.1111/j.1469-8749.2011.03993.x. [CrossRef]
- Seelow, D., & Schuelke, M. (2012). HomozygosityMapper2012-bridging the gap between homozygosity mapping and deep sequencing. Nucleic Acids Research, 40(W1). https://doi.org/10.1093/nar/gks487. [CrossRef]
- Seelow, D., Schuelke, M., Hildebrandt, F., & Nürnberg, P. (2009). HomozygosityMapper - An interactive approach to homozygosity mapping. Nucleic Acids Research, 37(SUPPL. 2). https://doi.org/10.1093/nar/gkp369. [CrossRef]
- Kancheva, D., Atkinson, D., De Rijk, P., Zimon, M., Chamova, T., Mitev, V., Yaramis, A., Maria Fabrizi, G., Topaloglu, H., Tournev, I., Parma, Y., Battaloglu, E., Estrada-Cuzcano, A., & Jordanova, A. (2016). Novel mutations in genes causing hereditary spastic paraplegia and Charcot-Marie-Tooth neuropathy identified by an optimized protocol for homozygosity mapping based on whole-exome sequencing. Genetics in Medicine, 18(6), 600–607. https://doi.org/10.1038/GIM.2015.139. [CrossRef]
- Szpiech, Z. A., Blant, A., & Pemberton, T. J. (2017). GARLIC: Genomic Autozygosity Regions Likelihood-based Inference and Classification. Bioinformatics (Oxford, England), 33(13), 2059–2062. https://doi.org/10.1093/BIOINFORMATICS/BTX102. [CrossRef]
- Görmez, Z., Bakir-Gungor, B., & Saǧiroǧlu, M. Ş. (2014). HomSI: a homozygous stretch identifier from next-generation sequencing data. Bioinformatics, 30(3), 445–447. https://doi.org/10.1093/BIOINFORMATICS/BTT686. [CrossRef]
- Quinodoz, M., Peter, V. G., Bedoni, N., Bertrand, B. R., Cisarova, K., Salmaninejad, A., Sepahi, N., Rodrigues, R., Piran, M., Mojarrad, M., Pasdar, A., Asad, A. G., Sousa, A. B., Santos, L. C., Superti-Furga, A., & Rivolta, C. (n.d.). AutoMap is a high performance homozygosity mapping tool using next-generation sequencing data. https://doi.org/10.1038/s41467-020-20584-4. [CrossRef]
- Yoon, B.-J. (2009). Hidden Markov Models and their Applications in Biological Sequence Analysis. Current Genomics, 10(6), 402. https://doi.org/10.2174/138920209789177575. [CrossRef]
- Narasimhan, V., Danecek, P., Scally, A., Xue, Y., Tyler-Smith, C., & Durbin, R. (2016). BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data. Bioinformatics, 32(11), 1749–1751. https://doi.org/10.1093/BIOINFORMATICS/BTW044. [CrossRef]
- Zhuang, Z., Gusev, A., Cho, J., & Pe’er, I. (2012). Detecting Identity by Descent and Homozygosity Mapping in Whole-Exome Sequencing Data. PLoS ONE, 7(10). https://doi.org/10.1371/journal.pone.0047618. [CrossRef]
- Browning, S. R., & Browning, B. L. (2010). High-Resolution Detection of Identity by Descent in Unrelated Individuals. American Journal of Human Genetics, 86(4), 526–539. https://doi.org/10.1016/j.ajhg.2010.02.021. [CrossRef]
- Çelik, G., & Tuncalı, T. (2022). ROHMM—A flexible hidden Markov model framework to detect runs of homozygosity from genotyping data. Human Mutation, 43(2), 158–168. https://doi.org/10.1002/HUMU.24316. [CrossRef]
- Vigeland, M. D., Gjøtterud, K. S., & Selmer, K. K. (2016). FILTUS: A desktop GUI for fast and efficient detection of disease-causing variants, including a novel autozygosity detector. Bioinformatics, 32(10), 1592–1594. https://doi.org/10.1093/BIOINFORMATICS/BTW046. [CrossRef]
- hapROH · PyPI. (n.d.). Retrieved March 27, 2023, from https://pypi.org/project/hapROH/.
- Ringbauer, H., Novembre, J., & Steinrücken, M. (2021). Parental relatedness through time revealed by runs of homozygosity in ancient DNA. Nature Communications, 12(1). https://doi.org/10.1038/S41467-021-25289-W. [CrossRef]
- Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29, 1–27. https://doi.org/10.1007/BF02289565. [CrossRef]
- Rousseeuw, P. J., & Van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41(3), 212–223. https://doi.org/10.1080/00401706.1999.10485670. [CrossRef]
- Lalioti, M. D., Mirotsou, M., Buresi, C., Peitsch, M. C., Rossier, C., Ouazzani, R., Baldy-Moulinier, M., Bottani, A., Malafosse, A., & Antonarakis, S. E. (1997). Identification of mutations in cystatin B, the gene responsible for the Unverricht-Lundborg type of progressive myoclonus epilepsy (EPM1). American Journal of Human Genetics, 60(2), 342. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1712389/.
- McQuillan, R., Leutenegger, A. L., Abdel-Rahman, R., Franklin, C. S., Pericic, M., Barac-Lauc, L., Smolej-Narancic, N., Janicijevic, B., Polasek, O., Tenesa, A., MacLeod, A. K., Farrington, S. M., Rudan, P., Hayward, C., Vitart, V., Rudan, I., Wild, S. H., Dunlop, M. G., Wright, A. F., … Wilson, J. F. (2008). Runs of Homozygosity in European Populations. American Journal of Human Genetics, 83(3), 359. https://doi.org/10.1016/J.AJHG.2008.08.007. [CrossRef]
- Santos, H. G., Dias, J. A., & Pimenta, Z. P. (n.d.). Incidência de casamentos consanguíneos na população portuguesa, 1980-1986.
- Ceballos, F. C., Joshi, P. K., Clark, D. W., Ramsay, M., & Wilson, J. F. (2018). Runs of homozygosity: Windows into population history and trait architecture. In Nature Reviews Genetics (Vol. 19, Issue 4, pp. 220–234). Nature Publishing Group. https://doi.org/10.1038/nrg.2017.109. [CrossRef]
- Martin, A. R., Williams, E., Foulger, R. E., Leigh, S., Daugherty, L. C., Niblock, O., Leong, I. U. S., Smith, K. R., Gerasimenko, O., Haraldsdottir, E., Thomas, E., Scott, R. H., Baple, E., Tucci, A., Brittain, H., de Burca, A., Ibañez, K., Kasperaviciute, D., Smedley, D., … McDonagh, E. M. (2019). PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. In Nature Genetics (Vol. 51, Issue 11, pp. 1560–1565). Nature Publishing Group. https://doi.org/10.1038/s41588-019-0528-2. [CrossRef]

| II:1 | II:2 |
|---|---|
| DHDDS | SIK1 |
| HMGCL | CSTB |
| MERC | SLC32A1 |
| SDHA | |
| SIK1 | |
| CSTB | |
| PIGV | |
| SLC25A19 | |
| SLC32A1 | |
| TERT | |
| TSEN54 | |

| FROH > 0.5 Mb Intervals | Number of samples |
|---|---|
| ]0.000, 0.004] | 3,104 |
| ]0.004, 0.008] | 210 |
| ]0.008, 0.010] | 156 |
| ]0.010, 0.018] | 107 |
| ]0.018, 0.034] | 124 |
| ]0.034, 0.088] | 99 |

| FROH > 1.5 Mb Intervals | Number of samples |
|---|---|
| ]0.000, 0.003] | 1,430 |
| ]0.003, 0.005] | 196 |
| ]0.005, 0.009] | 162 |
| ]0.009, 0.017] | 107 |
| ]0.017, 0.033] | 126 |
| ]0.033, 0.085] | 89 |

| FROH > 5 Mb Intervals | Number of samples |
|---|---|
| ]0.000, 0.002] | 38 |
| ]0.002, 0.004] | 318 |
| ]0.004, 0.008] | 146 |
| ]0.008, 0.016] | 113 |
| ]0.016, 0.032] | 91 |
| ]0.032, 0.074] | 52 |

| | Mean FROH | Mean FROH of means per municipality | FROH comparative values [87] |
|---|---|---|---|
| FROH > 0.5 Mb | 0.0042 | 0.0057 | 0.0315 |
| FROH > 1.5 Mb | 0.0033 | 0.0049 | 0.0021 |
| FROH > 5 Mb | 0.0020 | 0.0039 | 0.0001 |

| District | FROH > 0.5 Mb | FROH > 1.5 Mb | FROH > 5.0 Mb | Number of consanguineous marriages (10 000) [88] |
|---|---|---|---|---|
| Açores | 0.0046 | 0.0035 | 0.0023 | 78.7 |
| Aveiro | 0.0052 | 0.0041 | 0.0024 | 22.1 |
| Beja | 0.0048 | 0.0039 | 0.0022 | 22.8 |
| Braga | 0.0034 | 0.0025 | 0.0013 | 19.2 |
| Bragança | 0.0102 | 0.0090 | 0.0060 | 52.7 |
| Castelo Branco | 0.0066 | 0.0054 | 0.0034 | 19.9 |
| Coimbra | 0.0063 | 0.0053 | 0.0036 | 38.2 |
| Évora | 0.0039 | 0.0028 | 0.0016 | 34.5 |
| Faro | 0.0029 | 0.0019 | 0.0010 | 27.2 |
| Guarda | 0.0048 | 0.0038 | 0.0024 | 35.3 |
| Leiria | 0.0058 | 0.0047 | 0.0030 | 35.1 |
| Lisboa | 0.0039 | 0.0030 | 0.0017 | 20.2 |
| Madeira | 0.0077 | 0.0068 | 0.0041 | 133.6 |
| Portalegre | 0.0106 | 0.0092 | 0.0070 | 24.8 |
| Porto | 0.0026 | 0.0018 | 0.0010 | 14.4 |
| Santarém | 0.0056 | 0.0045 | 0.0032 | 27.6 |
| Setúbal | 0.0038 | 0.0029 | 0.0019 | 30.1 |
| Viana do Castelo | 0.0030 | 0.0021 | 0.0010 | 17.8 |
| Vila Real | 0.0070 | 0.0060 | 0.0037 | 38.3 |
| Viseu | 0.0105 | 0.0091 | 0.0059 | 38.7 |

| Dataset Tier (Feature Set) | Best Contamination Hyperparameter | Validation F1-Score | Test F1-Score |
|---|---|---|---|
| Tier 0 (Count_x, Sum_x) | 0.0786 | 0.9310 | 0.9412 |
| Tier 1 (Count_x, Sum_x, Min_x, Max_x) | 0.1190 | 0.9655 | 0.9434 |
| Tier 2 (Count_x, Sum_x, Min_x, Max_x, Mean_x, STD_x) | 0.1061 | 0.9474 | 0.9615 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).