Submitted: 28 July 2023
Posted: 02 August 2023
Abstract
Keywords:
MSC: 68T30
1. Introduction
- Edge-counting techniques [44] evaluate semantic similarity by counting the edges and nodes that separate two concepts (nodes) within the semantic representation structure. The technique is defined primarily for taxonomic relationships in a semantic network.
- Information content-based approaches assess similarity by applying a probabilistic model. They take the concepts of an ontology as input and employ an information content function to determine their similarity values in the ontology [41,54,55]. The literature bases the information content computation on the distribution of tagged concepts in corpora. Obtaining information content from concepts relies on structured and formal methods based on knowledge discovery [31,56,57,58].
- Feature-based methods assess similarity values employing the whole set of conventional and non-conventional features through a weighted sum of these items [19,59]. Thus, Sánchez et al. (2012) [4] designed a model in which both non-taxonomic and taxonomic relationships are considered. Moreover, [34,60] proposed using interpretations of concepts retrieved from a thesaurus. As a result, edge-counting techniques improve, since the evaluation considers a semantic reinforcement. However, they do not consider non-taxonomic properties, because these rarely appear in an ontology [61], and they demand fine-tuning of the weighting variables to merge diverse semantic reinforcements [60]. Additionally, the edge-counting techniques examine similarity in terms of the shortest path, measured as the number of taxonomic links separating two concepts in an ontology [42,44,62,63].
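
To make the first two families above concrete, the following is a minimal sketch and not the method proposed in this paper. It assumes a hypothetical toy taxonomy (`PARENT`) and hypothetical corpus counts (`COUNTS`), neither of which comes from the source. Edge counting is shown as the number of is-a links on the path through the lowest common subsumer, and the information content-based family is illustrated with the Resnik-style definition IC(c) = -log p(c) [43].

```python
import math

# Hypothetical toy taxonomy: child -> parent ("is-a" links). Not from the paper.
PARENT = {
    "car": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "artifact",
    "fork": "artifact",
    "artifact": "entity",
}

# Hypothetical corpus counts of direct mentions of each concept.
COUNTS = {"car": 40, "bicycle": 20, "vehicle": 10,
          "fork": 20, "artifact": 5, "entity": 5}


def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def lowest_common_subsumer(a, b):
    """First concept on a's path to the root that also subsumes b."""
    up_b = ancestors(b)
    return next(c for c in ancestors(a) if c in up_b)


def edge_distance(a, b):
    """Edge counting: is-a links on the path from a to b through their subsumer."""
    common = lowest_common_subsumer(a, b)
    return ancestors(a).index(common) + ancestors(b).index(common)


def information_content(concept):
    """IC(c) = -log p(c), where p(c) counts c and every concept it subsumes."""
    total = sum(COUNTS.values())
    subsumed = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return -math.log(subsumed / total)


def resnik_similarity(a, b):
    """Resnik-style similarity: IC of the lowest common subsumer."""
    return information_content(lowest_common_subsumer(a, b))


print(edge_distance("car", "bicycle"), edge_distance("car", "fork"))
print(resnik_similarity("car", "bicycle"), resnik_similarity("car", "fork"))
```

In this toy setting both views agree that "car" is closer to "bicycle" than to "fork": the path is shorter, and the shared subsumer ("vehicle") is more informative than "artifact".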
2. Related work
2.1. Semantic similarity
2.2. Information content computation
2.3. The Wikipedia corpus
- Articles. Wikipedia’s primary information unit is an article composed of free text that follows a detailed set of editorial and structural rules to ensure consistency and coherence. Each article covers a single concept, and each concept has its own article. Article titles are concise phrases, arranged systematically as in a formal thesaurus. Wikipedia relies on collaborative efforts from its users to gather information.
- Redirect pages are documents that contain nothing more than a direct link to another page. These pages redirect the request to the appropriate article page containing information about the object specified in the request. They map different phrasings of an entity to the same article and thus model synonymy.
- Disambiguation pages collect links to the various potential entities to which the original query could refer. These pages allow users to select the intended meaning. They serve as a mechanism for modeling homonymy.
- Hyperlinks are pointers to Wikipedia pages and serve as additional sources of synonyms missed by the redirecting process. They eliminate ambiguity by encoding polysemy. Articles refer to related entries in other dictionaries and encyclopedias through hyperlinks, which form a cross-reference model.
- The category structure in Wikipedia is a semantic network organized into groups (categories). Articles are assigned to one or more categories, which are in turn grouped together and organized into a “category tree”. This “tree” is not designed as a formal hierarchy; it accommodates several classification schemes simultaneously. The tree is implemented as a directed acyclic graph, so a category may have more than one parent. Categories therefore serve only as organizational nodes with minimal explanatory content.
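
The page types listed above can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the class `WikiSketch`, its field names, and the sample titles are hypothetical and do not correspond to any Wikipedia API or to the implementation used in this work. Redirects are modeled as a title-to-title map (synonymy), disambiguation pages as a title-to-candidates map (homonymy), and categories as a directed acyclic graph in which a node may have several parents.

```python
from dataclasses import dataclass, field


@dataclass
class WikiSketch:
    """Hypothetical in-memory model of the Wikipedia page types."""
    articles: set = field(default_factory=set)
    redirects: dict = field(default_factory=dict)       # alias -> article title
    disambiguation: dict = field(default_factory=dict)  # title -> candidate articles
    parents: dict = field(default_factory=dict)         # page/category -> parent categories

    def resolve(self, title):
        """Return the articles a title may refer to (empty list if unknown)."""
        if title in self.redirects:
            title = self.redirects[title]
        if title in self.disambiguation:
            return sorted(self.disambiguation[title])
        return [title] if title in self.articles else []

    def ancestor_categories(self, page):
        """Collect every category reachable upward through the category graph."""
        seen, stack = set(), list(self.parents.get(page, ()))
        while stack:
            category = stack.pop()
            if category not in seen:
                seen.add(category)
                stack.extend(self.parents.get(category, ()))
        return seen


# Illustrative data only.
wiki = WikiSketch(
    articles={"Automobile", "Crane (bird)", "Crane (machine)"},
    redirects={"Car": "Automobile"},
    disambiguation={"Crane": {"Crane (bird)", "Crane (machine)"}},
    parents={
        "Automobile": {"Category:Vehicles"},
        "Category:Vehicles": {"Category:Transport", "Category:Machines"},
    },
)

print(wiki.resolve("Car"))
print(wiki.resolve("Crane"))
print(wiki.ancestor_categories("Automobile"))
```

Because the category graph is a DAG rather than a tree, the upward traversal keeps a `seen` set so that a category reachable through several parents is visited only once.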
3. Methods and materials
3.1. The DIS-C algorithm for information content computation
3.2. Generality
3.3. Corpus used for the testing: Wikipedia and WordNet
4. Results and discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Harispe, S.; Sánchez, D.; Ranwez, S.; Janaqi, S.; Montmain, J. A framework for unifying ontology-based semantic similarity measures: A study in the biomedical domain. Journal of Biomedical Informatics 2014, 48, 38–53.
2. Goldstone, R.L. Similarity, interactive activation, and mapping. Journal of Experimental Psychology: Learning, Memory, and Cognition 1994, 20, 3.
3. Sánchez, D.; Batet, M. A semantic similarity method based on information content exploiting multiple ontologies. Expert Systems with Applications 2013, 40, 1393–1399.
4. Sánchez, D.; Solé-Ribalta, A.; Batet, M.; Serratosa, F. Enabling semantic similarity estimation across multiple ontologies: an evaluation in the biomedical domain. Journal of Biomedical Informatics 2012, 45, 141–155.
5. Rodríguez, M.; Egenhofer, M. Comparing geospatial entity classes: an asymmetric and context-dependent similarity measure. International Journal of Geographical Information Science 2004, 18, 229–256.
6. Schwering, A.; Raubal, M. Measuring semantic similarity between geospatial conceptual regions. In GeoSpatial Semantics; Springer, 2005; pp. 90–106.
7. Wang, H.; Wang, W.; Yang, J.; Yu, P.S. Clustering by pattern similarity in large data sets. Proceedings of the 2002 ACM SIGMOD international conference on Management of data. ACM, 2002, pp. 394–405.
8. Al-Mubaid, H.; Nguyen, H.; et al. A cluster-based approach for semantic similarity in the biomedical domain. Engineering in Medicine and Biology Society, 2006. EMBS’06. 28th Annual International Conference of the IEEE. IEEE, 2006, pp. 2713–2717.
9. Al-Mubaid, H.; Nguyen, H.; et al. Measuring semantic similarity between biomedical concepts within multiple ontologies. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on 2009, 39, 389–398.
10. Budanitsky, A.; Hirst, G. Evaluating WordNet-Based Measures of Semantic Distance. Computational Linguistics 2006, 32, 13–47.
11. Hliaoutakis, A.; Varelas, G.; Voutsakis, E.; Petrakis, E.G.; Milios, E. Information retrieval by semantic similarity. International Journal on Semantic Web and Information Systems 2006, 2, 55–73.
12. Kumar, S.; Baliyan, N.; Sukalikar, S. Ontology Cohesion and Coupling Metrics. International Journal on Semantic Web and Information Systems (IJSWIS) 2017, 13, 1–26.
13. Pirrò, G.; Ruffolo, M.; Talia, D. SECCO: on building semantic links in Peer-to-Peer networks. In Journal on Data Semantics XII; Springer, 2009; pp. 1–36.
14. Meilicke, C.; Stuckenschmidt, H.; Tamilin, A. Repairing ontology mappings. AAAI 2007, 3, 6.
15. Tapeh, A.G.; Rahgozar, M. A knowledge-based question answering system for B2C eCommerce. Knowledge-Based Systems 2008, 21, 946–950.
16. Patwardhan, S.; Banerjee, S.; Pedersen, T. Using measures of semantic relatedness for word sense disambiguation. In Computational linguistics and intelligent text processing; Springer, 2003; pp. 241–257.
17. Sinha, R.; Mihalcea, R. Unsupervised graph-based word sense disambiguation using measures of word semantic similarity. IEEE, 2007, pp. 363–369.
18. Blanco-Fernández, Y.; Pazos-Arias, J.J.; Gil-Solla, A.; Ramos-Cabrer, M.; López-Nores, M.; García-Duque, J.; Fernández-Vilas, A.; Díaz-Redondo, R.P.; Bermejo-Muñoz, J. A flexible semantic inference methodology to reason about user preferences in knowledge-based recommender systems. Knowledge-Based Systems 2008, 21, 305–320.
19. Likavec, S.; Osborne, F.; Cena, F. Property-based semantic similarity and relatedness for improving recommendation accuracy and diversity. International Journal on Semantic Web and Information Systems (IJSWIS) 2015, 11, 1–40.
20. Atkinson, J.; Ferreira, A.; Aravena, E. Discovering implicit intention-level knowledge from natural-language texts. Knowledge-Based Systems 2009, 22, 502–508.
21. Sánchez, D.; Isern, D. Automatic extraction of acronym definitions from the Web. Applied Intelligence 2011, 34, 311–327.
22. Stevenson, M.; Greenwood, M.A. A semantic approach to IE pattern induction. Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 2005, pp. 379–386.
23. Rissland, E.L. AI and similarity. IEEE Intelligent Systems 2006, pp. 39–49.
24. Fonseca, F. Ontology-Based Geospatial Data Integration. In Encyclopedia of GIS; 2008; pp. 812–815.
25. Kastrati, Z.; Imran, A.S.; Yildirim-Yayilgan, S. SEMCON: a semantic and contextual objective metric for enriching domain ontology concepts. International Journal on Semantic Web and Information Systems (IJSWIS) 2016, 12, 1–24.
26. Sánchez, D. A methodology to learn ontological attributes from the Web. Data & Knowledge Engineering 2010, 69, 573–597.
27. Song, W.; Li, C.H.; Park, S.C. Genetic algorithm for text clustering using ontology and evaluating the validity of various semantic similarity measures. Expert Systems with Applications 2009, 36, 9095–9104.
28. Batet, M.; Sánchez, D.; Valls, A. An ontology-based measure to compute semantic similarity in biomedicine. Journal of Biomedical Informatics 2011, 44, 118–125.
29. Couto, F.M.; Silva, M.J.; Coutinho, P.M. Measuring semantic similarity between Gene Ontology terms. Data & Knowledge Engineering 2007, 61, 137–152.
30. Pedersen, T.; Pakhomov, S.V.; Patwardhan, S.; Chute, C.G. Measures of semantic similarity and relatedness in the biomedical domain. Journal of Biomedical Informatics 2007, 40, 288–299.
31. Sánchez, D.; Batet, M. Semantic similarity estimation in the biomedical domain: An ontology-based information-theoretic perspective. Journal of Biomedical Informatics 2011, 44, 749–759.
32. Moreno, M. Similitud Semantica entre Sistemas de Objetos Geograficos Aplicada a la Generalizacion de Datos Geo-espaciales. PhD thesis, 2007.
33. Nedas, K.; Egenhofer, M. Spatial-Scene Similarity Queries. Transactions in GIS 2008, 12, 661–681.
34. Rodríguez, M.A.; Egenhofer, M.J. Determining semantic similarity among entity classes from different ontologies. Knowledge and Data Engineering, IEEE Transactions on 2003, 15, 442–456.
35. Sheeren, D.; Mustière, S.; Zucker, J.D. A data mining approach for assessing consistency between multiple representations in spatial databases. International Journal of Geographic Information Science 2009, 23, 961–992.
36. Goldstone, R.L.; Medin, D.L.; Halberstadt, J. Similarity in context. Memory & Cognition 1997, 25, 237–255.
37. Miller, G.A. WordNet: a lexical database for English. Communications of the ACM 1995, 38, 39–41.
38. Tversky, A.; Gati, I. Studies of similarity. Cognition and categorization 1978, 1, 79–98.
39. Chu, H.C.; Chen, M.Y.; Chen, Y.M. A semantic-based approach to content abstraction and annotation for content management. Expert Systems with Applications 2009, 36, 2360–2376.
40. Sánchez, D.; Isern, D.; Millan, M. Content annotation for the semantic web: an automatic web-based approach. Knowledge and Information Systems 2011, 27, 393–418.
41. Jiang, J.J.; Conrath, D.W. Semantic similarity based on corpus statistics and lexical taxonomy. Proceedings of the international conference on research in computational linguistics, 1997, pp. 19–33.
42. Wu, Z.; Palmer, M. Verbs semantics and lexical selection. Proceedings of the 32nd annual meeting on Association for Computational Linguistics. Association for Computational Linguistics, 1994, pp. 133–138.
43. Resnik, P. Using information content to evaluate semantic similarity in a taxonomy. arXiv 1995, arXiv:9511007.
44. Rada, R.; Mili, H.; Bicknell, E.; Blettner, M. Development and application of a metric on semantic nets. Systems, Man and Cybernetics, IEEE Transactions on 1989, 19, 17–30.
45. Jiang, Y.; Bai, W.; Zhang, X.; Hu, J. Wikipedia-based information content and semantic similarity computation. Information Processing & Management 2017, 53, 248–265.
46. Mathur, S.; Dinakarpandian, D. Finding disease similarity based on implicit semantic similarity. Journal of Biomedical Informatics 2012, 45, 363–371.
47. Batet, M.; Sánchez, D.; Valls, A.; Gibert, K. Semantic similarity estimation from multiple ontologies. Applied Intelligence 2013, 38, 29–44.
48. Ahsaee, M.G.; Naghibzadeh, M.; Naeini, S.E.Y. Semantic similarity assessment of words using weighted WordNet. International Journal of Machine Learning and Cybernetics 2014, 5, 479–490.
49. Liu, H.; Bao, H.; Xu, D. Concept vector for semantic similarity and relatedness based on WordNet structure. Journal of Systems and Software 2012, 85, 370–381.
50. Maguitman, A.G.; Menczer, F.; Erdinc, F.; Roinestad, H.; Vespignani, A. Algorithmic computation and approximation of semantic similarity. World Wide Web 2006, 9, 431–456.
51. Medelyan, O.; Milne, D.; Legg, C.; Witten, I.H. Mining meaning from Wikipedia. International Journal of Human-Computer Studies 2009, 67, 716–754.
52. Pirró, G. A semantic similarity metric combining features and intrinsic information content. Data & Knowledge Engineering 2009, 68, 1289–1308.
53. Meng, L.; Huang, R.; Gu, J. A review of semantic similarity measures in wordnet. International Journal of Hybrid Information Technology 2013, 6, 1–12.
54. Lin, D. An information-theoretic definition of similarity. ICML 1998, 98, 296–304.
55. Resnik, P. Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. J. Artif. Intell. Res. (JAIR) 1999, 11, 95–130.
56. Sánchez, D.; Batet, M.; Isern, D. Ontology-based information content computation. Knowledge-Based Systems 2011, 24, 297–303.
57. Seco, N.; Veale, T.; Hayes, J. An intrinsic information content metric for semantic similarity in WordNet. ECAI 2004, 16, 1089.
58. Zhou, Z.; Wang, Y.; Gu, J. A new model of information content for semantic similarity in WordNet. Future Generation Communication and Networking Symposia, 2008. FGCNS’08. Second International Conference on. IEEE 2008, 3, 85–89.
59. Sánchez, D.; Batet, M.; Isern, D.; Valls, A. Ontology-based semantic similarity: A new feature-based approach. Expert Systems with Applications 2012, 39, 7718–7728.
60. Petrakis, E.G.; Varelas, G.; Hliaoutakis, A.; Raftopoulou, P. X-similarity: computing semantic similarity between concepts from different ontologies. JDIM 2006, 4, 233–237.
61. Ding, L.; Finin, T.; Joshi, A.; Pan, R.; Cost, R.S.; Peng, Y.; Reddivari, P.; Doshi, V.; Sachs, J. Swoogle: a search and metadata engine for the semantic web. Proceedings of the thirteenth ACM international conference on Information and knowledge management. ACM, 2004, pp. 652–659.
62. Leacock, C.; Chodorow, M. Combining local context and WordNet similarity for word sense identification. WordNet: An electronic lexical database 1998, 49, 265–283.
63. Li, Y.; Bandar, Z.; McLean, D.; et al. An approach for measuring semantic similarity between words using multiple information sources. Knowledge and Data Engineering, IEEE Transactions on 2003, 15, 871–882.
64. Schickel-Zuber, V.; Faltings, B. OSS: A Semantic Similarity Function based on Hierarchical Ontologies. IJCAI 2007, 7, 551–556.
65. Schwering, A. Hybrid model for semantic similarity measurement. In On the Move to Meaningful Internet Systems 2005: CoopIS, DOA, and ODBASE; Springer, 2005; pp. 1449–1465.
66. Martinez-Gil, J.; Aldana-Montes, J.F. Semantic similarity measurement using historical google search patterns. Information Systems Frontiers 2013, 15, 399–410.
67. Retzer, S.; Yoong, P.; Hooper, V. Inter-organisational knowledge transfer in social networks: A definition of intermediate ties. Information Systems Frontiers 2012, 14, 343–361.
68. Quintero, R.; Torres-Ruiz, M.; Menchaca-Mendez, R.; Moreno-Armendariz, M.A.; Guzman, G.; Moreno-Ibarra, M. DIS-C: conceptual distance in ontologies, a graph-based approach. Knowledge and Information Systems 2019, 59, 33–65.
69. Torres, M.; Quintero, R.; Moreno-Ibarra, M.; Menchaca-Mendez, R.; Guzman, G. GEONTO-MET: An Approach to Conceptualizing the Geographic Domain. International Journal of Geographic Information Science 2011, 25, 1633–1657.
70. Zadeh, P.D.H.; Reformat, M.Z. Assessment of semantic similarity of concepts defined in ontology. Information Sciences 2013, 250, 21–39.
71. Albertoni, R.; De Martino, M. Semantic similarity of ontology instances tailored on the application context. In On the Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE; Springer, 2006; pp. 1020–1038.
72. Li, Y.; McLean, D.; Bandar, Z.; O’shea, J.D.; Crockett, K.; et al. Sentence similarity based on semantic nets and corpus statistics. Knowledge and Data Engineering, IEEE Transactions on 2006, 18, 1138–1150.
73. Cilibrasi, R.L.; Vitanyi, P. The google similarity distance. Knowledge and Data Engineering, IEEE Transactions on 2007, 19, 370–383.
74. Bollegala, D.; Matsuo, Y.; Ishizuka, M. Measuring semantic similarity between words using web search engines. WWW 2007, 7, 757–766.
75. Miller, G.A.; Charles, W.G. Contextual correlates of semantic similarity. Language and Cognitive Processes 1991, 6, 1–28.
76. Sánchez, D.; Moreno, A.; Del Vasto-Terrientes, L. Learning relation axioms from text: An automatic Web-based approach. Expert Systems with Applications 2012, 39, 5792–5805.
77. Saruladha, K.; Aghila, G.; Bhuvaneswary, A. Information content based semantic similarity for cross ontological concepts. International Journal of Engineering Science and Technology 2011, 3.
78. Formica, A. Ontology-based concept similarity in formal concept analysis. Information Sciences 2006, 176, 2624–2641.
79. Albacete, E.; Calle-Gómez, J.; Castro, E.; Cuadra, D. Semantic Similarity Measures Applied to an Ontology for Human-Like Interaction. J. Artif. Intell. Res. (JAIR) 2012, 44, 397–421.
80. Goldstone, R. An efficient method for obtaining similarity data. Behavior Research Methods, Instruments, & Computers 1994, 26, 381–386.
81. Niles, I.; Pease, A. Towards a standard upper ontology. Proceedings of the international conference on Formal Ontology in Information Systems-Volume 2001. ACM, 2001, pp. 2–9.
82. Fellbaum, C. WordNet: An electronic lexical database; MIT Press, Cambridge, MA, 1998.
83. Jain, P.; Yeh, P.Z.; Verma, K.; Vasquez, R.G.; Damova, M.; Hitzler, P.; Sheth, A.P. Contextual ontology alignment of lod with an upper ontology: A case study with proton. In The Semantic Web: Research and Applications; Springer, 2011; pp. 80–92.
84. Héja, G.; Surján, G.; Varga, P. Ontological analysis of SNOMED CT. BMC Medical Informatics and Decision Making 2008, 8, S8.
85. Gene Ontology Consortium. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Research 2004, 32, D258–D261.
86. Gangemi, A.; Guarino, N.; Masolo, C.; Oltramari, A.; Schneider, L. Sweetening ontologies with DOLCE. In Knowledge engineering and knowledge management: Ontologies and the semantic Web; Springer, 2002; pp. 166–181.
87. Buggenhout, C.V.; Ceusters, W. A novel view on information content of concepts in a large ontology and a view on the structure and the quality of the ontology. International Journal of Medical Informatics 2005, 74, 125–132.
88. Ponzetto, S.P.; Strube, M. Knowledge derived from Wikipedia for computing semantic relatedness. Journal of Artificial Intelligence Research 2007, 30, 181–212.
89. Ittoo, A.; Bouma, G. Minimally-supervised extraction of domain-specific part–whole relations using Wikipedia as knowledge-base. Data & Knowledge Engineering 2013, 85, 57–79.
90. Kaptein, R.; Kamps, J. Exploiting the category structure of Wikipedia for entity ranking. Artificial Intelligence 2013, 194, 111–129.
91. Nothman, J.; Ringland, N.; Radford, W.; Murphy, T.; Curran, J.R. Learning multilingual named entity recognition from Wikipedia. Artificial Intelligence 2013, 194, 151–175.
92. Sorg, P.; Cimiano, P. Exploiting Wikipedia for cross-lingual and multilingual information retrieval. Data & Knowledge Engineering 2012, 74, 26–45.
93. Yazdani, M.; Popescu-Belis, A. Computing text semantic relatedness using the contents and links of a hypertext encyclopedia. Artificial Intelligence 2013, 194, 176–202.
94. Hirst, G.; St-Onge, D.; et al. Lexical chains as representations of context for the detection and correction of malapropisms. WordNet: An electronic lexical database 1998, 305, 305–332.
95. Rubenstein, H.; Goodenough, J.B. Contextual correlates of synonymy. Communications of the ACM 1965, 8, 627–633.
96. Jarmasz, M.; Szpakowicz, S. Roget’s Thesaurus and Semantic Similarity. Proceedings of the International Conference on Recent Advances in Natural Language Processing 2003, pp. 212–219.
Note 1: Conceptual distance is not symmetric; that is, the distance from term A to term B generally differs from the distance from term B to term A. We therefore report the conceptual distance from term A to term B (DIS-C(to) column), from term B to term A (DIS-C(from) column), the average of these distances (DIS-C(avg) column), the minimum (DIS-C(min) column), and the maximum (DIS-C(max) column).
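
The five DIS-C columns can be derived from the two directional distances. The sketch below assumes that the average is the arithmetic mean of the two directions; this is consistent with the asylum/madhouse row (1.22 and 1.64 give 1.43), but not every tabulated row matches it after rounding, so it should be read as an assumption rather than as the definition used by the authors. The function name `summarize` is hypothetical.

```python
def summarize(dist_to, dist_from):
    """Combine two directional distances into the five reported columns.

    Assumption: "avg" is the arithmetic mean of the two directions.
    """
    return {
        "to": dist_to,
        "from": dist_from,
        "avg": (dist_to + dist_from) / 2,
        "min": min(dist_to, dist_from),
        "max": max(dist_to, dist_from),
    }


# Directional distances for the asylum/madhouse pair, copied from the table.
row = summarize(1.22, 1.64)
print({name: round(value, 2) for name, value in row.items()})
```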



| Word A | Word B | Miller and Charles (1991) [75] | WordNet edges | Hirst et al. (1998) [94] | Jiang and Conrath (1997) [41] | Leacock and Chodorow (1998) [62] | Lin (1998) [54] | Resnik (1995) [43] | DIS-C(to) | DIS-C(from) | DIS-C(avg) | DIS-C(min) | DIS-C(max) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| asylum | madhouse | 3.61 | 29.00 | 4.00 | 0.66 | 2.77 | 0.98 | 11.28 | 1.22 | 1.64 | 1.43 | 1.22 | 1.64 |
| bird | cock | 3.05 | 29.00 | 6.00 | 0.16 | 2.77 | 0.69 | 5.98 | 0.63 | 0.33 | 0.48 | 0.33 | 0.63 |
| bird | crane | 2.97 | 27.00 | 5.00 | 0.14 | 2.08 | 0.66 | 5.98 | 1.51 | 1.35 | 1.43 | 1.35 | 1.51 |
| boy | lad | 3.76 | 29.00 | 5.00 | 0.23 | 2.77 | 0.82 | 7.77 | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 |
| brother | monk | 2.82 | 29.00 | 4.00 | 0.29 | 2.77 | 0.90 | 10.49 | 0.33 | 0.63 | 0.48 | 0.33 | 0.63 |
| car | automobile | 3.92 | 30.00 | 16.00 | 1.00 | 3.47 | 1.00 | 6.34 | 1.26 | 0.59 | 0.92 | 0.59 | 1.26 |
| cemetery | woodland | 0.95 | 21.00 | 0.00 | 0.05 | 1.16 | 0.07 | 0.70 | 3.21 | 2.49 | 2.85 | 2.49 | 3.21 |
| chord | smile | 0.13 | 20.00 | 0.00 | 0.07 | 1.07 | 0.29 | 2.89 | 2.67 | 3.95 | 3.31 | 2.67 | 3.95 |
| coast | forest | 0.42 | 24.00 | 0.00 | 0.06 | 1.52 | 0.12 | 1.18 | 1.84 | 2.89 | 2.37 | 1.84 | 2.89 |
| coast | hill | 0.87 | 26.00 | 2.00 | 0.15 | 1.86 | 0.69 | 6.38 | 1.22 | 1.58 | 1.40 | 1.22 | 1.58 |
| coast | shore | 3.70 | 29.00 | 4.00 | 0.65 | 2.77 | 0.97 | 8.97 | 0.33 | 0.63 | 0.48 | 0.33 | 0.63 |
| crane | implement | 1.68 | 26.00 | 3.00 | 0.09 | 1.86 | 0.39 | 3.44 | 1.55 | 1.82 | 1.69 | 1.55 | 1.82 |
| food | fruit | 3.08 | 23.00 | 0.00 | 0.09 | 1.39 | 0.12 | 0.70 | 0.85 | 1.58 | 1.21 | 0.85 | 1.58 |
| food | rooster | 0.89 | 17.00 | 0.00 | 0.06 | 0.83 | 0.09 | 0.70 | 2.10 | 1.94 | 2.02 | 1.94 | 2.10 |
| forest | graveyard | 0.84 | 21.00 | 0.00 | 0.05 | 1.16 | 0.07 | 0.70 | 2.27 | 1.55 | 1.91 | 1.55 | 2.27 |
| furnace | stove | 3.11 | 23.00 | 5.00 | 0.06 | 1.39 | 0.24 | 2.43 | 1.26 | 0.62 | 0.94 | 0.62 | 1.26 |
| gem | jewel | 3.84 | 30.00 | 16.00 | 1.00 | 3.47 | 1.00 | 12.89 | 0.58 | 1.31 | 0.94 | 0.58 | 1.31 |
| glass | magician | 0.11 | 23.00 | 0.00 | 0.06 | 1.39 | 0.12 | 1.18 | 2.08 | 2.58 | 2.33 | 2.08 | 2.58 |
| journey | car | 1.16 | 17.00 | 0.00 | 0.08 | 0.83 | 0.00 | 0.00 | 1.24 | 1.59 | 1.42 | 1.24 | 1.59 |
| journey | voyage | 3.84 | 29.00 | 4.00 | 0.17 | 2.77 | 0.70 | 6.06 | 0.26 | 0.68 | 0.47 | 0.26 | 0.68 |
| lad | brother | 1.66 | 26.00 | 3.00 | 0.07 | 1.86 | 0.27 | 2.46 | 1.55 | 2.16 | 1.85 | 1.55 | 2.16 |
| lad | wizard | 0.42 | 26.00 | 3.00 | 0.07 | 1.86 | 0.27 | 2.46 | 1.55 | 2.23 | 1.89 | 1.55 | 2.23 |
| magician | wizard | 3.50 | 30.00 | 16.00 | 1.00 | 3.47 | 1.00 | 9.71 | 0.94 | 0.94 | 0.94 | 0.94 | 0.94 |
| midday | noon | 3.42 | 30.00 | 16.00 | 1.00 | 3.47 | 1.00 | 10.58 | 0.95 | 0.95 | 0.95 | 0.95 | 0.95 |
| monk | oracle | 1.10 | 23.00 | 0.00 | 0.06 | 1.39 | 0.23 | 2.46 | 2.78 | 2.49 | 2.63 | 2.49 | 2.78 |
| monk | slave | 0.55 | 26.00 | 3.00 | 0.06 | 1.86 | 0.25 | 2.46 | 1.90 | 1.47 | 1.69 | 1.47 | 1.90 |
| noon | string | 0.08 | 19.00 | 0.00 | 0.05 | 0.98 | 0.00 | 0.00 | 2.49 | 2.86 | 2.68 | 2.49 | 2.86 |
| rooster | voyage | 0.08 | 11.00 | 0.00 | 0.04 | 0.47 | 0.00 | 0.00 | 2.53 | 3.10 | 2.81 | 2.53 | 3.10 |
| shore | woodland | 0.63 | 25.00 | 2.00 | 0.06 | 1.67 | 0.12 | 1.18 | 1.92 | 1.92 | 1.92 | 1.92 | 1.92 |
| tool | implement | 2.95 | 29.00 | 4.00 | 0.55 | 2.77 | 0.94 | 6.00 | 0.68 | 0.26 | 0.47 | 0.26 | 0.68 |

| Measure | Correlation value |
|---|---|
| Miller and Charles (1991) [75] | 1.00 |
| WordNet edge counting | 0.73 |
| Hirst et al. (1998) [94] | 0.69 |
| Jiang and Conrath (1997) [41] | 0.70 |
| Leacock and Chodorow (1998) [62] | 0.82 |
| Lin (1998) [54] | 0.82 |
| Resnik (1995) [43] | 0.78 |
| Jiang et al. (2017) [45] | 0.82 |
| DIS-C - From word A to B | 0.80 |
| DIS-C - From word B to A | 0.81 |
| DIS-C - Average of distances | 0.84 |
| DIS-C - Min distance | 0.84 |
| DIS-C - Max distance | 0.83 |
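
Correlation values of this kind can be computed by comparing a measure's column with the human scores. The source does not state which coefficient was used, so the sketch below assumes Pearson's correlation and, to stay short, uses only five rows copied from the word-pair table (human score and DIS-C(avg)); it does not reproduce the reported values. Because DIS-C is a distance while the human scores express similarity, the raw coefficient on these rows is negative; the positive values in the table presumably reflect the magnitude or a distance-to-similarity conversion, which is also an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# (human score, DIS-C(avg)) for five word pairs copied from the table:
# asylum/madhouse, cemetery/woodland, chord/smile, car/automobile, noon/string.
rows = [(3.61, 1.43), (0.95, 2.85), (0.13, 3.31), (3.92, 0.92), (0.08, 2.68)]
human = [h for h, _ in rows]
distance = [d for _, d in rows]

r = pearson(human, distance)
print(f"Pearson r on the five-row subset: {r:.2f} (magnitude {abs(r):.2f})")
```
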

| Pair | Term A | Term B | Human scores | DIS-C(to) | DIS-C(from) | DIS-C(avg) | DIS-C(min) | DIS-C(max) |
|---|---|---|---|---|---|---|---|---|
| 1 | Action film | Science fiction film | 2.25 | 0.88 | 1.82 | 1.50 | 0.88 | 1.82 |
| 2 | Aircraft | Airliner | 2.98 | 2.16 | 0.92 | 1.76 | 0.92 | 2.16 |
| 3 | Egyptian pyramids | Great Wall of China | 1.62 | 1.74 | 1.88 | 1.81 | 1.74 | 1.88 |
| 4 | Artificial intelligence | Cloud computing | 1.28 | 1.36 | 1.36 | 1.36 | 1.36 | 1.36 |
| 5 | Blog | | 1.16 | 1.35 | 1.35 | 1.35 | 1.35 | 1.35 |
| 6 | Book | Paper | 1.78 | 1.76 | 1.76 | 1.76 | 1.76 | 1.76 |
| 7 | Computer | Internet | 2.25 | 1.89 | 1.56 | 1.74 | 1.56 | 1.89 |
| 8 | Financial crisis | Bank | 1.92 | 2.01 | 2.27 | 2.15 | 2.01 | 2.27 |
| 9 | Category:Educators | Category:Educational theorists | 3.23 | 2.73 | 3.17 | 2.97 | 2.73 | 3.17 |
| 10 | Food safety | Health education | 1.10 | 1.28 | 1.28 | 1.28 | 1.28 | 1.28 |
| 11 | Fruit | Food | 2.65 | 2.15 | 1.12 | 1.78 | 1.12 | 2.15 |
| 12 | Health | Wealth | 1.74 | 2.50 | 2.33 | 2.42 | 2.33 | 2.50 |
| 13 | Knowledge | Information | 2.99 | 2.24 | 1.20 | 1.86 | 1.20 | 2.24 |
| 14 | Laptop | Tablet computer | 2.99 | 2.17 | 2.17 | 2.17 | 2.17 | 2.17 |
| 15 | Law | Lawyer | 2.36 | 1.65 | 0.68 | 1.34 | 0.68 | 1.65 |
| 16 | Literature | Medicine | 0.48 | 0.69 | 0.69 | 0.69 | 0.69 | 0.69 |
| 17 | Mobile phone | Television | 1.12 | 1.23 | 1.23 | 1.23 | 1.23 | 1.23 |
| 18 | National Basketball Association | Athletic sport | 2.40 | 3.38 | 2.47 | 2.99 | 2.47 | 3.38 |
| 19 | PC game | Online game | 2.35 | 1.73 | 1.73 | 1.73 | 1.73 | 1.73 |
| 20 | People | Human | 2.46 | 1.95 | 0.98 | 1.61 | 0.98 | 1.95 |
| 21 | President | Civil servant | 2.03 | 2.26 | 2.23 | 2.25 | 2.23 | 2.26 |
| 22 | Public transport | Train | 2.62 | 1.97 | 0.88 | 1.61 | 0.88 | 1.97 |
| 23 | Religion | Monk | 2.56 | 2.12 | 2.12 | 2.12 | 2.12 | 2.12 |
| 24 | Scholar | Academia | 2.53 | 2.17 | 2.17 | 2.17 | 2.17 | 2.17 |
| 25 | Scholar | Academic | 3.77 | 2.80 | 2.80 | 2.80 | 2.80 | 2.80 |
| 26 | Social network | | 2.78 | 1.30 | 2.16 | 1.83 | 1.30 | 2.16 |
| 27 | Spring festival | Christmas | 2.19 | 2.18 | 2.51 | 2.35 | 2.18 | 2.51 |
| 28 | Swimming | Water sport | 2.62 | 2.04 | 2.04 | 2.04 | 2.04 | 2.04 |
| 29 | Transport | Car | 2.37 | 0.97 | 2.00 | 1.64 | 0.97 | 2.00 |
| 30 | Travel agency | Service industry | 1.96 | 2.77 | 2.59 | 2.68 | 2.59 | 2.77 |

| Term | g(term) | IC |
|---|---|---|
| Academic | 0.4749 | 0.7446 |
| Lawyer | 0.4740 | 0.7466 |
| Public transport | 0.4710 | 0.7529 |
| Scholar | 0.4707 | 0.7535 |
| Scholar | 0.4707 | 0.7535 |
| Christmas | 0.4705 | 0.7540 |
| Literature | 0.4694 | 0.7562 |
| Information | 0.4688 | 0.7577 |
| Blog | 0.4663 | 0.7630 |
| Law | 0.4656 | 0.7645 |
| Civil servant | 0.4655 | 0.7646 |
| | 0.4654 | 0.7648 |
| Airliner | 0.4650 | 0.7658 |
| Aircraft | 0.4619 | 0.7724 |
| Water sport | 0.4615 | 0.7734 |
| Book | 0.4613 | 0.7736 |
| Train | 0.4613 | 0.7737 |
| Service industry | 0.4554 | 0.7866 |
| Travel agency | 0.4547 | 0.7882 |
| Monk | 0.4547 | 0.7882 |
| Transport | 0.4538 | 0.7900 |
| Artificial intelligence | 0.4530 | 0.7919 |
| Human | 0.4505 | 0.7974 |
| Television | 0.4490 | 0.8008 |
| Computer | 0.4479 | 0.8033 |
| Internet | 0.4470 | 0.8052 |
| Mobile phone | 0.4469 | 0.8055 |
| Academia | 0.4445 | 0.8108 |
| Great Wall of China | 0.4444 | 0.8111 |
| Swimming | 0.4443 | 0.8112 |
| People | 0.4442 | 0.8115 |
| Laptop | 0.4441 | 0.8118 |
| Car | 0.4431 | 0.8141 |
| Fruit | 0.4416 | 0.8172 |
| President | 0.4413 | 0.8181 |
| Religion | 0.4410 | 0.8187 |
| National Basketball Association | 0.4393 | 0.8226 |
| Health | 0.4358 | 0.8307 |
| Paper | 0.4353 | 0.8316 |
| Food | 0.4351 | 0.8321 |
| Bank | 0.4349 | 0.8327 |
| Action film | 0.4197 | 0.8683 |
| Science fiction film | 0.4196 | 0.8683 |
| Online game | 0.4154 | 0.8784 |
| Knowledge | 0.4126 | 0.8853 |
| Cloud computing | 0.4112 | 0.8888 |
| Financial crisis | 0.4069 | 0.8993 |
| PC game | 0.4044 | 0.9053 |
| Category:Educators | 0.4028 | 0.9093 |
| Food safety | 0.3986 | 0.9197 |
| Category:Educational theorists | 0.3985 | 0.9200 |
| | 0.3857 | 0.9526 |
| Medicine | 0.3832 | 0.9591 |
| Wealth | 0.3829 | 0.9599 |
| Health education | 0.3747 | 0.9817 |
| Social network | 0.3697 | 0.9950 |
| Athletic sport | 0.3672 | 1.0020 |
| Spring festival | 0.3599 | 1.0220 |
| Tablet computer | 0.3125 | 1.1631 |
| Egyptian pyramids | 0.2726 | 1.2997 |
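
The two numeric columns above appear to be linked by a simple relation: the tabulated IC values are consistent, to within rounding, with the negative natural logarithm of the generality, IC = -ln g(term). The sketch below checks this on three rows copied from the table. The relation is inferred from the tabulated numbers, not quoted from the method description, and the function name `information_content` is hypothetical.

```python
import math


def information_content(generality):
    """Assumed relation: IC = -ln(g), inferred from the tabulated values."""
    return -math.log(generality)


# (g(term), tabulated IC) for three rows copied from the table.
samples = {
    "Academic": (0.4749, 0.7446),
    "Tablet computer": (0.3125, 1.1631),
    "Egyptian pyramids": (0.2726, 1.2997),
}

for term, (g, tabulated_ic) in samples.items():
    print(f"{term}: computed {information_content(g):.4f}, tabulated {tabulated_ic:.4f}")

# Largest absolute difference between computed and tabulated values.
max_gap = max(abs(information_content(g) - ic) for g, ic in samples.values())
```

Under this reading, more general terms (larger g) carry less information, which matches the ordering of the table: the rows are sorted by decreasing generality and increasing IC.
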

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
