Submitted:
30 July 2025
Posted:
31 July 2025
Abstract
Keywords:
1. Introduction
2. Literature Review
2.1. AI Ethics Literature Review
2.2. Chance Discovery Theory
- Establishing and uncovering innovative models and variables: Rather than relying on existing data models and variables, this approach incorporates contextual factors into the analysis to identify noteworthy variables emerging in specific situations, preventing the results from diverging from practical needs and enhancing the accuracy of chance detection.
- Identifying tail events: Tail events are rare events with a low frequency but profound influence on the system or domain. Undiscovered chances or phenomena can be identified by observing and analyzing such tail events.
- Relying on human–AI interaction for interpretation and judgment: Whether a tail event represents a genuine chance can be discerned by applying extensive human background knowledge and contextual sensitivity. This approach is necessary because the rarity and ambiguity of tail events make it challenging for fully automated data mining methods to assess their true value and significance accurately.
2.3. Double Helix Model: Human–Machine Collaborative Framework for Chance Discovery
- Human-driven process: New setting for analysis (inputting article data and initializing parameters). The researcher conducts this phase, which marks the starting point of the overall HCI cycle. Based on the research objectives, the researcher provides the original article dataset and employs the Polaris visualization tool to set the initial parameters for the KeyGraph algorithm, such as the number of default bridging nodes (represented as red nodes) and high-frequency black keyword nodes, establishing the foundation for an automated computer analysis (see Phase 1 in Figure 1).
- Computer-driven process: Data mining (data mining and keyword network construction). After the initial parameter setting, the process transitions to a computer-driven process, entering the phases of data mining and keyword network construction. At this stage, the computer autonomously executes the KeyGraph algorithm to conduct in-depth mining of the article dataset. The system constructs a co-occurrence network graph by calculating the co-occurrence frequency and structural relationships between keywords. This phase applies the KeyGraph algorithm to extract latent knowledge structures and keywords automatically from large-scale articles, providing a foundation for semantic interpretation (see Phase 2 in Figure 1).
- Computer-driven process: Visual results (network graph visualization). After data mining is completed, the computer transforms the keyword network generated by the KeyGraph algorithm into a visualized graph illustrating the connections (i.e., the co-occurrence relationships) between keywords, including the red nodes. This visual representation serves as a bridge between the computer and human user, converting abstract keyword associations into intuitive, interpretable images that facilitate information integration and semantic judgment. At this stage, the computer completes its intermediate task and awaits human intervention for further examination (see Phase 3 in Figure 1).
- Human-driven process: Understanding and re-evaluation (interpretation and review). This phase represents the core of human knowledge interpretation in the double helix model, highlighting the iterative nature of HCI. In this process, researchers do not directly engage in topic detection; instead, based on their domain expertise and semantic comprehension abilities, they systematically evaluate the topic detection and semantic interpretation results generated by ChatGPT from the keyword network visualizations constructed using the KeyGraph algorithm. Researchers examine the thematic and keyword relationships in the visualized graphs and assess whether ChatGPT’s initial interpretations are logically coherent, substantively meaningful, and effectively reveal the latent semantics in the articles. For instance, if ChatGPT produces an illogical output (e.g., “I beer”), researchers employ the Polaris visualization tool to adjust the node parameters of the KeyGraph algorithm (e.g., the number of high-frequency black keywords or red chance nodes) until the output becomes coherent and logically consistent (e.g., “I love to drink beer”). This ongoing process of understanding the output and continuously tuning parameters, which triggers new data mining and visual output, is the critical objective of the iterative cycles in the double helix model. Thus, this approach forms a continuously optimized spiral iteration process (see Phase 4 in Figure 1).
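The four phases above can be read as a loop that repeats until the human reviewer accepts the interpretation. The following is a minimal sketch of that control flow only, not the authors' implementation: the function `double_helix_cycle`, its callable arguments, and the parameter names `n_black` and `n_red` are hypothetical stand-ins for the Polaris settings, the KeyGraph run, the ChatGPT interpretation, and the researcher's judgment.

```python
def double_helix_cycle(run_keygraph, interpret, is_coherent, adjust, params, max_rounds=5):
    """Repeat mining, interpretation, and human review until the output is accepted.

    All five callables are hypothetical placeholders supplied by the caller.
    Returns (accepted_reading_or_None, history of (params, reading) per round).
    """
    history = []
    for _ in range(max_rounds):
        graph = run_keygraph(params)        # Phases 2-3: data mining and visualization
        reading = interpret(graph)          # interpretation generated from the graph
        history.append((dict(params), reading))
        if is_coherent(reading):            # Phase 4: human understanding and re-evaluation
            return reading, history
        params = adjust(params)             # Phase 1 again: new setting for analysis
    return None, history


# Toy stand-ins that mimic the "I beer" example: the reading becomes coherent
# once enough high-frequency (black) nodes are included.
reading, history = double_helix_cycle(
    run_keygraph=lambda p: p["n_black"],
    interpret=lambda n: "I love to drink beer" if n >= 30 else "I beer",
    is_coherent=lambda text: len(text.split()) > 2,
    adjust=lambda p: {**p, "n_black": p["n_black"] + 10},
    params={"n_black": 10, "n_red": 5},
)
print(reading, len(history))
```

In this toy run the reviewer rejects the first two readings and accepts the third, so the history records three parameter settings. The `max_rounds` cap is an added safeguard for the sketch; the text does not state a stopping rule other than reaching a coherent output.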
2.4. KeyGraph Algorithm Overview
2.4.1. KeyGraph Keyword Network Structure and Visualization
2.4.2. KeyGraph Algorithm
- Data preprocessing: Data preprocessing serves as the foundation for constructing a keyword network in the KeyGraph algorithm. In this study, this phase involves several steps, including tokenization, normalization, stop-word removal, and part-of-speech filtering. Tokenization and normalization establish a stable keyword base, whereas stop-word removal and part-of-speech filtering reduce semantic noise. Together, these steps improve the accuracy of the co-occurrence analysis and the quality of the network structure, which in turn supports topic detection and chance node identification [38,44,45].
- High-frequency keyword extraction: Based on the preprocessed data, a new dataset $D$ is generated, comprising a series of sentences, each representing a set of keywords. All keywords are ranked according to their frequency of occurrence in $D$, and the highest-frequency keywords are selected to form a high-frequency keyword set. These keywords serve as the nodes of the network clusters [38,44,45,46].
- Calculation formula for keyword network co-occurrence: In the KeyGraph algorithm adopted in this study, the co-occurrence relationship between keywords serves as the core basis for constructing the keyword network. Each keyword is regarded as a network node. When two keywords co-occur in the same semantic unit (e.g., a sentence or paragraph), a link (edge) is formed between the nodes. The KeyGraph algorithm employs a specific measure called the co-occurrence strength to quantify the co-occurrence relationship between keywords [38,47,48], calculated in Eq. (1):

  $$\mathrm{assoc}(w_i, w_j) = \sum_{s \in D} \min\left(|w_i|_s, |w_j|_s\right) \tag{1}$$

  where $\mathrm{assoc}(w_i, w_j)$ represents the co-occurrence strength between keywords $w_i$ and $w_j$ in all semantic units $s$ in dataset $D$. This measure is calculated by summing the minimum occurrence frequencies of the two keywords in the same semantic unit, reflecting their co-occurrence count. In addition, $|w_i|_s$ and $|w_j|_s$ denote the frequencies of keywords $w_i$ and $w_j$, respectively, in the semantic unit $s$, and $\min(|w_i|_s, |w_j|_s)$ indicates the minimum occurrence frequency between $w_i$ and $w_j$ in the semantic unit $s$. This metric aggregates the minimal occurrence counts across semantic units to capture the overall semantic linkage strength between the keyword pair. This approach helps construct a semantic backbone comprising high-frequency terms and reveals latent nodes that may have lower surface frequencies yet critical semantic significance.
- Co-occurrence measurement between keywords and keyword clusters: The KeyGraph algorithm employs a co-occurrence strength index called $\mathrm{based}(w, g)$, which measures the degree of co-occurrence within articles to calculate the connection strength between a keyword and a single keyword cluster [38,47,48], as in Eq. (3):

  $$\mathrm{based}(w, g) = \sum_{s \in D} |w|_s \, |g - w|_s \tag{3}$$

  where $w$ refers to a retained keyword in the preprocessed dataset $D$, $g$ denotes the cluster to which the keyword belongs, and $D$ represents the dataset obtained after preprocessing. The co-occurrence strength is calculated based on sentences $s$, the fundamental semantic units, and the sentences are typically treated as sets of keywords that form the basis for defining co-occurrence relationships. In addition, $|w|_s$ indicates the frequency of keyword $w$ appearing in sentence $s$, whereas $|g - w|_s$ represents the total number of occurrences of all keywords in cluster $g$, excluding $w$, in the same sentence. This value is zero if no other keywords from the cluster appear in the sentence.
- Calculating the co-occurrence potential of all keywords in cluster: In a keyword network analysis, the association between a keyword and the keyword cluster depends on the degree of co-occurrence and the contextual interactions between the cluster and other keywords. The KeyGraph algorithm provides a standardized metric for evaluating such associations by defining a cluster-level semantic quantification measure called $\mathrm{neighbors}(g)$, estimating the potential of a specific keyword cluster $g$ to interact semantically with other keywords throughout the dataset [38,47,48], calculated in Eq. (4):

  $$\mathrm{neighbors}(g) = \sum_{s \in D} \sum_{w \in s} |w|_s \, |g - w|_s \tag{4}$$

  where $s$ represents each sentence in dataset $D$, which is regarded as a set of co-occurring keywords and is the basis for defining co-occurrence relationships between keywords. Because each sentence $s$ is treated as the set of its keywords, $w \in s$ indicates any keyword in that set, $|w|_s$ refers to the frequency of keyword $w$ appearing in sentence $s$, and $|g - w|_s$ denotes the total frequency of all keywords in cluster $g$ (excluding keyword $w$) appearing in the same sentence $s$. This value is zero if none of the keywords in the cluster appear in the sentence.
- Evaluation of the importance of the potential of keywords across clusters: This study adopts the keyness calculation formula proposed in the KeyGraph algorithm to evaluate the connective role played by keyword $w$ in the overall keyword network graph and to determine whether it possesses the semantic potential to bridge clusters [38,47,48], as computed in Eq. (5):

  $$\mathrm{key}(w) = 1 - \prod_{g \in G} \left(1 - \frac{\mathrm{based}(w, g)}{\mathrm{neighbors}(g)}\right) \tag{5}$$

  where $\mathrm{key}(w)$ represents the importance score of keyword $w$, ranging between 0 and 1, and $G$ denotes the set of all keyword clusters in the keyword network graph, where each $g$ refers to an individual keyword cluster. The expression $g \in G$ indicates that every cluster in set $G$ is evaluated individually to compute the semantic relevance of keyword $w$ to each cluster. Moreover, $\mathrm{based}(w, g)$ refers to the co-occurrence strength between keyword $w$ and cluster $g$, whereas $\mathrm{neighbors}(g)$ represents the total co-occurrence strength of cluster $g$.
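The four measures referenced as Eqs. (1), (3), (4), and (5) can be illustrated with a short Python sketch. It follows the standard KeyGraph definitions (pairwise minimum counts, keyword-to-cluster products, a cluster-level normalizer, and a keyness score of one minus a product over clusters). The function names `assoc`, `based`, `neighbors`, and `key`, the helper `cluster_freq_excluding`, and the toy sentences are assumptions made for illustration and are not taken from the study's code or data.

```python
from collections import Counter
from math import prod


def count_terms(sentences):
    """Per-sentence keyword frequencies; sentences are lists of preprocessed tokens."""
    return [Counter(s) for s in sentences]


def assoc(wi, wj, counts):
    """Eq. (1): sum over sentences of the smaller of the two keyword frequencies."""
    return sum(min(c[wi], c[wj]) for c in counts)


def cluster_freq_excluding(w, g, c):
    """Frequency in one sentence of all keywords in cluster g other than w."""
    return sum(c[x] for x in g if x != w)


def based(w, g, counts):
    """Eq. (3): co-occurrence strength between keyword w and cluster g."""
    return sum(c[w] * cluster_freq_excluding(w, g, c) for c in counts)


def neighbors(g, counts):
    """Eq. (4): the same product summed over every keyword of every sentence."""
    return sum(c[w] * cluster_freq_excluding(w, g, c) for c in counts for w in c)


def key(w, clusters, counts):
    """Eq. (5): 1 minus the product over clusters of (1 - based / neighbors)."""
    factors = []
    for g in clusters:
        n = neighbors(g, counts)
        # A cluster with no co-occurrences contributes a neutral factor of 1.
        factors.append(1 - based(w, g, counts) / n if n else 1.0)
    return 1 - prod(factors)


# Toy data (hypothetical): "bias" appears with both clusters.
sentences = [
    ["bias", "data", "model"],
    ["bias", "privacy", "risk"],
    ["data", "model", "model"],
    ["privacy", "risk", "bias"],
]
clusters = [{"data", "model"}, {"privacy", "risk"}]
counts = count_terms(sentences)

print(assoc("data", "model", counts))   # 2
print(key("bias", clusters, counts))    # 0.625
print(key("data", clusters, counts))    # 0.375
```

In this toy corpus, "bias" scores higher than "data" because it co-occurs with both clusters, which is the bridging behavior the keyness score is meant to reward. The zero-denominator guard is a choice made for the sketch; the text does not say how an isolated cluster is handled.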
3. Materials and Methods
3.1. Data Collection
3.2. Data Preprocessing
3.3. Construction of the Keyword Co-Occurrence Network
3.3.1. Chance Discovery in AI Ethics Using KeyGraph
- Keyword frequency and co-occurrence calculation: First, the occurrence frequency of all words in the articles is calculated and sorted. The top-ranked high-frequency words are selected as keywords, representing the core foundational concepts of the articles. Using paragraphs or sentences as the calculation units, the co-occurrence relationships between all keywords are computed and applied to establish connections.
- Node role classification and keyword clustering: Based on the frequency of keyword occurrences and their structural positions in the co-occurrence network, nodes are classified into three categories, which lays the foundation for chance discovery.
  - High-frequency keywords: Keywords with high occurrence frequency that are concentrated in specific topic clusters represent the primary concepts of the topics. In this study, these are consistently represented by high-frequency black nodes.
  - Chance keywords: These keywords (known as bridging words) have lower occurrence frequencies but are associated with multiple topic clusters. They typically indicate emerging concepts or interdisciplinary issues and are valuable for discovering latent topics. In this study, they are represented by red nodes.
  - General terms: Keywords lacking structural significance are excluded from the visualization network.
- Keyword co-occurrence network construction and thematic cluster identification: A keyword association graph is constructed with keywords as nodes and the co-occurrence strength as weighted edges. This method aggregates high-frequency terms and forms thematic clusters.
- Keyword network visualization: The nodes and links are visualized using tools (e.g., Polaris), which map co-occurrence relationships between keywords to construct their association network graphs. By adjusting parameters (e.g., frequency thresholds, co-occurrence strength, and the number of nodes), different levels of keyword structures are explored to enhance the understanding of potential keyword clusters and association pathways.
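The construction steps above (frequency ranking, weighted co-occurrence edges, cluster formation, and chance-node candidates) can be sketched in a few lines of Python. This is an illustrative simplification under stated assumptions, not the Polaris or KeyGraph implementation: clusters are taken as connected components of the black-node graph, and a chance candidate is any non-black word that co-occurs with at least two clusters. All function names, the `min_strength` threshold, and the toy sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_keywords(sentences, n_black):
    """Rank words by corpus frequency and keep the n_black most frequent (black nodes)."""
    freq = Counter(w for s in sentences for w in s)
    return [w for w, _ in freq.most_common(n_black)]


def weighted_edges(sentences, nodes, min_strength=1):
    """Edges between black nodes, weighted by summed per-sentence minimum counts."""
    counts = [Counter(s) for s in sentences]
    edges = {}
    for a, b in combinations(sorted(nodes), 2):
        strength = sum(min(c[a], c[b]) for c in counts)
        if strength >= min_strength:
            edges[(a, b)] = strength
    return edges


def clusters_from_edges(nodes, edges):
    """Thematic clusters as connected components (an assumed, simple clustering rule)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


def chance_candidates(sentences, black, clusters):
    """Non-black words that share a sentence with at least two different clusters."""
    touched = {}
    for s in sentences:
        present = set(s)
        hit = {i for i, g in enumerate(clusters) if present & g}
        for w in present - set(black):
            touched.setdefault(w, set()).update(hit)
    return sorted(w for w, hit in touched.items() if len(hit) >= 2)


# Toy corpus (hypothetical): two themes bridged by the rarer word "bias".
sentences = [
    ["data", "model", "bias"],
    ["data", "model"],
    ["data", "model", "noise"],
    ["privacy", "risk", "bias"],
    ["privacy", "risk"],
    ["privacy", "risk"],
]
black = top_keywords(sentences, n_black=4)
edges = weighted_edges(sentences, black)
clusters = clusters_from_edges(black, edges)
red = chance_candidates(sentences, black, clusters)
print(edges)
print(red)
```

Here "bias" is flagged as a red-node candidate because it appears with both themes, whereas "noise" touches only one cluster and would be treated as a general term. Raising `min_strength` or lowering `n_black` corresponds to the parameter adjustments described for the visualization step.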
3.3.2. Analysis of Keyword Network Node Density and Topic Detection Accuracy
3.4. Selection of High-Frequency Keyword Clusters
3.5. Employing ChatGPT for Topic Detection
3.5.1. Limitations of Previous Methods
- Topic cluster identification and core concept summarization: KeyGraph identifies high-frequency keywords in articles and designates them as high-frequency nodes (i.e., black nodes) in the keyword network structure. Based on the co-occurrence relationships between these keywords, tightly connected clusters naturally form, reflecting the primary themes or subdomains in articles. Researchers can summarize representative thematic labels based on the characteristics and co-occurrence patterns of keywords in each cluster, producing an initial thematic summary and classification of the core article content.
- Chance keyword identification and pairwise semantic relationship mining: The uniqueness of KeyGraph lies in its ability to identify chance keywords that, despite their low frequency, connect multiple thematic clusters. Although these keywords appear infrequently, they serve as bridging nodes linking thematic clusters in the keyword network. Researchers conduct in-depth analyses of these chance keywords by tracing their contextual usage back to the original articles, manually interpreting their semantic roles and how they connect with multiple thematic clusters. This process facilitates identifying emerging topics, interdisciplinary integration points, or potential trends.
- Topic summarization heavily relies on manual interpretation, resulting in subjectivity and inconsistency: Although traditional keyword network graphs can visually present co-occurrence relationships between high-frequency keywords, their semantic connections often lack systematic explanatory mechanisms, typically relying on researchers’ expertise and experience for semantic interpretation and topic detection. This process is time-consuming, labor-intensive, and prone to inconsistencies due to variations in interpreters’ knowledge, affecting the objectivity of topic summarization. These problems become pronounced when analyzing multiple articles or conducting comparative analyses over time.
- Limited ability to identify low-frequency, high-value keywords, making latent topic detection difficult: Traditional text mining methods using statistical frequency focus on topic clusters formed by high-frequency keywords, often overlooking low-frequency keywords and chance nodes that play bridging or transitional roles in the keyword structure. These low-frequency keywords often represent emerging concepts, topic intersections, or contextual shifts, holding significant value for uncovering latent research topics and policy chance information. However, traditional methods struggle to identify and interpret their semantic roles systematically, limiting the efficiency and usefulness of topic exploration.
- Difficulty tracking dynamic contexts hinders automating topic-evolution pattern analysis: When managing cross-temporal texts, such as AI ethics articles from 2022 to 2024, traditional keyword network analysis often requires a manual comparison of keyword structural changes at various time points and cannot effectively or automatically track how topic keywords undergo semantic shifts or experience topic merging and splitting as the context evolves. This limitation hinders researchers’ understanding and forecasting of topic evolution trajectories, resulting in analyses without the capacity to present temporal and dynamic characteristics.
- Visualization maps are challenging to convert into structured data for inference: Although keyword network graphs offer a high degree of visual intuitiveness and help reveal thematic contexts and lexical and relational structures in texts, their results are often presented as images. When the number of keyword nodes in topic clusters is high, the clarity and readability of these visuals significantly decrease, leading to blurred outcomes or difficulty in interpretation during advanced analyses (e.g., topic classification, semantic comparison, or cross-validation).
3.5.2. Technical Background: Semantic Comprehension and Topic Extraction in ChatGPT
- Comprehension of keyword network structures and semantic interpretation: ChatGPT tokenizes the input text, including the original AI ethics articles and translated descriptions of the KeyGraph keyword network structure, and processes it via its multilayer transformer model for deep syntactic and semantic analyses. The built-in attention mechanism in ChatGPT accurately captures complex relationships between tokens and their contextual meaning, constructing a comprehensive, detailed semantic representation. This approach enables the model to understand the meaning of individual tokens and their positions and roles in the keyword network.
- Topic identification: The model identifies frequently recurring keywords and their semantic relationships in the text, grouping them into coherent thematic clusters. Notably, ChatGPT applies its strong contextual reasoning to generate semantically complete and representative thematic descriptions, facilitating the discovery of core concepts in the network structure.
- Semantic interpretation and text summarization: ChatGPT extracts critical insight from text based on semantic logic and generates contextually coherent and concise summaries. Researchers can control the content and length of these summaries using precise prompt engineering (e.g., restricting the summary to the imported text) to meet specific analytical requirements. This control considerably enhances the efficiency of extracting insight from complex network graphs [66,67].
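To make the idea of passing a "translated description" of the keyword network to a language model concrete, the sketch below turns weighted edges into plain text and wraps them in a prompt that restricts the summary to the supplied material. It only builds strings and makes no API call. The function names, the edge format, the prompt wording, and the `max_words` limit are assumptions for illustration; they are not the prompts used in the study.

```python
def describe_network(edges, red_nodes):
    """Render weighted co-occurrence edges as lines of text, strongest first."""
    lines = []
    for (a, b), weight in sorted(edges.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"{a} -- {b} (co-occurrence strength {weight})")
    if red_nodes:
        lines.append("Chance (red) nodes: " + ", ".join(sorted(red_nodes)))
    return "\n".join(lines)


def build_prompt(network_text, article_text, max_words=150):
    """Assemble a topic-detection prompt restricted to the imported text (hypothetical wording)."""
    return (
        "You are given a keyword co-occurrence network and the source text.\n"
        "Use only the supplied text; do not add outside information.\n"
        f"Summarize the topic of the network in at most {max_words} words.\n\n"
        f"Keyword network:\n{network_text}\n\n"
        f"Source text:\n{article_text}\n"
    )


# Hypothetical network fragment and placeholder article text.
edges = {("data", "model"): 3, ("privacy", "risk"): 3, ("bias", "data"): 1}
network_text = describe_network(edges, red_nodes={"bias"})
prompt = build_prompt(network_text, article_text="<article text goes here>")
print(prompt)
```

Keeping the network description and the restriction in separate, fixed parts of the prompt makes it easier to rerun the same instruction after the node parameters change, so that differences in the summaries reflect the network rather than the wording.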
3.5.3. Method: Integrating KeyGraph and ChatGPT for Topic Detection
3.6. R1 Semantic Diffusion Path
4. Result Analysis
4.1. Yearly Analysis of Topic Evolution and Keyword Structures (2022–2024)
- Cluster A-1: The semantic cluster around the red node automaker focuses on the implementation of autonomous driving technology and the ethical challenges faced by AI in automotive applications. This red node extends through its connection to self to include the keywords based, car, driver, vehicles, and autonomous, outlining application scenarios involving HCI. The keywords driver, task, and autonomous intertwine, reflecting issues of responsibility allocation and control authority. In situations where automated and manual control are combined, the attribution of responsibility for accidents (whether borne by the driver or system) requires further clarification via regulatory frameworks and technical design. Furthermore, task transparency and the interface design are also critical. For example, whether drivers can quickly grasp the operational status and decision rationale of the system directly affects their safety judgments and behavioral responses. Establishing trust and risk perception cannot be overlooked. An insufficient HCI design and information transmission may cause driver overtrust or erroneous reliance, increasing safety risks. Overall, the keyword structure emphasizes several topics, including the behavior prediction of autonomous technologies, system safety, and user responsibility attribution.
- Cluster A-2: The red node behavior forms a keyword network related to AI risk prediction, system deployment, and ethical practices. Through its connections to consequences, the network gradually expands to include the keywords risk, privacy, discrimination, design, and capabilities, reflecting the multifaceted and uncertain outcomes of AI system behavior. Notably, discrimination is intertwined with risk, indicating that failure to address data sources and algorithmic bias properly in real-world applications may reinforce existing societal inequalities and trigger ethical crises of systemic discrimination. The association between design and foundational highlights the need to judiciously consider fundamental principles and ethical values during the initial stages of AI development. Overall, this cluster maps the potential externalities that may arise during AI deployment, emphasizing that developers must assume the corresponding responsibility for the potential social and ethical consequences of system behavior.
- Cluster A-3: The keyword network extended from the red node statistical focuses on the computational logic and algorithmic architecture of AI systems. The strong co-occurrence relationships with the keywords computational, learning, machine, critical, and implementation reveal core problems including statistical biases, risk governance, and explainability in current AI technology. The direct and indirect connections between the keywords issues, concerns, ethical, implementation, and critical reflect that AI ethics is not merely a conceptual discussion but is involved in the development, design, and deployment stages of AI systems. Furthermore, the connections emphasize that the realization of AI ethics must integrate value judgments and ethical norms as essential foundations for technical practice. This cluster demonstrates the role of ethical issues in institutional frameworks, industrial applications, and technical design, indicating that ethical practice has become a critical factor that cannot be overlooked in the development of responsible technology.
- Cluster A-4: The keyword network constructed around the red node dignity focuses on human rights protection and ethical principles. This red node displays high co-occurrence frequencies with the keywords responsibility, trust, justice, transparency, principles, and ethics, reflecting that current AI technology developers should assume the corresponding moral responsibilities to avoid problems (e.g., bias, discrimination, and structural inequality). Ensuring the transparency of algorithms and data processing allows users to understand the decision-making logic and behavioral patterns of AI systems, safeguarding human dignity and fundamental rights. The connections between justice, guidelines, and harm highlight the necessity of designing AI ethical frameworks and indicate that the lack of appropriate ethical judgment and operational guidance may harm individuals or society, causing discrimination or unfairness. Overall, this cluster focuses on protecting human rights and strengthening ethical norms and institutional justice as core principles, constructing an AI governance mechanism characterized by social legitimacy and long-term trust.
- Combined cluster of A-2 and A-3: The keyword network reveals a significant intersection and complementary structure, highlighting the dual technological and societal dimensions of AI ethics issues. Through red chance nodes, including bias, risk, understand, issues, and implementation, cross-cluster bridging nodes emerge, uncovering a risk propagation chain that spans from statistical logic to behavioral consequences. Bias often originates from flaws in algorithm design and training data, and further permeates the societal domain after system deployment, leading to concrete and potentially escalating ethical consequences. This analysis indicates that AI ethics challenges must be examined from an integrated, multilayered perspective spanning technical construction and societal influence. Accordingly, ethical practice in AI should focus on identifying and mitigating potential ethical risks during the early stages of technological development (e.g., data preprocessing and model training). A comprehensive ethics governance framework encompassing bias detection, transparency enhancement, and regulatory mechanisms must be promoted to ensure responsible and sustainable AI applications.
- Combined cluster of A-2 and A-4: The analysis reveals that AI behavior must be guided and constrained by ethical principles to prevent harm to human dignity and privacy, enabling the deployment of trustworthy and responsible AI. The behavioral logic of AI systems should be grounded in human rights protection and ethical values, with corresponding regulations (e.g., bias detection and privacy protection standards) introduced during the early design stages to ensure legitimacy and credibility during deployment. The consequences of AI behavior (e.g., bias and privacy infringement) must be directed by ethical principles and implemented through technical practice. This interactive relationship emphasizes that ethics should not be treated as an external constraint to technology but as an internal structure embedded throughout the life cycle of AI design, development, and application, advancing responsible and human-centered AI development. This perspective aligns with the discourse in the 2022 AI ethics articles, including The 2022 AI Index: AI’s Ethical Growing Pains and AI Ethics and AI Law: Grappling with Overlapping and Conflicting Ethical Factors Within AI, and identifies the integration of bias management and privacy protection into a unified ethical framework as an emerging research chance.
- Combined cluster of A-2, A-3, and A-4: The semantic co-construction of these three clusters reveals that AI ethics challenges cannot be viewed as problems confined to a single level. The behavioral risks of AI systems (e.g., technical bias, discriminatory outcomes, and privacy infringement) are closely linked to their underlying statistical construction logic, indicating that once deployed, AI may produce irreversible and substantive ethical consequences. If such consequences are not addressed through institutionalized ethical safeguards that ensure prevention and accountability, AI technology risks losing social trust and legitimacy. Moreover, ethical AI practice must adopt a cross-level integration approach to address these challenges, spanning from model training and system deployment to institutional regulation, constructing a full-process ethical governance framework based on the triad of technology, behavior, and values. This structure is critical for preserving human dignity and developing trustworthy and responsible AI.
- Cluster B-1: With trained as the primary node, the network extends to the keyword data and further expands to the keywords models, privacy, and customer, reflecting early-stage concerns in AI development regarding the legitimacy of data sources and the protection of user information. The node models branches out to include intelligence, ChatGPT, generative, and bias, indicating attention to the algorithmic biases embedded in generative AI models (e.g., ChatGPT). The bidirectional links between privacy, customer, and system highlight ethical considerations regarding user privacy and data security in AI application contexts. The connections between system and the keywords customer, create, and generative reveal the interplay between system design and generative technology in practice, raising concerns about technological transparency and ethical accountability. The keyword artificial is linked to intelligence, lead, and ChatGPT, forming a semantic structure centered on AI model generation and leadership in application. This cluster reveals deep ethical concerns related to the legitimacy of data usage, model bias, privacy protection, and user participation during the training and deployment phases of AI systems.
- Cluster B-2: The primary node develop connects with systems and human, revealing the bidirectional relationship of HCI in technological construction. Systems further expands to make and decisions, reflecting the role AI systems play in decision-making processes. Decisions links to making, humans, and believe, forming a cluster centered on how AI decision-making influences human beliefs. Technology co-occurs with the terms ethics and concerns, indicating heightened attention to ethical regulations and institutional policies during AI development. Through the node concerns, the keyword ethics connects to potential, business, and responsibility, outlining the importance businesses place on ethical risks and responsibilities when applying AI technology. Overall, this semantic group illustrates the institutional and ethical challenges faced during AI development, emphasizing the importance of bias governance, technical regulation, and establishing user trust.
- Cluster B-3: With misuse as the red node, the initial connection to government further extends to industry and society, forming a semantic cluster focusing on institutional roles. The node industry links to insurance, which connects to using, policy, and responsible, highlighting an ethical discourse focused on risk transfer mechanisms and institutional responsibility. The keyword policy is a central node connecting responsible, insurance, and using, indicating that policy should address AI misuse risks via clear responsibility allocation, technical application guidelines, and industry-level risk management, especially concerning privacy protection and social impact. The keywords ethical, ensure, responsible, and using are closely interlinked, underscoring that ethical principles must be embedded in technical usage and institutional regulation. These principles, when supported by accountability structures and protective measures, can mitigate risks of misuse, particularly in areas related to data privacy and societal consequences. The connection between impact and society further indicates the potential and far-reaching effects of technological misuse on social structures. Overall, this semantic cluster illustrates that AI ethical principles should be integrated into institutional design and technological application processes and that clear accountability and regulatory mechanisms are critical for reducing the potential negative influences of AI misuse on societal systems.
- Combined cluster of B-1 and B-2: These two clusters, centered on the red node machine, focus on model training and system development, respectively, revealing, through the lens of practical application, the crucial ethical challenges spanning the AI life cycle, from data training and system development to deployment. Both clusters emphasize data ethics (e.g., privacy and bias) and the governance of potential negative influences of AI systems on society and humanity, including decision-making influence and responsibility attribution. Together, these semantic clusters reveal that the core of AI ethics lies in the technology itself and, more critically, in the processes of interaction between AI, humans, and society, particularly regarding risk management and the realization of accountability. The clusters collectively emphasize that achieving a vision of AI development that balances innovation and responsibility requires the parallel construction of responsible governance mechanisms throughout the innovation process.
- Combined cluster of B-2 and B-3: In the KeyGraph keyword network, Clusters B-2 and B-3 are centered on the keywords develop and misuse, respectively, illustrating an ethical link from AI technology development to its potential misuse. The keyword structures revealed by these two clusters reflect that AI ethics challenges originate from individual acts of technical development and extend across broader societal institutions and governance dimensions. The ethical risks posed by AI technology can be effectively addressed only by constructing an integrated accountability framework encompassing development, deployment, and misuse prevention, ensuring that advancement contributes to positive and sustainable social value.
- Combined cluster of B-1, B-2, and B-3: These three semantic clusters correspond to three stages of AI ethical risk (model training, system development, and actual misuse along with its social impact), forming a progressive chain from ethical considerations to governance responses. The keyword structures reveal a trajectory that begins with micro-level concerns, including data bias and generative misinformation, extends to challenges of decision-making and ethical design during the development process, and ends with misuse risks and governance responsibility at the societal level. This progression reflects that AI ethics issues are not isolated incidents but constitute a foreseeable and preventable chain of ethical risks. To address multilevel challenges (e.g., bias, manipulation, and misuse), an integrated ethical framework must be established that encompasses data governance, technical design, and misuse prevention, enabling the realization of an AI development vision guided by social values.
- Cluster C-1: With media as the red node, the network connects to social, which links to content, genAI, and used, revealing that generative AI technology has been widely integrated into social platforms and public communication spaces. The connection between social and ethical, further extending to risks and then deployment, challenges, technology, and responsible, indicates that societal concerns have shifted beyond technical applications to the ethical risks and responsibility attribution involved in deployment processes, especially regarding misinformation, information manipulation, and bias problems arising in social media environments. The keywords digital, technology, innovation, industry, and development converge at the nodes essential, become, and important, demonstrating that generative AI has become a core driving force behind contemporary digital innovation and industrial transformation, with its ethical challenges escalating into systemic problems. Overall, this semantic cluster highlights that AI ethics attention has moved toward ethical challenges triggered by the application of generative AI in social and media contexts, emphasizing the importance of responsible technological deployment in these settings.
- Cluster C-2: The red node security co-occurs with technologies and extends to data, models, and training, forming a semantic cluster. The keyword biases forms a triangular co-occurrence structure with these three terms, indicating that data sources and processing methods underpin AI system security, and that biases hidden within training data influence model behavior, representing the intersection of ethics and security. Transparency and privacy connect through technologies and further co-occur with regulatory, reflecting that AI ethics discourse has reached institutional dimensions and emphasizing the reliance on and necessity of regulatory mechanisms for system transparency and privacy protection. Via ensure and tools, the keyword decision links to generative and ChatGPT and is associated with businesses and trust, revealing the critical role of explainability and trust mechanisms in generative AI decision-making processes in corporate and societal applications. Overall, this keyword network reveals the interdisciplinary interconnection of AI ethics issues in 2024, providing a structured analytical perspective for technology development, policy regulation, and industry practice.
- Combined cluster of C-1 and C-2: The integration of these two clusters highlights the increasingly multilayered and interdisciplinary complexity of AI ethics issues in 2024. AI ethics themes are no longer confined to a single domain but require addressing systemic governance challenges while promoting AI development, especially that of generative AI, in the fields of digital content and media. These challenges include core concerns such as data privacy, algorithmic bias, social trust, lack of transparency, and regulatory compliance. The clusters emphasize that trustworthy and responsible AI governance depends on the collaborative operation of social and technical dimensions, with responsibility shared collectively by developers, businesses, policymakers, and civil society. Current AI ethical frameworks must be established under risk contexts characterized by the uncertain, the undiscovered, and the unknown, guiding AI development toward a more legitimate and sustainable future.
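The cluster readings above rest on two structural ideas: groups of frequently co-occurring keywords, and rarer keywords that link several groups (the red nodes). The following is a minimal Python sketch of those two ideas only. It is not the KeyGraph algorithm or the Polaris tool used in this study; the function names, the frequency and link thresholds, and the toy baskets are all assumptions made for illustration, and the baskets merely reuse a few keywords named in Cluster C-2 with invented counts.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(baskets):
    """Count term frequency and pairwise co-occurrence over baskets of terms."""
    freq, pairs = Counter(), Counter()
    for basket in baskets:
        terms = sorted(set(basket))
        freq.update(terms)
        pairs.update(combinations(terms, 2))  # keys are sorted (a, b) tuples
    return freq, pairs


def find_clusters(nodes, pairs, min_count):
    """Group frequent terms into connected components over strong links."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for (a, b), count in pairs.items():
        if a in parent and b in parent and count >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


def find_bridges(freq, pairs, groups):
    """Return terms outside every cluster that co-occur with two or more clusters."""
    clustered = set().union(*groups) if groups else set()
    result = {}
    for term in freq:
        if term in clustered:
            continue
        touched = [
            i for i, members in enumerate(groups)
            if any(pairs[tuple(sorted((term, m)))] > 0 for m in members)
        ]
        if len(touched) >= 2:
            result[term] = touched
    return result


# Toy baskets (hypothetical, not the study's article data).
baskets = [
    ["data", "models", "training"],
    ["data", "models", "training"],
    ["transparency", "privacy", "regulatory"],
    ["transparency", "privacy", "regulatory"],
    ["security", "data", "privacy"],
]

freq, pairs = cooccurrence(baskets)
frequent = [t for t, c in freq.items() if c >= 2]   # assumed frequency threshold
groups = find_clusters(frequent, pairs, min_count=2)  # assumed link threshold
found = find_bridges(freq, pairs, groups)
print(groups)
print(found)
```

In this toy input, security appears only once yet touches both keyword groups, which is the kind of low-frequency linking term that the human analyst is then asked to interpret. The published KeyGraph method scores such terms differently, so this sketch should be read as an aid to intuition rather than a reproduction of the analysis.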
4.2. Integrative Analysis and Trend Summary
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| AI | Artificial intelligence |
| HCI | Human–computer interaction |
| LDA | Latent Dirichlet allocation |
| LLM | Large language model |
| SDGs | Sustainable Development Goals |
References
- Shetty, D.K.; Arjunan, R.V.; Cenitta, D.; Makkithaya, K.; Hegde, N.V.; Bhatta B, S.R.; Salu, S.; Aishwarya, T.R.; Bhat, P.; Pullela, P.K. Analyzing AI Regulation through Literature and Current Trends. J Open Innov: Technol Mark Complex 2025, 11, 100508. [Google Scholar] [CrossRef]
- Tallberg, J.; Lundgren, M.; Geith, J. AI Regulation in the European Union: Examining Non-State Actor Preferences. Bus Politics 2024, 26, 218–239. [Google Scholar] [CrossRef]
- Ong, J.C.L.; Chang, S.Y.; William, W.; Butte, A.J.; Shah, N.H.; Chew, L.S.T.; Liu, N.; Doshi-Velez, F.; Lu, W.; Savulescu, J.; Ting, D.S.W. Ethical and Regulatory Challenges of Large Language Models in Medicine. Lancet Digit Health 2024, 6, e428–e432. [Google Scholar] [CrossRef]
- Huang, C.; Zhang, Z.; Mao, B.; Yao, X. An Overview of Artificial Intelligence Ethics. IEEE Trans Artif Intell 2023, 4, 799–819. [Google Scholar] [CrossRef]
- Tabassum, A.; Elmahjub, E.; Padela, A.I.; Zwitter, A.; Qadir, J. Generative AI and the Metaverse: A Scoping Review of Ethical and Legal Challenges. IEEE Open J Comput Soc 2025, 6, 348–359. [Google Scholar] [CrossRef]
- Taeihagh, A. Governance of Generative AI. Policy Soc 2025, 44, 1–22. [Google Scholar] [CrossRef]
- Morley, J.; Elhalal, A.; Garcia, F.; Kinsey, L.; Mökander, J.; Floridi, L. Ethics as a Service: A Pragmatic Operationalisation of AI Ethics. Minds Mach 2021, 31, 239–256. [Google Scholar] [CrossRef] [PubMed]
- Mittelstadt, B.D. Principles alone cannot guarantee ethical AI. Nat Mach Intell 2019, 1, 501–507. [Google Scholar] [CrossRef]
- Cath, C.; Wachter, S.; Mittelstadt, B.; Taddeo, M.; Floridi, L. Artificial intelligence and the ‘Good Society’: The US, EU, and UK Approach. Sci Eng Ethics 2018, 24, 505–528. [Google Scholar]
- Ohsawa, Y.; McBurney, P. Chance Discovery; Springer-Verlag: Berlin/Heidelberg, Germany, 2003. [Google Scholar]
- Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet Allocation. J Mach Learn Res 2003, 3, 993–1022. [Google Scholar]
- Vayansky, I.; Kumar, S.A.P. A Review of Topic Modeling Methods. Inf Syst 2020, 94, 101582. [Google Scholar] [CrossRef]
- Barde, B.; Bainwad, A.M. An Overview of Topic Modeling Methods and Tools. In Proceedings of the 2017 International Conference on Intelligent Computing and Control Systems (ICICCS); IEEE: Piscataway, NJ, USA, 2017; pp. 745–750. [Google Scholar]
- Sayyadi, H.; Raschid, L. A Graph Analytical Approach for Topic Detection. ACM Trans Internet Technol 2013, 13, 1–23. [Google Scholar] [CrossRef]
- Hayashi, T.; Ohsawa, Y. Information Retrieval System and Knowledge Base on Diseases Using Variables and Contexts in the Texts. Procedia Comput Sci 2019, 159, 1662–1669. [Google Scholar] [CrossRef]
- Wang, J.; Lai, J.Y.; Lin, Y.H. Social Media Analytics for Mining Customer Complaints to Explore Product Opportunities. Comput Ind Eng 2023, 178, 109104. [Google Scholar] [CrossRef]
- Guler, N.; Kirshner, S.N.; Vidgen, R. A Literature Review of Artificial Intelligence Research in Business and Management Using Machine Learning and ChatGPT. Data Inf Manag 2024, 8, 100076. [Google Scholar] [CrossRef]
- Nissen, H.E. Using Double Helix Relationships to Understand and Change Information Systems. Informing Sci Int J Emerg Transdiscip J 2007, 10, 21–62. [Google Scholar]
- Fjeld, J.; Achten, N.; Hilligoss, H.; Nagy, A.; Srikumar, M. Principled Artificial Intelligence: Mapping Consensus in Ethical and Rights-Based Approaches to Principles for AI. Berkman Klein Center Res Publ 2020, 2020, 1. [Google Scholar] [CrossRef]
- de Fine Licht, K. Resolving Value Conflicts in Public AI Governance: A Procedural Justice Framework. Gov Inf Q 2025, 42, 102033. [Google Scholar] [CrossRef]
- Fruchter, R.; Ohsawa, Y.; Matsumura, N. Knowledge Reuse through Chance Discovery from an Enterprise Design-Build Enterprise Data Store. New Math Nat Comput 2005, 1, 393–406. [Google Scholar] [CrossRef]
- Buolamwini, J.; Gebru, T. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, New York, NY, USA, 23–24 February 2018; Friedler, S.A., Wilson, C., Eds.; PMLR: New York, NY, USA, 2018; Volume 81, pp. 77–91. [Google Scholar]
- Njiru, D.K.; Mugo, D.M.; Musyoka, F.M. Ethical considerations in AI-based user profiling for knowledge management: A critical review. Telemat Inform Rep 2025, 18, 100205. [Google Scholar] [CrossRef]
- Luomala, M.; Naarmala, J.; Tuomi, V. Technology-Assisted Literature Reviews with Technology of Artificial Intelligence: Ethical and Credibility Challenges. Procedia Comput Sci 2025, 256, 378–387. [Google Scholar] [CrossRef]
- Ohsawa, Y. Data Crystallization: A Project beyond Chance Discovery for Discovering Unobservable Events. In Proceedings of the 2005 IEEE International Conference on Granular Computing, Beijing, China, 25–27 July 2005; IEEE: Piscataway, NJ, USA, 2005; Volume 1, pp. 51–56. [Google Scholar]
- Holzinger, A. Human-Computer Interaction and Knowledge Discovery (HCI-KDD): What Is the Benefit of Bringing Those Two Fields to Work Together. In Availability, Reliability, and Security in Information Systems and HCI; Cuzzocrea, A., Kittl, C., Simos, D.E., Weippl, E., Xu, L., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; Volume 8127. [Google Scholar]
- Ohsawa, Y.; Fukuda, H. Chance discovery by stimulated groups of people: Application to understanding consumption of rare food. J Contingencies Crisis Manag 2002, 10, 129–138. [Google Scholar] [CrossRef]
- Ko, N.; Jeong, B.; Choi, S.; Yoon, J. Identifying Product Opportunities Using Social Media Mining: Application of Topic Modeling and Chance Discovery Theory. IEEE Access 2018, 6, 1680–1693. [Google Scholar] [CrossRef]
- Ohsawa, Y. Chance Discoveries for Making Decisions in Complex Real World. New Gener Comput 2002, 20, 143–163. [Google Scholar] [CrossRef]
- Ohsawa, Y.; Nishihara, Y. Innovators’ Marketplace: Using Games to Activate and Train Innovators; Springer: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
- Ho, T.B.; Nguyen, D.D. Chance Discovery and Learning Minority Classes. New Gener Comput 2003, 21, 149–161. [Google Scholar] [CrossRef]
- Ohsawa, Y.; Tsumoto, S. Chance Discoveries in Real World Decision Making: Data-Based Interaction of Human Intelligence and Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2006; Volume 30. [Google Scholar]
- Ohsawa, Y.; Nara, Y. Modeling the Process of Chance Discovery by Chance Discovery on Double Helix. In Proceedings of the AAAI Fall Symposium on Chance Discovery; AAAI Press: Arlington, VA, USA, 2002; pp. 33–40. [Google Scholar]
- Wang, H.; Ohsawa, Y.; Nishihara, Y. Innovation Support System for Creative Product Design Based On Chance Discovery. Expert Syst Appl 2012, 39, 4890–4897. [Google Scholar] [CrossRef]
- Wang, H.; Ohsawa, Y. Idea discovery: A Scenario-Based Systematic Approach for Decision Making in Market Innovation. Expert Syst Appl 2013, 40, 429–438. [Google Scholar] [CrossRef]
- Yang, S.; Sun, Q.; Zhou, H.; Gong, Z.; Zhou, Y.; Huang, J. A Topic Detection Method Based on KeyGraph and Community Partition. In Proceedings of the 2018 International Conference on Computing and Artificial Intelligence (ICCAI 2018), Chengdu, China, 12–14 May 2018; Association for Computing Machinery: New York, NY, USA, 2018; pp. 30–34. [Google Scholar]
- Ohsawa, Y. KeyGraph as Risk Explorer in Earthquake–Sequence. J Contingencies Crisis Manag 2002, 10, 119–128. [Google Scholar] [CrossRef]
- Ohsawa, Y.; Benson, N.E.; Yachida, M. KeyGraph: automatic indexing by co-occurrence graph based on building construction metaphor. In Proceedings IEEE International Forum on Research and Technology Advances in Digital Libraries, 1998; pp. 12–18.
- Wanchia, K.; Yufei, J.; Hsinchun, Y. Discovering Emerging Financial Technological Chances of Investment Management in China via Patent Data. Int J Bus Econ Aff 2020, 5, 1–8. [Google Scholar] [CrossRef]
- Geum, Y.; Kim, M. How to Identify Promising Chances for Technological Innovation: Keygraph-Based Patent Analysis. Adv Eng Inform 2020, 46, 101155. [Google Scholar] [CrossRef]
- Sakakibara, T.; Ohsawa, Y. Gradual-Increase Extraction of Target Baskets as Preprocess for Visualizing Simplified Scenario Maps by KeyGraph. Soft Comput 2007, 11, 783–790. [Google Scholar] [CrossRef]
- Kim, K.-J.; Jung, M.-C.; Cho, S.-B. KeyGraph-Based Chance Discovery for Mobile Contents Management System. Int J Knowl-Based Intell Eng Syst 2007, 11, 313–320. [Google Scholar] [CrossRef]
- Perera, K.; Karunarathne, D. KeyGraph and WordNet Hypernyms for Topic Detection. In Proceedings of the 2015 12th International Joint Conference on Computer Science and Software Engineering (JCSSE), Chonburi, Thailand, 22–24 July 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 303–308. [Google Scholar]
- Manning, C.D.; Raghavan, P.; Schütze, H. Introduction to Information Retrieval; Cambridge University Press: Cambridge, UK, 2008.
- Liu, B. Sentiment Analysis and Opinion Mining. Synth Lect Hum Lang Technol 2012, 5, 1–167. [Google Scholar]
- Beliga, S.; Meštrović, A.; Martinčić-Ipšić, S. An overview of graph-based keyword extraction methods and approaches. J Inf Organ Sci 2015, 39, 1–20. [Google Scholar]
- Pan, R.C.; Hong, C.F.; Huang, N.; Hsu, F.C.; Wang, L.H.; Chi, T.H. One-Scan KeyGraph Implementation. The 3rd Conference on Evolutionary Computation Applications & 2005 International Workshop on Chance Discovery, Taiwan, 2005.
- Nezu, Y.; Miura, Y. Extracting Keywords on SNS by Successive KeyGraph. In 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), Baltimore, MD, USA, 9–11 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 997–1003. [Google Scholar]
- Jobin, A.; Ienca, M.; Vayena, E. The Global Landscape of AI Ethics Guidelines. Nat Mach Intell 2019, 1, 389–399. [Google Scholar] [CrossRef]
- Mittelstadt, B.; Allo, P.; Taddeo, M.; Wachter, S.; Floridi, L. The Ethics of Algorithms: Mapping the Debate. Big Data & Society 2016, 3, 1–21. [Google Scholar] [CrossRef]
- Dignum, V. Responsible Artificial Intelligence: How to Develop and Use AI in a Responsible Way. In Artificial Intelligence: Foundations, Theory, and Algorithms. Springer: Cham, Switzerland, 2019. [Google Scholar]
- Okazaki, N.; Ohsawa, Y. Polaris: An Integrated Data Miner for Chance Discovery. In Proceedings of The Third International Workshop on Chance Discovery and Its Management, Crete, Greece, 2003.
- Sayyadi, H.; Hurst, M.; Maykov, A. Event Detection and Tracking in Social Streams. Proc Int AAAI Conf Web Soc Media 2009, 3, 311–314. [Google Scholar] [CrossRef]
- Jo, Y.; Lagoze, C.; Giles, C.L. Detecting Research Topics via the Correlation between Graphs and Texts. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’07), San Jose, CA, USA, 12–15 August 2007; ACM: New York, NY, USA, 2007; pp. 370–379. [Google Scholar]
- Lozano, S.; Calzada-Infante, L.; Adenso-Díaz, B.; García, S. Complex Network Analysis of Keywords Co-Occurrence in the Recent Efficiency Analysis Literature. Scientometrics 2019, 120, 609–629. [Google Scholar] [CrossRef]
- Zhou, Z.; Zou, X.; Lv, X.; Hu, J. Research on Weighted Complex Network Based Keywords Extraction. In Proceedings of the 7th International Conference on Advanced Data Mining and Applications (ADMA 2013), Wuhan, China, 14–16 May 2013; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2013; Volume 8229, pp. 442–452. [Google Scholar]
- Grimmer, J.; Stewart, B.M. Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Polit Anal 2013, 21, 267–297. [Google Scholar] [CrossRef]
- Firoozeh, N.; Nazarenko, A.; Alizon, F.; Daille, B. Keyword extraction: Issues and methods. Nat Lang Eng 2020, 26, 259–291. [Google Scholar] [CrossRef]
- de Graaf, R.; van der Vossen, R. Bits versus brains in content analysis. Comparing the advantages and disadvantages of manual and automated methods for content analysis. Commun 2013, 38, 433–443. [Google Scholar] [CrossRef]
- Lewis, S.C.; Zamith, R.; Hermida, A. Content analysis in an era of big data: A hybrid approach to computational and manual methods. J Broadcast Electron Media 2013, 57, 34–52. [Google Scholar] [CrossRef]
- Feng, Y. Semantic Textual Similarity Analysis of Clinical Text in the Era of LLM. In 2024 IEEE Conference on Artificial Intelligence (CAI), Singapore, 22–24 May 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1284–1289. [Google Scholar]
- Papageorgiou, E.; Chronis, C.; Varlamis, I.; Himeur, Y. A Survey on the Use of Large Language Models (LLMs) in Fake News. Future Internet 2024, 16, 298. [Google Scholar] [CrossRef]
- Maktabdar Oghaz, M.; Babu Saheer, L.; Dhame, K.; Singaram, G. Detection and classification of ChatGPT-generated content using deep transformer models. Front Artif Intell 2025, 8, 1458707. [Google Scholar] [CrossRef]
- Wu, J.; Yang, S.; Zhan, R.; Yuan, Y.; Chao, L.S.; Wong, D.F. A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions. Comput Linguist 2025, 51, 275–338. [Google Scholar] [CrossRef]
- Domínguez-Diaz, A.; Goyanes, M.; de-Marcos, L. Automating Content Analysis of Scientific Abstracts Using ChatGPT: A Methodological Protocol and Use Case. MethodsX 2025, 15, 103431. [Google Scholar] [CrossRef] [PubMed]
- Ma, X.; Zhang, Y.; Ding, K.; Yang, J.; Wu, J.; Fan, H. On Fake News Detection with LLM Enhanced Semantics Mining. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024); Al-Onaizan, Y.; Bansal, M.; Chen, Y.-N., Eds.; Association for Computational Linguistics: Miami, FL, USA, 2024; pp. 508–521.
- Yang, X.; Li, Y.; Zhang, X.; Chen, H.; Cheng, W. Exploring the Limits of ChatGPT for Query or Aspect-Based Text Summarization. CoRR 2023, abs/2302.08081.
- Ohsawa, Y. Context design for chance discovery: Words, agents, and interactions. New Gener Comput 2006, 20, 143–164. [Google Scholar] [CrossRef]
- Bang, Y.; Cahyawijaya, S.; Lee, N.; Dai, W.; Su, D.; Wilie, B.; Lovenia, H.; Ji, Z.; Yu, T.; Chung, W.; Do, Q.V.; Xu, Y.; Fung, P. A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP–AACL 2023), Park, J.C.; Arase, Y.; Hu, B.; Lu, W.; Wijaya, D.; Purwarianti, A.; Krisnadhi, A.A., Eds.; Association for Computational Linguistics: Nusa Dua, Bali, Indonesia, 2023; pp. 675–718.







| No. | Original Title | Data Source | Publication Date |
| 1 | The 2022 AI Index: Industrialization of AI and Mounting Ethical Concerns | Stanford HAI | 2022/03 |
| 2 | AI Ethics And AI Law Grappling With Overlapping And Conflicting Ethical Factors Within AI | Forbes | 2022/11 |
| 3 | The 2022 AI Index: AI’s Ethical Growing Pains | Stanford HAI | 2022/03 |
| 4 | Prioritising AI & Ethics: A perspective on change | Deloitte | 2022/03 |
| 5 | Top Nine Ethical Issues In Artificial Intelligence | Forbes | 2022/10 |
| 6 | AI Ethics And AI Law Are Moving Toward Standards That Explicitly Identify And Manage AI Biases | Forbes | 2022/10 |
| 7 | Evaluating Ethical Challenges in AI and ML | ISACA Journal | 2022/07 |
| 8 | We’re failing at the ethics of AI. Here’s how we make real impact | World Economic Forum, WEF | 2022/01 |
| No. | Original Title | Data Source | Publication Date |
| 1 | The Ethics Of AI: Navigating Bias, Manipulation And Beyond | Forbes | 2023/06 |
| 2 | The Ethics Of AI: Balancing Innovation And | Forbes | 2023/12 |
| 3 | Responsibility | Forbes | 2023/07 |
| 4 | AI Ethics In The Age Of ChatGPT—What Businesses Need To Know | Forbes | 2023/05 |
| 5 | 96% Of People Consider Ethical And Responsible AI To Be Important | Forbes | 2023/03 |
| 6 | How Businesses Can Ethically Embrace Artificial Intelligence | CNN | 2023/12 |
| 7 | Experts call for more diversity to combat bias in artificial intelligence | Georgia Tech | 2023/08 |
| 8 | 5 AI Ethics Concerns the Experts Are Debating | Bloomberg | 2023/06 |
| No. | Original Title | Data Source | Publication Date |
| 1 | AI’s Trust Problem | Harvard Business Review | 2024/05 |
| 2 | ‘Uncovered, unknown, and uncertain’: Guiding ethics in the age of AI | Yale News | 2024/02 |
| 3 | AI Regulation Is Evolving Globally and Businesses Need to Keep Up | Bloomberg Law | 2024/12 |
| 4 | AI is not ready for primetime | CNN Business | 2024/03 |
| 5 | With AI warning, Nobel winner joins ranks of laureates who’ve cautioned about the risks of their own work | CNN | 2024/12 |
| 6 | Navigating The Ethics Of AI: Is It Fair And Responsible Enough To Use? | Forbes | 2024/11 |
| 7 | AI And Ethics: A Collective Responsibility For A Safer Future | Forbes | 2024/10 |
| 8 | AI Started as a Dream to Save Humanity. Then, Big Tech Took Over. | Bloomberg | 2024/09 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).