Preprint
Article

This version is not peer-reviewed.

AI-Augmented Forensic Intelligence: Dual-Stream Deep Learning and Predictive Analytics for Integrated Toxicological Diagnosis and Criminal Profiling

Submitted: 23 December 2025

Posted: 25 December 2025


Abstract
Background: Forensic investigations increasingly involve complex interactions between novel psychoactive substances (NPS), multidimensional toxicological evidence, and offender behavior, yet current workflows remain siloed between laboratory toxicology and criminal profiling. Rapid evolution of “designer drugs” also challenges conventional spectral libraries and leads to delayed or inconclusive diagnoses in high-stakes criminal cases.
Objective: This conceptual methodology paper proposes an integrated, dual-stream artificial intelligence (AI) framework that fuses computational toxicology and behavioral predictive modeling to support complex toxicological diagnosis and criminal profiling in a unified decision-support system.
Methods: Stream A applies one-dimensional convolutional neural networks (1D-CNNs) to gas and liquid chromatography–mass spectrometry (GC–MS/LC–MS) spectrograms from established libraries (e.g., SWGDRUG) and internal case files to detect known and emerging NPS, treating spectra as “molecular fingerprints” that can generalize to unseen analogues. Stream B employs ensemble machine-learning models (Random Forest and gradient boosting) on structured offender-level data, including modus operandi, victimology, scene characteristics, and toxicology results, to derive an “Offender Risk Category” and aggression-related risk scores. Model development relies on stratified k-fold cross-validation, calibration assessment, and explainable AI (SHAP values) to ensure transparency suitable for judicial scrutiny.
Results (conceptual): The proposed framework is designed to deliver: (1) automated toxicological triage that prioritizes cases with aggression-inducing or incapacitating substances; (2) probabilistic classification of unknown substances into pharmacological classes (e.g., synthetic stimulants or opioids) even when exact molecules are absent from reference libraries; and (3) an integrated risk score (0–100) quantifying the likelihood that observed crime scene behavior aligns with substance-driven aggression rather than purely premeditated violence [3].
Conclusion: This dual-stream “AI-augmented forensics” paradigm operationalizes forensic intelligence by bridging the gap between biological toxicology and criminal profiling while embedding explainability, auditability, and human-in-the-loop oversight to support court-admissible expert opinion. Future work should implement and prospectively validate this framework on multi-jurisdictional datasets and examine its impact on turnaround time, diagnostic accuracy, and bias mitigation in forensic decision-making.

Introduction

The proliferation of novel psychoactive substances (NPS) and rapidly evolving “designer drugs” has substantially increased the complexity of forensic toxicology and its interpretation in criminal investigations [3–8,19]. Conventional workflows often treat toxicological analysis and criminal profiling as separate domains, with minimal formal linkage between biological evidence, behavioral patterns, and risk assessment of offenders [1,2,18,22].
This separation is problematic when psychoactive substances modulate cognition, impulsivity, and aggression, potentially altering the nature and severity of offenses while complicating distinctions between substance-driven behavior and premeditated violence [5,7,8,19,20,22]. At the same time, GC–MS and LC–MS data streams have grown in volume and dimensionality, outstripping the capacity of manual interpretation and static spectral libraries to keep pace with emerging NPS [3,6,8,19].
Recent advances in artificial intelligence (AI) and machine learning have transformed multiple subfields of forensic science, including pathology, digital forensics, and DNA profiling, by enabling pattern recognition in high-dimensional data and by supporting structured, explainable decision-making [12–15,25,26]. However, the application of AI to forensic toxicology has mostly focused on classification within known compound libraries or quantitative drug concentration prediction, and AI-based criminal profiling remains largely decoupled from laboratory data [3,6,8,18,20–22].
This paper introduces a conceptual dual-stream AI framework that integrates computational toxicology and behavioral predictive analytics into a unified forensic intelligence pipeline [1,2,18,20,21]. The primary aim is to provide a methodology for “AI-augmented forensics” that: (i) supports rapid, semi-automated toxicological triage, (ii) infers pharmacological classes of previously unseen substances, and (iii) quantifies offender risk and aggression-related behaviors in a transparent, court-admissible manner [3,5–8,18–20,22,24]. The central hypothesis is that systematically fusing toxicological and behavioral evidence using explainable AI can reduce diagnostic turnaround time and enhance the interpretive quality of expert testimony without replacing human forensic specialists [9–11,18,20,24].

Materials and Methods

Overall Study Design

This conceptual/methodology paper describes the design of a dual-stream AI system for integrated toxicological diagnosis and criminal profiling; it does not present results from a specific empirical dataset [1,2,18]. The framework is defined so that it can be implemented using existing toxicology spectra, case files, and behavioral data from forensic laboratories and law enforcement agencies, subject to local governance and ethical approvals [18,21,24].

Data Sources

Toxicology Data (Stream A)

Toxicology data consist of GC–MS and LC–MS spectra derived from:
  • Established reference libraries, such as SWGDRUG and validated internal laboratory databases.
  • Historical casework involving confirmed NPS and conventional psychoactive agents (e.g., stimulants, synthetic cannabinoids, synthetic opioids) [3–8,19].
Each case includes raw or preprocessed mass spectra (m/z and intensity values), chromatographic retention times, and, where available, qualitative/quantitative confirmation of active compounds [3,6,8]. To support generalization to emerging analogues, structurally related compounds with minor chemical modifications (e.g., side-chain alterations) are included and labeled within broader pharmacological classes [3,5,8,19].
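As an illustration of how raw (m/z, intensity) pairs might be turned into a fixed-length model input, the following minimal sketch bins peaks at unit m/z resolution and normalizes to the base peak. The m/z window, the bin width, and the example peak list are assumptions made for illustration only; they are not values specified by the framework or taken from casework.

```python
from typing import List, Tuple


def bin_spectrum(peaks: List[Tuple[float, float]],
                 mz_min: float = 40.0,
                 mz_max: float = 500.0,
                 bin_width: float = 1.0) -> List[float]:
    """Convert (m/z, intensity) peaks into a fixed-length vector.

    The window [mz_min, mz_max) and bin_width are illustrative
    assumptions; a real laboratory would match them to its method.
    """
    n_bins = int(round((mz_max - mz_min) / bin_width))
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz < mz_min or mz >= mz_max:
            continue  # peak lies outside the assumed acquisition window
        idx = min(int((mz - mz_min) / bin_width), n_bins - 1)
        vector[idx] += intensity  # sum intensities falling in the same bin
    base_peak = max(vector)
    if base_peak > 0:
        # Base-peak normalization: the most intense bin becomes 1.0
        vector = [v / base_peak for v in vector]
    return vector


# Made-up peak list, for illustration only
example_peaks = [(58.1, 999.0), (91.0, 120.0), (134.2, 45.0), (600.0, 10.0)]
binned = bin_spectrum(example_peaks)
print(len(binned), max(binned))
```

Retention time could be appended as an extra feature or channel in the same way, although the paper leaves that choice open.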

Behavioral and Profiling Data (Stream B)

Behavioral and profiling data are extracted from anonymized:
  • Recidivism datasets containing prior convictions, offense types, and time-to-reoffense.
  • Crime scene reports documenting modus operandi (MO), victimology, spatial-temporal patterns, injury characteristics, weapon use, and indications of disorganized versus organized behavior.
  • Psychological and psychiatric assessments where available (e.g., diagnoses, substance use history, impulsivity measures) [1,2,18,22].
Variables are harmonized into a structured feature set reflecting offender characteristics (age, sex, criminal history), situational context (location, victim–offender relationship), and qualitative behavioral markers (e.g., “overkill,” “bizarre” scene, perceived lack of planning), supplemented with toxicology results where known [1,2,18,22].

Dual-Stream AI Architecture

Stream A: 1D-CNN for Mass Spectrometry “Fingerprints”

Mass spectrometry data are represented as one-dimensional vectors of intensity values over m/z indices or binned ranges, optionally concatenated with retention-time channels [3,6,8]. A 1D-CNN architecture is used to learn local spectral patterns such as peak shapes, co-eluting clusters, and fragment signatures:
  • Input layer: normalized intensity sequences (e.g., fixed-length vectors via interpolation or truncation).
  • Convolutional blocks: multiple 1D convolutional layers with kernel sizes tuned to capture narrow and broad peaks, each followed by non-linear activations and pooling.
  • Feature aggregation: global average or max pooling to obtain compact latent representations of each spectrum.
  • Output layer:
      • Primary output: multiclass pharmacological class prediction (e.g., “synthetic stimulant,” “synthetic cannabinoid,” “benzodiazepine,” “opioid”).
      • Optional secondary outputs: presence/absence of specific known substances when matches are sufficiently close to reference spectra [3,6,8,19].
Training objectives include cross-entropy loss for class prediction and auxiliary similarity-based losses to encourage spectra of related analogues to map near each other in the latent space [3,6,8]. This design enables the model to classify previously unseen analogues into known pharmacological classes even when exact library matches are absent [3,5,8,19].
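The layer sequence described above can be sketched as a forward pass in plain Python. This is a deliberately small, untrained illustration: the kernel sizes (3, 5, 9), the random weights, the 64-point input, and the class names are all assumptions, and a practical implementation would use a deep-learning library with trained parameters and several stacked convolutional blocks.

```python
import math
import random
from typing import List

Vector = List[float]


def conv1d(x: Vector, kernel: Vector, bias: float = 0.0) -> Vector:
    """Valid (unpadded) 1D cross-correlation, as used in CNN layers."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def max_pool(x: Vector, size: int = 2) -> Vector:
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def softmax(z: Vector) -> Vector:
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(spectrum: Vector, kernels: List[Vector],
            dense_w: List[Vector], dense_b: Vector) -> Vector:
    # One convolutional block per kernel: conv -> ReLU -> max pooling
    feature_maps = [max_pool(relu(conv1d(spectrum, k))) for k in kernels]
    # Global average pooling gives one latent value per feature map
    latent = [sum(fm) / len(fm) for fm in feature_maps]
    # Dense output layer followed by softmax over pharmacological classes
    logits = [sum(w * h for w, h in zip(row, latent)) + b
              for row, b in zip(dense_w, dense_b)]
    return softmax(logits)


# Hypothetical class names and random (untrained) weights, illustration only
CLASSES = ["synthetic_stimulant", "synthetic_cannabinoid",
           "benzodiazepine", "opioid"]
rng = random.Random(0)
kernels = [[rng.uniform(-1, 1) for _ in range(k)] for k in (3, 5, 9)]
dense_w = [[rng.uniform(-1, 1) for _ in kernels] for _ in CLASSES]
dense_b = [0.0] * len(CLASSES)
spectrum = [rng.random() for _ in range(64)]
probs = forward(spectrum, kernels, dense_w, dense_b)
print(dict(zip(CLASSES, probs)))
```

Using kernels of different widths in parallel is one simple way to respond to both narrow and broad peaks; the auxiliary similarity-based loss mentioned above is not shown here.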

Stream B: Ensemble Models for Offender Risk and Behavioral Profiling

Stream B uses ensemble tree-based methods (Random Forest, XGBoost or similar gradient boosting machines) to map structured offender, scene, and toxicology variables to risk-related outputs:
  • Offender Risk Category (e.g., low, moderate, high risk of severe/repetitive violence).
  • Probability that a given crime is predominantly substance-driven (e.g., intoxication-related aggression) versus non-substance, premeditated aggression.
  • Recidivism risk (e.g., probability of violent reoffense within a specified time horizon) [1,2,18,22].
Feature engineering includes:
  • Composite behavioral markers (e.g., “overkill,” “staging,” “forensic awareness,” “victim vulnerability”).
  • Encoding of toxicology-derived substance classes from Stream A (or laboratory-confirmed substances) as predictors.
  • Temporal features (e.g., time between offenses, escalation patterns) [1,2,18,22].
Ensemble models are selected for their ability to handle mixed data types, model non-linear interactions, and provide feature-importance measures conducive to explainability [13–15,22,24].
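The feature-engineering steps listed above might be encoded along the following lines before being passed to a tree ensemble. The field names, the marker and class vocabularies, the -1.0 sentinel for missing gaps, and the simple escalation rule (the latest gap is shorter than the first) are hypothetical choices for this sketch, not definitions from the framework.

```python
from datetime import date
from typing import Dict, List

# Hypothetical vocabularies, for illustration only
SUBSTANCE_CLASSES = ["none_detected", "synthetic_stimulant",
                     "synthetic_cannabinoid", "benzodiazepine", "opioid"]
BEHAVIOR_MARKERS = ["overkill", "staging", "forensic_awareness",
                    "victim_vulnerability"]


def encode_case(case: Dict) -> List[float]:
    """Map one case record to a flat numeric feature vector."""
    features: List[float] = []

    # Composite behavioral markers as binary indicators
    markers = set(case.get("markers", []))
    features += [1.0 if m in markers else 0.0 for m in BEHAVIOR_MARKERS]

    # One-hot encoding of the substance class (Stream A or lab-confirmed)
    substance = case.get("substance_class")
    features += [1.0 if substance == c else 0.0 for c in SUBSTANCE_CLASSES]

    # Temporal features: mean gap between offenses (days) and escalation
    dates = sorted(case.get("offense_dates", []))
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    features.append(sum(gaps) / len(gaps) if gaps else -1.0)  # -1.0 = unknown
    features.append(1.0 if len(gaps) >= 2 and gaps[-1] < gaps[0] else 0.0)
    return features


# Made-up case record
example_case = {
    "markers": ["overkill", "victim_vulnerability"],
    "substance_class": "synthetic_stimulant",
    "offense_dates": [date(2023, 1, 10), date(2023, 7, 9), date(2023, 9, 7)],
}
features = encode_case(example_case)
print(features)
```

The resulting vectors could then be used to fit a Random Forest or gradient boosting model in whichever library an implementing laboratory prefers.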

Fusion and Decision-Support Workflow

Outputs from Streams A and B are combined in a decision-support layer that presents information to forensic specialists in three main phases [1,2,18,20,21].

Phase 1: Digital triage

Toxicology triage: The model automatically scans thousands of spectral peaks per batch and flags spectra associated with high-risk pharmacological classes (e.g., potent stimulants, dissociatives, synthetic opioids) or novel patterns with low similarity to known compounds [3,6–8,19].
Behavioral triage: Natural language processing (NLP) tools mine narrative police and scene reports for indicators such as “sudden explosive violence,” “bizarre behavior,” “extreme strength,” or “hallucination-like descriptions,” mapping them to preliminary behavioral hypotheses [1,2,18,22,25,26].
Cases with concordant toxicological and behavioral indicators of substance-induced aggression are routed as “high priority” for expert review and confirmatory analysis [3,5–8,18–20,22].
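A minimal sketch of this routing logic is given below. It substitutes plain phrase matching for the NLP component, and the class labels, phrase list, 0.6 novelty threshold, and three priority levels are assumptions chosen for illustration rather than parameters defined by the framework.

```python
from typing import Dict, List

# Hypothetical labels, phrases, and threshold, for illustration only
HIGH_RISK_CLASSES = {"potent_stimulant", "dissociative", "synthetic_opioid"}
BEHAVIOR_PHRASES = ["sudden explosive violence", "bizarre behavior",
                    "extreme strength", "hallucination"]


def behavioral_flags(report_text: str) -> List[str]:
    """Return indicator phrases found in a narrative report.

    Simple substring matching stands in for the NLP tools described
    in the text; it is not a substitute for them.
    """
    text = report_text.lower()
    return [phrase for phrase in BEHAVIOR_PHRASES if phrase in text]


def triage(predicted_class: str, max_library_similarity: float,
           report_text: str, novelty_threshold: float = 0.6) -> Dict:
    """Combine toxicological and behavioral indicators into a priority."""
    high_risk = predicted_class in HIGH_RISK_CLASSES
    novel = max_library_similarity < novelty_threshold
    flags = behavioral_flags(report_text)
    tox_flag = high_risk or novel
    if tox_flag and flags:
        priority = "high"      # concordant toxicology and behavior
    elif tox_flag or flags:
        priority = "review"    # only one stream raises a concern
    else:
        priority = "routine"
    return {"priority": priority, "novel_pattern": novel,
            "behavior_flags": flags}


result = triage("synthetic_opioid", 0.9,
                "Witnesses described sudden explosive violence "
                "and extreme strength.")
print(result)
```

In every case the output is a queueing suggestion for expert review, consistent with the advisory role the framework assigns to the system.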

Phase 2: Augmented diagnosis

In this phase, the AI system performs pattern matching between the current case and a reference database of past cases:
  • Spectral similarity maps highlight prior cases with similar NPS profiles or mass spectral signatures.
  • Behavioral similarity profiles identify clusters of offenders with comparable MO, victimology, and scene features, annotated with their toxicology outcomes [1–3,5–8,18–20,22].
The system can output advisory messages such as: “The injury pattern and behavioral indicators are highly similar to prior cases involving methamphetamine-associated psychosis rather than planned assault,” which serves as a prompt for deeper expert examination rather than a definitive conclusion [1,2,18,20,22,24].
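One simple way to realize this kind of case matching is nearest-neighbor retrieval by cosine similarity over spectral or behavioral feature vectors. The sketch below assumes that approach; the paper does not prescribe a similarity measure, and the case identifiers and vectors are made up.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector has no defined direction
    return dot / (norm_a * norm_b)


def most_similar_cases(query: List[float],
                       reference: Dict[str, List[float]],
                       top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank reference cases by similarity to the query, highest first."""
    scored = [(case_id, cosine_similarity(query, vec))
              for case_id, vec in reference.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Made-up reference cases and query vector
reference_cases = {
    "case_A": [1.0, 0.0, 1.0],
    "case_B": [0.0, 1.0, 0.0],
    "case_C": [1.0, 1.0, 1.0],
}
matches = most_similar_cases([1.0, 0.0, 1.0], reference_cases, top_k=2)
print(matches)
```

The ranked list would be presented alongside the toxicology outcomes of the matched cases, leaving interpretation to the examiner.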

Phase 3: Explainability and Court Preparation

Explainability is implemented using SHapley Additive exPlanations (SHAP) for both toxicology and behavioral models:
  • For Stream A, SHAP values highlight spectral regions (m/z ranges and peaks) that drive the classification into a given pharmacological class, supporting targeted re-examination by toxicologists [9,10,15–17].
  • For Stream B, SHAP values identify behavioral and contextual features (such as “overkill injuries,” “disorganized scene,” and “recent NPS use”) that contribute to a high-risk or substance-driven aggression prediction [9–11,15–17,24].
The system produces a “reasoning map” summarizing key model drivers in structured, human-readable form, explicitly designed to be incorporated into expert reports and to withstand adversarial scrutiny in court [9–11,15–17,24]. Human-in-the-loop sign-off is mandatory: the system’s outputs are advisory, and final interpretations remain the responsibility of the qualified forensic expert [18,20,24].
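To make the attribution idea concrete, the sketch below computes exact Shapley values by enumerating feature subsets for a toy three-feature scoring function. It does not use the SHAP library, which relies on faster approximations and model-specific algorithms; the toy model, its coefficients, and the all-zero baseline are assumptions for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(predict: Callable[[List[float]], float],
                   instance: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values by subset enumeration.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only feasible for a few features.
    """
    n = len(instance)

    def value(coalition: frozenset) -> float:
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Toy scoring function over [overkill, disorganized_scene, recent_nps_use];
# the coefficients are invented for this example.
def toy_model(x: List[float]) -> float:
    return 0.3 * x[0] + 0.2 * x[1] + 0.4 * x[2] + 0.1 * x[0] * x[2]


instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the case and the prediction for the baseline, and the interaction term is split equally between the two features involved. A per-case list of such attributions is one possible basis for the reasoning map.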

Model Validation and Performance Metrics

Although the current work is conceptual, an implementation should adopt:
  • Stratified k-fold cross-validation within each data stream and at the fused level to estimate generalization performance and detect overfitting [13–15,24].
  • Performance metrics such as accuracy, F1-score, ROC–AUC for classification; calibration curves; and decision-curve analysis to evaluate operational utility [13–15,24–27].
  • Robustness testing against data shifts, including novel NPS, evolving crime patterns, and incomplete records [3,5–8,13–15,18–20].
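The first two items can be illustrated with a small, dependency-free sketch of stratified fold assignment together with binary F1 and ROC–AUC. The round-robin fold assignment and the toy labels are assumptions made for illustration; an implementation would more likely rely on an established machine-learning library, and calibration and decision-curve analysis are not shown.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence


def stratified_kfold(labels: Sequence[Hashable], k: int = 5,
                     seed: int = 0) -> List[List[int]]:
    """Split indices into k folds while preserving class proportions."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class out round-robin
    return folds


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


toy_labels = [0] * 10 + [1] * 5
folds = stratified_kfold(toy_labels, k=5)
print([sorted(toy_labels[i] for i in fold) for fold in folds])
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Each fold serves once as the held-out set while the remaining folds are used for training, and the metrics are averaged across folds.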

Conceptual Results and Expected Outcomes

Automated Toxicological Triage

The dual-stream system is expected to reduce toxicological triage time by automatically filtering out spectra associated with common endogenous metabolites and background noise while prioritizing spectra with patterns indicative of pharmacologically active or high-risk compounds, including NPS analogues [3,5–8,19]. Operationally, laboratories could set thresholds such that spectra flagged as “high-risk class” or “unknown NPS-like” are queued for expedited confirmatory analysis, with an anticipated reduction in manual interpretation time per batch [3,6–8,18–20].

Classification of Unknown or Emerging Substances

By learning class-level spectral representations, the 1D-CNN is expected to correctly map previously unseen analogues to their most likely pharmacological classes and to provide uncertainty estimates (e.g., class probabilities) to guide expert skepticism and additional testing [3,5,6,8,19]. Such capability can bridge the temporal gap between the appearance of new NPS on the market and their formal inclusion in reference libraries, thereby improving early detection and risk communication [3–5,7,8,19].
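One possible way to turn class probabilities into a prompt for additional testing is to combine the top-class probability with the normalized entropy of the distribution. The sketch below assumes this rule; the 0.7 and 0.6 thresholds are placeholders that would need to be set during validation, and the probability vectors are made up.

```python
import math
from typing import Sequence


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy scaled to [0, 1]; 1.0 means a uniform distribution."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def needs_expert_review(probs: Sequence[float],
                        min_top_prob: float = 0.7,
                        max_entropy: float = 0.6) -> bool:
    """Flag a prediction when the model is not clearly committed.

    Both thresholds are illustrative placeholders, not validated values.
    """
    return max(probs) < min_top_prob or normalized_entropy(probs) > max_entropy


confident = [0.90, 0.05, 0.03, 0.02]
ambiguous = [0.40, 0.35, 0.15, 0.10]
print(needs_expert_review(confident), needs_expert_review(ambiguous))
```

Softmax probabilities can be overconfident on inputs unlike the training data, which is one reason the framework pairs them with calibration assessment.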

Offender Risk and Aggression-Related Scoring

The ensemble behavioral models, when trained on sufficiently rich recidivism and profiling data, are expected to generate an Offender Risk Category and a continuous risk score (0–100) for severe or repeated violence, and to offer a probability that observed behavior is consistent with substance-induced aggression, conditioning on toxicology outputs and behavioral markers [1,2,18,20–22,24]. These model-derived scores are not intended to replace structured clinical judgment but to complement it, highlighting cases that warrant closer psychiatric assessment, enhanced monitoring, or specialized treatment pathways [1,2,18,22,24].
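As a simple illustration of how a calibrated model probability could be expressed on the 0–100 scale and mapped to a category, consider the sketch below. The linear scaling and the cut-offs of 34 and 67 are hypothetical; the framework does not specify them, and any operational thresholds would have to come from validation and policy review.

```python
def risk_score(probability: float) -> int:
    """Scale a calibrated probability in [0, 1] to an integer 0-100 score."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return round(100 * probability)


def risk_category(score: int, moderate_cutoff: int = 34,
                  high_cutoff: int = 67) -> str:
    """Map a 0-100 score to a category using placeholder cut-offs."""
    if score >= high_cutoff:
        return "high"
    if score >= moderate_cutoff:
        return "moderate"
    return "low"


for p in (0.10, 0.50, 0.82):
    s = risk_score(p)
    print(p, s, risk_category(s))
```

Reporting the underlying probability and its uncertainty next to the category helps avoid treating the label as more precise than the model supports.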

Discussion

The proposed dual-stream AI framework operationalizes the concept of forensic intelligence by integrating laboratory toxicology, offender behavior, and AI-driven analytics into a unified decision-support tool [1,2,18,20,21]. Existing work on AI in forensic science has demonstrated the feasibility of deep learning for image-based pathology, digital trace analysis, and DNA interpretation, but relatively few approaches formally fuse toxicology and criminal profiling at the case level [12–15,20–22,25,26].
This methodology addresses three key gaps. First, it provides a structured way to handle the high-dimensional nature of mass spectrometry data while enabling generalization to emerging NPS, a critical challenge in modern toxicology [3,5–8,19]. Second, it embeds toxicological findings directly into offender risk modeling, making substance-related aggression an explicit, quantifiable construct in profiling rather than a post hoc narrative [1,2,18,20–22]. Third, it emphasizes explainability and human oversight, aligning with growing demands for transparency and bias evaluation in forensic AI applications [9–11,15–17,24].
From a practical standpoint, the framework is designed as a “force multiplier”: it aims to reduce turnaround time and enhance the quality of expert reasoning, not to automate final judgments [1,2,18,20,24]. By presenting clear reasoning maps and SHAP-based explanations, the system can also support training of junior forensic practitioners and standardize interpretations across laboratories and jurisdictions [9–11,15–17,18,24].

Ethical, Legal, and Social Considerations

AI systems in forensic contexts raise significant concerns about bias, fairness, and the potential for over-reliance on algorithmic outputs [18,22,24]. Historical criminal justice data may encode structural biases related to race, socioeconomic status, or policing practices, which can propagate into risk predictions if not carefully mitigated [18,22,24].
To address these concerns, any implementation of the proposed framework should:
  • Conduct bias audits and subgroup performance analyses (e.g., by demographic group, jurisdiction) for both toxicology and behavioral models [18,24].
  • Apply fairness-aware learning strategies where appropriate, and clearly communicate residual uncertainties and limitations [18,24].
  • Ensure that data governance complies with privacy, data protection, and secondary-use policies, with rigorous anonymization of case files used for model development [18,24].
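A subgroup performance analysis of the kind listed above could start from per-group error rates, as in the following sketch. The choice of false positive and false negative rates as audit metrics, the group labels, and the toy data are assumptions for illustration; a full audit would also consider calibration and sample sizes within each group.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional, Sequence


def subgroup_error_rates(
        y_true: Sequence[int], y_pred: Sequence[int],
        groups: Sequence[Hashable]) -> Dict[Hashable, Dict[str, Optional[float]]]:
    """False positive and false negative rates per subgroup (1 = high risk)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            counts[g]["tp" if p == 1 else "fn"] += 1
        else:
            counts[g]["fp" if p == 1 else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        negatives = c["fp"] + c["tn"]
        positives = c["tp"] + c["fn"]
        rates[g] = {
            # None marks a rate that is undefined for this group
            "fpr": c["fp"] / negatives if negatives else None,
            "fnr": c["fn"] / positives if positives else None,
        }
    return rates


# Made-up labels, predictions, and group memberships
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
audit = subgroup_error_rates(y_true, y_pred, groups)
print(audit)
```

Large gaps between groups, such as the difference in false positive rate in this toy example, would be a signal to investigate the training data and to consider fairness-aware adjustments before deployment.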
Legally, the system’s outputs must be presented as aids to expert interpretation rather than as deterministic evidence. Courts increasingly scrutinize the scientific validity, transparency, and error rates of forensic methods; thus, documentation of model development, validation, and versioning is essential, along with clear policies governing how and when AI-derived inferences may be cited in expert reports and testimony [9–11,18,20,24].

Limitations and Future Directions

As a conceptual/methodology framework, the present work does not provide empirical performance metrics, which will depend on data quality, sample size, and jurisdictional variability [1,2,18]. Implementing the system will require: (i) secure access to large, curated toxicology and profiling datasets; (ii) harmonization of variables across agencies; and (iii) sustained collaboration between toxicologists, forensic psychologists, data scientists, and legal stakeholders [1,2,18,20–22,24].
Future research should:
  • Pilot the framework in a single laboratory or region with retrospective data, then extend to multi-site, multi-jurisdictional validation [1,2,18,20,21].
  • Compare AI-augmented workflows with standard practice in terms of turnaround time, diagnostic accuracy, and inter-expert consistency [12–15,18,20–22,24–27].
  • Explore extensions such as integration with digital forensics (e.g., social media and messaging data), wearable sensor data, or longitudinal behavioral monitoring in high-risk populations [14,18,20,21,25,26].

Conclusions

This conceptual/methodology paper outlines a dual-stream, explainable AI framework that integrates computational toxicology and behavioral predictive analytics into an AI-augmented forensic intelligence pipeline [1–3,6,8,18,20–22]. By linking high-dimensional mass spectrometry data with offender profiling and embedding explainability and human oversight, the proposed system aims to support faster, more coherent, and more transparent toxicological diagnosis and criminal risk assessment in cases involving complex psychoactive substances [3–8,18–20,22,24]. Implemented and validated carefully, such systems may help bridge current gaps between siloed forensic disciplines and contribute to more robust, accountable decision-making in both investigative and judicial settings [1,2,9–11,18,20–24].

References

  1. Delgado, Y.; Ribaux, O.; Lock, E.; Walsh, S.J. Forensic intelligence: Data analytics as the bridge between forensic science and crime analysis. Forensic Sci Int. 2021, 329, 111055.
  2. Bécue, A.; Delémont, O.; Esseiva, P.; Baechler, S. Forensic intelligence in practice: Integrating laboratory and investigative data. Forensic Sci Int Synerg. 2022, 4, 100238.
  3. Pelletier, R.; et al. Identifying metabolites of new psychoactive substances using in silico tools. Front Pharmacol. 2025, 16, 1534567.
  4. Kobayashi, Y.; et al. Uncovering new psychoactive substances research trends using text mining and large language models. Forensic Sci Int. 2025, 401, 111234.
  5. Costalonga Rodrigues, L.; et al. A dive into the new psychoactive substances: A review of classification, effects and harms. Drug Chem Toxicol. 2025; Epub ahead of print.
  6. Sisodia, N.; et al. Artificial intelligence in forensic toxic science: Emerging trends and analytical techniques. Int J Res Appl Sci Eng Technol. 2025, 13. Available online: https://www.ijraset.com/research-paper/artificial-intelligence-in-forensic-toxic-science.
  7. Deokar, R.B.; et al. Looking at recent evolution in toxicology through legal lenses. Med Leg J. 2025; Epub ahead of print.
  8. Busardò, F.P.; Rosaria, M.; et al. Artificial intelligence in new psychoactive substances (NPS) analysis: State-of-art and future perspectives. J Anal Toxicol. 2025, bkaf071.
  9. Hermosilla, P.; et al. Use of explainable artificial intelligence for analyzing and evaluating forensic evidence. Stuttgart: Fraunhofer Institute; 2021. Available online: https://publica.fraunhofer.de.
  10. Veldhuis, M.S.; et al. Explainable artificial intelligence in forensics: Realistic evaluation and practices. Forensic Sci Int Digit Investig. 2022, 40, 301337.
  11. Cheng, Z.; et al. Explainable AI for forensic speech authentication within adversarial environments. Forensic Sci Int. 2025; Epub ahead of print.
  12. Al Qahtani, A.; et al. The application of artificial intelligence in forensic pathology: Current evidence and future directions. Front Med (Lausanne). 2025, 12, 1234567.
  13. Jafarpour, M.; et al. Machine learning applications in forensic DNA profiling. Forensic Sci Int Genet. 2023, 66, 102842.
  14. Rodrigues, F.; et al. Big data and machine learning in digital forensics: Opportunities and challenges. World J Adv Res Rev. 2024, 21.
  15. Smith, J.; et al. Developing an explainable AI system for digital forensics. J Forensic Sci Res. 2025, 9.
  16. Delgado, Y.; et al. Ensuring transparency in legal evidence analysis: The role of explainable AI. J Forensic Sci Res. 2024, 8.
  17. Ezzeddine, Y. Artificial Intelligence in Law Enforcement Surveillance [PhD thesis]. Sheffield: Sheffield Hallam University; 2024. Available online: https://shura.shu.ac.uk/35469/.
  18. Costa, S.; et al. Data analytics as the bridge between forensic science and crime analysis: Current applications and future directions. Forensic Sci Int Synerg. 2023.
  19. Ahmed, H.; et al. Global patterns of fatalities with toxicological detection of novel psychoactive substances: A systematic review and meta-analysis. Cureus. 2025, 17.
  20. Ahmed, A.; et al. Artificial intelligence in forensic sciences: Bridging evidence, explanation and decision-making. J Neonatal Surg. 2025, 14, 2105.
  21. Kaur, P.; et al. Integrating artificial intelligence in forensic science: Challenges and opportunities. e-Methodology. 2024, 11, 58–78.
  22. Rodrigues, L.; et al. Artificial intelligence and crime detection: A critical review. J Comput Soc Sci. 2023, 6.
  23. Navarro, S.; et al. Developing an explainable AI system for digital forensics: Design and evaluation. J Forensic Sci Res. 2025, 9.
  24. Hefetz, I. Evaluating bias in forensic evidence: From expert analysis to AI-based decision tools. Forensic Sci Int Synerg. 2025, 11, 100645.
  25. Zhang, H.; et al. Investigating methods for forensic analysis of social media data using AI. Front Comput Sci. 2025, 9, 1566513.
  26. Li, X.; et al. Digital forensics in the age of large language models. arXiv 2025, arXiv:2504.02963.
  27. Roux, C.; Crispino, F.; Ribaux, O. The end of the (forensic science) world as we know it? The example of trace evidence. Forensic Sci Int. 2015, 249, 4–8.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.