Drug Testing and Analysis 2020; 13(5); 990-1000; doi: 10.1002/dta.2976

Towards compound identification of synthetic opioids in nontargeted screening using machine learning techniques.

Abstract: The constant evolution of the illicit drug market makes the identification of unknown compounds problematic. Obtaining certified reference materials for a broad array of new analogues can be difficult and cost prohibitive. Machine learning provides a promising avenue to putatively identify a compound before confirmation against a standard. In this study, machine learning approaches were used to develop class prediction and retention time prediction models. The developed class prediction model used a naïve Bayes architecture to classify opioids as belonging to either the fentanyl analogues, AH series or U series, with an accuracy of 89.5%. The model was most accurate for the fentanyl analogues, most likely due to their greater number in the training data. This classification model can provide guidance to an analyst when determining a suspected structure. A retention time prediction model was also trained for a wide array of synthetic opioids. This model utilised Gaussian process regression to predict the retention time of analytes based on multiple generated molecular features with 79.7% of the samples predicted within ±0.1 min of their experimental retention time. Once the suspected structure of an unknown compound is determined, molecular features can be generated and input for the prediction model to compare with experimental retention time. The incorporation of machine learning prediction models into a compound identification workflow can assist putative identifications with greater confidence and ultimately save time and money in the purchase and/or production of superfluous certified reference materials.
Publication Date: 2020-12-09
PubMed ID: 33207086
DOI: 10.1002/dta.2976
  • Journal Article

Summary

This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.

The study applies machine learning to the putative identification of unknown synthetic opioids. Two models, one predicting compound class and one predicting chromatographic retention time, are intended to help analysts narrow down a suspected structure before confirming it against a certified reference material, reducing the number of reference materials that must be purchased or produced.

Use of Machine Learning for Compound Identification

  • Because the illicit drug market changes constantly, identifying new, unknown compounds is difficult. Confirmation normally requires producing or purchasing a certified reference material for each new analogue, which is time- and cost-intensive.
  • This study explored whether machine learning can support the putative identification of synthetic opioids before a reference standard is obtained, streamlining the process and reducing the associated costs.

Development of Prediction Models

  • The research team used machine learning approaches to develop two models – one for class prediction and another for retention time prediction.
  • The class prediction model, built on a naïve Bayes architecture, classified opioids into one of three categories: fentanyl analogues, the AH series, or the U series, with an accuracy of 89.5%. Accuracy was highest for the fentanyl analogues, most likely because they were more numerous in the training data.
  • The researchers also trained a retention time prediction model covering a wide array of synthetic opioids. It used Gaussian process regression to predict retention time from multiple generated molecular features. For 79.7% of the samples, the prediction fell within ±0.1 min of the experimental retention time.
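
The class prediction step can be illustrated with a minimal Gaussian naïve Bayes sketch. This is not the authors' implementation: the two numeric features, the training values, and the function names `fit_gaussian_nb` and `predict` are hypothetical placeholders chosen only to show how a naïve Bayes classifier assigns one of the three class labels. The Gaussian likelihood is likewise an assumption, since the abstract does not state which variant was used.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior, per-feature means and per-feature variances for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-9)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for one feature vector."""
    def log_posterior(prior, means, variances):
        # "Naive" assumption: features are independent given the class,
        # so the per-feature log likelihoods simply add up.
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
            for xi, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_likelihood

    return max(model, key=lambda label: log_posterior(*model[label]))


# Synthetic placeholder data (NOT from the study): two made-up descriptors
# per compound, with more fentanyl analogues than other classes.
X_train = [
    [1.0, 2.0], [1.2, 1.8], [0.9, 2.2], [1.1, 2.1],  # fentanyl analogue
    [3.0, 0.5], [3.2, 0.4],                          # AH series
    [5.0, 4.0], [5.1, 4.2],                          # U series
]
y_train = ["fentanyl analogue"] * 4 + ["AH series"] * 2 + ["U series"] * 2

model = fit_gaussian_nb(X_train, y_train)
print(predict(model, [3.1, 0.45]))
```

The class prior is the fraction of training compounds in each class, so a class that dominates the training set starts with a higher prior.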

Benefits of Machine Learning Models

  • These machine learning models can provide significant benefits when incorporated into a compound identification workflow.
  • The class prediction model can guide analysts towards a suspected structure. Once a structure is proposed, its molecular features are generated and input into the retention time model, and the predicted retention time is compared with the experimental value.
  • This process can support putative identifications with greater confidence and save time and money, because fewer certified reference materials need to be purchased or produced unnecessarily.
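
The retention time check described above can be sketched as follows. This is an illustrative sketch only and not the authors' code: the squared-exponential kernel, the `noise` and `length_scale` values, the single made-up descriptor, and all function names are assumptions. Only the ±0.1 min tolerance comes from the study.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors (assumed kernel)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2 * length_scale ** 2))


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gpr(X, y, noise=1e-4, length_scale=1.0):
    """Fit a Gaussian process with a constant mean equal to the training average."""
    mean_y = sum(y) / len(y)
    K = [
        [rbf(a, b, length_scale) + (noise if i == j else 0.0)
         for j, b in enumerate(X)]
        for i, a in enumerate(X)
    ]
    alpha = solve_cholesky(cholesky(K), [v - mean_y for v in y])
    return X, alpha, mean_y, length_scale


def predict_rt(model, x):
    """Posterior mean retention time for one feature vector."""
    X, alpha, mean_y, length_scale = model
    return mean_y + sum(a * rbf(x, xi, length_scale) for a, xi in zip(alpha, X))


def within_tolerance(predicted, observed, tol=0.1):
    """True if the prediction lies within +/- tol minutes of the observed value."""
    return abs(predicted - observed) <= tol


def fraction_within(predicted, observed, tol=0.1):
    """Share of samples predicted within the tolerance window."""
    hits = sum(within_tolerance(p, o, tol) for p, o in zip(predicted, observed))
    return hits / len(predicted)


# Synthetic placeholder data (NOT from the study): one made-up descriptor
# per compound and a made-up retention time in minutes.
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0]]
rt_train = [2.0, 2.5, 3.0, 3.5, 4.0]
gpr_model = fit_gpr(X_train, rt_train)

observed_rt = 3.22  # hypothetical experimental retention time of the unknown
for candidate in ([2.5], [0.5]):  # descriptors of two hypothetical candidate structures
    predicted = predict_rt(gpr_model, candidate)
    print(candidate, round(predicted, 2), within_tolerance(predicted, observed_rt))
```

A candidate structure whose predicted retention time falls outside the window can be deprioritised before any reference material is ordered. `fraction_within` computes the kind of summary statistic reported for the model (the share of samples within ±0.1 min).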

Cite This Article

APA
Klingberg J, Cawley A, Shimmon R, Fu S. (2020). Towards compound identification of synthetic opioids in nontargeted screening using machine learning techniques. Drug Test Anal, 13(5), 990-1000. https://doi.org/10.1002/dta.2976

Publication

ISSN: 1942-7611
NlmUniqueID: 101483449
Country: England
Language: English
Volume: 13
Issue: 5
Pages: 990-1000

Researcher Affiliations

Klingberg, Joshua
  • Centre for Forensic Science, University of Technology Sydney, Ultimo, New South Wales, Australia.
Cawley, Adam
  • Racing NSW, Australian Racing Forensic Laboratory, Sydney, New South Wales, Australia.
Shimmon, Ronald
  • Centre for Forensic Science, University of Technology Sydney, Ultimo, New South Wales, Australia.
Fu, Shanlin
  • Centre for Forensic Science, University of Technology Sydney, Ultimo, New South Wales, Australia.

MeSH Terms

  • Analgesics, Opioid / analysis
  • Analgesics, Opioid / chemical synthesis
  • Animals
  • Chromatography, High Pressure Liquid
  • Fentanyl / analogs & derivatives
  • Fentanyl / analysis
  • Fentanyl / chemical synthesis
  • Horses / blood
  • Machine Learning
  • Molecular Structure
  • Reproducibility of Results
  • Spectrometry, Mass, Electrospray Ionization
  • Structure-Activity Relationship
  • Substance Abuse Detection
  • Tandem Mass Spectrometry

Grant Funding

  • Australian Government Research Training Program Scholarship

Citations

This article has been cited 2 times.
  1. Klingberg J, Keen B, Cawley A, Pasin D, Fu S. Developments in high-resolution mass spectrometric analyses of new psychoactive substances. Arch Toxicol 2022 Apr;96(4):949-967.
    doi: 10.1007/s00204-022-03224-2; pubmed: 35141767
  2. Letourneau DR, Marzullo BP, Alexandridou A, Barrow MP, O'Connor PB, Volmer DA. Characterizing lignins from various sources and treatment processes after optimized sample preparation techniques and analysis via ESI-HRMS and custom mass defect software tools. Anal Bioanal Chem 2023 Nov;415(27):6663-6675.
    doi: 10.1007/s00216-023-04942-x; pubmed: 37714972