Preprint, Communication, Version 1. Preserved in Portico. This version is not peer-reviewed.

Taxonomy-focused Natural Product Databases for Carbon-13 NMR-based Dereplication

Version 1: Received: 27 May 2021 / Approved: 28 May 2021 / Online: 28 May 2021 (12:59:37 CEST)

A peer-reviewed article of this Preprint also exists.

Nuzillard, J.-M. Taxonomy-Focused Natural Product Databases for Carbon-13 NMR-Based Dereplication. Analytica 2021, 2 (3), 50-56 (MDPI AG; published online 28 June 2021; CC BY 4.0).

DOI: 10.3390/analytica2030006

Abstract

The recent revival of the study of organic natural products as renewable sources of medicinal drugs, cosmetics, dyes, and materials motivated the creation of general-purpose structural databases. Dereplication, the efficient identification of already reported compounds, relies on the grouping of structural, taxonomic, and spectroscopic databases that focus on a particular taxon (species, genus, family, order, etc.). A set of freely available Python scripts, CNMR_Predict, is proposed for the quick supplementation of taxon-oriented search results from the LOTUS database (lotus.naturalproducts.net) with predicted carbon-13 NMR data from the ACD/Labs (acdlabs.com) CNMR predictor and DB software to provide easily searchable databases. The database construction process is illustrated using Brassica rapa as an example taxon.
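
To make the idea of searching a taxon-focused database by carbon-13 chemical shifts more concrete, the following is a minimal sketch and not the CNMR_Predict scripts themselves. It assumes the database has already been reduced to a mapping from compound name to a list of predicted shifts in ppm. The function names `match_score` and `rank_candidates`, the 2 ppm tolerance, and the two placeholder compounds with invented shift values are all hypothetical illustrations; the greedy pairing of shifts is a simplification, not the matching used by any particular software.

```python
from typing import Dict, List, Tuple


def match_score(observed: List[float], predicted: List[float], tol: float = 2.0) -> float:
    """Return the fraction of observed shifts paired with a predicted shift.

    Each predicted shift can be used at most once, and a pair counts only
    if the two values differ by no more than `tol` ppm (assumed tolerance).
    """
    remaining = sorted(predicted)
    hits = 0
    for shift in sorted(observed):
        # Greedy choice: take the closest predicted shift that is still unused.
        best = min(remaining, key=lambda p: abs(p - shift), default=None)
        if best is not None and abs(best - shift) <= tol:
            hits += 1
            remaining.remove(best)
    return hits / len(observed) if observed else 0.0


def rank_candidates(
    observed: List[float],
    database: Dict[str, List[float]],
    tol: float = 2.0,
) -> List[Tuple[str, float]]:
    """Rank database entries by decreasing match score."""
    scores = [(name, match_score(observed, shifts, tol)) for name, shifts in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Placeholder database: names and shift values are invented for illustration only.
database = {
    "compound_A": [170.1, 128.5, 55.3, 21.0],
    "compound_B": [205.4, 140.2, 33.8, 14.1],
}
observed = [169.5, 129.0, 55.0, 20.4]

for name, score in rank_candidates(observed, database):
    print(f"{name}: {score:.2f}")
```

Restricting the database to compounds reported for a single taxon keeps the candidate list short, which is why even a simple score of this kind can be informative in a dereplication setting.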

Keywords

Natural products; databases; dereplication; taxonomy; NMR

Subject

CHEMISTRY, Analytical Chemistry
