Dataset: "Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data"
Description
Dataset used in the experiments of the publication: "Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data" by Bach et al.
File description:
cfmid4.tar: MS² spectra simulated using CFM-ID (v4.0.7) for all molecular candidate structures
db_layout.png: Visualization of the SQLite database (DB) layout
massbank.sqlite.gz: DB containing all data needed to (re-)run the experiments shown in the paper. Please read "DB_README.md" for further details. The database file can be unpacked using gzip.
metfrag.tar: MetFrag input files and MS² scores for all candidate sets computed using the MetFrag software.
sirius_scores.tar: MS² scores for all candidates and measured spectra using the SIRIUS software.
sirius_inputs.tar: Input (ms-files) for the SIRIUS software.
DB_README.md: Description of each table in the "massbank.sqlite" SQLite DB.
db_processing_scripts.tar: Scripts to reproduce "massbank.sqlite", plus a README.md providing further information on the process.
massbank__2020.11__v0.6.1.sqlite: Base SQLite DB from which "massbank.sqlite" was built. It was created from the MassBank release 2020.11 using the "massbank2db" (v0.6.1) Python package.
substructure_fingerprints.tar: Pre-computed substructure counting fingerprints for all candidates related to our experiments.
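The file list notes that the database can be unpacked using gzip. Where the gzip command-line tool is not at hand, the same step can be sketched with the Python standard library. This is a minimal sketch, assuming "massbank.sqlite.gz" has been downloaded to the current working directory; the helper name `gunzip` is hypothetical and not part of the dataset.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src: Path, dst: Path) -> Path:
    """Stream-decompress the gzip file `src` into `dst` and return `dst`."""
    # copyfileobj copies in chunks, so a large DB is never held in memory at once.
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


if __name__ == "__main__":
    # Assumption: the archive sits in the current working directory.
    archive = Path("massbank.sqlite.gz")
    if archive.exists():
        # with_suffix("") drops the trailing ".gz", giving "massbank.sqlite".
        gunzip(archive, archive.with_suffix(""))
```

The streaming copy is chosen because the unpacked database may be large; reading the whole file with a single `read()` call would work too, at the cost of memory.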
Instructions:
"massbank.sqlite" can be used directly with the Structured Support Vector Machine (SSVM) model described in the manuscript and implemented in the "ssvm" Python package.
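Before handing the database to other software, it can be useful to check which tables it contains and compare them against "DB_README.md". The following is a minimal sketch using only Python's built-in `sqlite3` module; it assumes the unpacked "massbank.sqlite" is in the current working directory, and the helper name `list_tables` is hypothetical. It does not use or depend on the "ssvm" package.

```python
import os
import sqlite3


def list_tables(db_path):
    """Return the sorted names of all tables in the SQLite DB at `db_path`."""
    # mode=ro opens the file read-only, so the DB cannot be modified
    # (or silently created, if the path is wrong).
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Assumption: the DB was already unpacked into the working directory.
    if os.path.exists("massbank.sqlite"):
        for table_name in list_tables("massbank.sqlite"):
            print(table_name)
```

Opening the file read-only is a deliberate choice here: a plain `sqlite3.connect` on a mistyped path would create an empty database instead of reporting an error.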
If desired, the database can be reproduced using the scripts provided in "db_processing_scripts.tar":
Create a directory for all data
Download and extract the ...
Processing scripts
MS² scorer outputs (e.g. metfrag.tar)
Pre-computed substructure fingerprints
Follow the instructions given in the "README.md" of the "db_processing_scripts.tar"
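The preparation steps above (create a data directory, then extract the downloaded archives into it) can be sketched as follows. This is a minimal sketch, not the authors' procedure: the directory name "massbank_db_data", the helper name `extract_archives`, and the choice of archives are assumptions, and the "README.md" inside "db_processing_scripts.tar" remains the authoritative guide to the expected layout.

```python
import tarfile
from pathlib import Path


def extract_archives(archives, data_dir):
    """Extract every existing tar archive into `data_dir`.

    Returns the file names of the archives that were actually extracted.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in map(Path, archives):
        if not archive.is_file():
            continue  # not downloaded (yet); skip instead of failing
        # Only extract archives obtained from a trusted source.
        with tarfile.open(archive) as tar:
            tar.extractall(data_dir)
        extracted.append(archive.name)
    return extracted


if __name__ == "__main__":
    # Assumption: the archives were downloaded into the working directory.
    done = extract_archives(
        [
            "db_processing_scripts.tar",
            "metfrag.tar",
            "substructure_fingerprints.tar",
        ],
        "massbank_db_data",  # hypothetical directory name
    )
    print("Extracted:", done)
```

Missing archives are skipped rather than treated as errors, so the sketch can be re-run as further files are downloaded.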
Publication year
2022
Dataset type
Project
Other information
Fields of science
Computer and information sciences
Language
Availability
Open