Spec2Xtract is an R package that extracts MS2 spectra from .mzML files, or directly from .raw files acquired on Thermo instruments, using mzR or rawrr respectively. The package is compatible with multi-energy acquisition and extracts one spectrum for every energy used in the file. Spectra can be annotated with Spec2Annot. A Galaxy implementation of these functionalities is in progress.
[x] Main steps
[x] targets pipeline as a factory
[x] Documentation
[x] Tests
[ ] Vignette
You can install the development version of Spec2Xtract from GitHub using your preferred method:
method | command |
---|---|
remotes | remotes::install_github("odisce/Spec2Xtract") |
devtools | devtools::install_github("odisce/Spec2Xtract") |
renv | renv::install("github::odisce/Spec2Xtract") |
pak | pak::pkg_install("odisce/Spec2Xtract") |
Spec2Xtract uses rawrr to read .raw files directly. Before rawrr can be used, its components need to be installed with the following commands:
rawrr::installRawFileReaderDLLs()
rawrr::installRawrrExe()
To extract the MSn library, Spec2Xtract needs:
- a folder path (e.g. /path/to/raw/directory/) containing the .raw or .mzML file(s). Any MSn acquisition is supported, in FIA mode or combined with LC or GC.
- a table with the neutral elemental composition (elemcomposition) of the molecules to extract (compound) and optionally the retention time (rtsec). Accepted formats are .txt, .csv, .tsv and .xlsx. Any other columns are optional and will be added to the final .msp library.
compound | elemcomposition | rtsec | inchikey |
---|---|---|---|
Alanine | C3H7NO2 | 3.15 | QNAYBMKLOCPYGJ-REOHCLBHSA-N |
CustomName | C4H21S1P3 | 1.52 | |
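A compound table with the columns described above could be assembled in R and saved in one of the accepted formats. The following is a minimal sketch using base R only; the file name `cpd_info.csv` and the use of a temporary directory are placeholders chosen for illustration, and the values are the two example rows from the table.

```r
## Build the example compound table (values taken from the example above)
cpd <- data.frame(
  compound = c("Alanine", "CustomName"),
  elemcomposition = c("C3H7NO2", "C4H21S1P3"),
  rtsec = c(3.15, 1.52),
  ## inchikey is an optional extra column; the second compound has none
  inchikey = c("QNAYBMKLOCPYGJ-REOHCLBHSA-N", NA),
  stringsAsFactors = FALSE
)

## Write it to a .csv file (placeholder path), leaving missing values empty
cpd_path <- file.path(tempdir(), "cpd_info.csv")
write.csv(cpd, cpd_path, row.names = FALSE, na = "")
```

The resulting path can then be supplied as the compound table, for example through the `cpd_path` argument of the wrapper described further down.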
Spec2Xtract will extract every MSn event that falls within a chromatographic peak belonging to each of the molecules in the input table. It automatically calculates the correct ion masses from the elemental composition with Spec2Annot and outputs the results in a new folder ./report with the following structure:
└─ R-Project
   ├─ report
   │  ├─ figures
   │  │  ├─ spectras        # Contains all extracted spectra in image format
   │  │  │  ├─ CPD1_File1_SPEC1.png
   │  │  │  └─ ...
   │  │  └─ xics            # Contains all XICs with the detected peak windows
   │  │     ├─ CPD1.png
   │  │     └─ ...
   │  ├─ spectra
   │  │  ├─ msp             # Contains the library in .msp format
   │  │  │  └─ library.msp
   │  │  └─ xlsx            # Contains all extracted spectra in .xlsx format
   │  │     ├─ CPD1_File1_SPEC1.xlsx
   │  │     └─ ...
   │  ├─ EventSummary.xlsx  # Summary table of each MSn event extracted
   │  └─ Summary.xlsx       # Summary table of each initial compound
   ├─ _targets              # targets cache
   └─ _targets.R            # targets pipeline
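To illustrate the idea of deriving an ion mass from a neutral elemental composition, here is a minimal base R sketch. It is not the Spec2Annot implementation: the helper names `formula_mass()` and `ppm_window()` are hypothetical, only a handful of elements are listed, and the [M+H]+ ion is assumed purely as an example (the ion types actually computed by the package are not shown here).

```r
## Monoisotopic masses (Da) of the most abundant isotopes, limited to a few elements
mono_mass <- c(
  C = 12.0, H = 1.00782503, N = 14.00307401,
  O = 15.99491462, P = 30.97376163, S = 31.97207100
)
proton_mass <- 1.00727647

## Hypothetical helper: monoisotopic mass of a simple formula such as "C3H7NO2"
formula_mass <- function(formula) {
  ## Split into element/count tokens, e.g. "C3" "H7" "N" "O2"
  tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9]*", formula))[[1]]
  elements <- sub("[0-9]+$", "", tokens)
  counts <- sub("^[A-Za-z]+", "", tokens)
  counts[counts == ""] <- "1" # an element without a number occurs once
  counts <- as.integer(counts)
  unknown <- setdiff(elements, names(mono_mass))
  if (length(unknown) > 0) {
    stop("No mass listed for: ", paste(unknown, collapse = ", "))
  }
  sum(mono_mass[elements] * counts)
}

## Hypothetical helper: m/z tolerance window for a given ppm
ppm_window <- function(mz, ppm) {
  mz * (1 + c(-1, 1) * ppm * 1e-6)
}

neutral <- formula_mass("C3H7NO2") # alanine, approximately 89.0477 Da
mz_mh <- neutral + proton_mass     # assumed [M+H]+ ion
window <- ppm_window(mz_mh, ppm = 5)
```

A window of this kind is the usual way a ppm tolerance is turned into an m/z range when matching precursor ions.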
Spec2Xtract provides a wrapper that initializes and runs the whole analysis with one function call:
- create a new project
- open an R session inside
- install Spec2Xtract and rawrr (see Installation)
- run the following command:
Spec2Xtract::run_Spec2Xtract(
  files_dir = "/path/to/raw/directory/",
  cpd_path = "/path/to/cpd_info.xlsx",
  firstevent = TRUE,
  prec_ppm = 5,
  minscan = 3,
  rt_limit = 0.2,
  ppm = 6,
  save_dir = "./report",
  filter_irel = 0.01,
  filter_isopurity = 10,
  ncore = 1
)
- To get help:
?Spec2Xtract::run_Spec2Xtract
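Before launching the wrapper, it can be useful to confirm that the input folder actually contains .raw or .mzML files and that the compound table exists. The sketch below uses base R only; `check_inputs()` is a hypothetical helper written for this illustration, not a function of Spec2Xtract, and the demo files are empty placeholders created in a temporary directory.

```r
## Hypothetical pre-flight check: list candidate input files and test the table path
check_inputs <- function(files_dir, cpd_path) {
  files <- list.files(
    files_dir,
    pattern = "\\.(raw|mzML)$",
    ignore.case = TRUE,
    full.names = TRUE
  )
  list(
    files = files,
    n_files = length(files),
    cpd_exists = file.exists(cpd_path)
  )
}

## Demo on a temporary folder with empty placeholder files
demo_dir <- file.path(tempdir(), "spec2xtract_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("File1.raw", "File2.mzML", "notes.txt")))
demo_cpd <- file.path(demo_dir, "cpd_info.csv")
file.create(demo_cpd)

res <- check_inputs(demo_dir, demo_cpd)
```

Here `res$n_files` counts only the two acquisition files, since `notes.txt` and the compound table do not match the pattern.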
For targets users, Spec2Xtract provides a target factory to run a full analysis pipeline from a list of compound(s) and a list of paths to .raw file(s), as shown below. targets is a great package for managing analytical workflows, and this implementation leverages all the functionalities it provides (caching, visualization, CPU parallelization, HPC deployment, etc.).
First, create a new folder to store your project and open an R session inside:
## Load Spec2Xtract library
require("Spec2Xtract")
## Here we use a temporary directory
temp_wd <- tempdir()
setwd(temp_wd)
Then we use the targets::tar_script() function to write the _targets.R script with the target_Spec2Xtract() factory:
## Write targets pipeline using the target_Spec2Xtract() factory
targets::tar_script(
  {
    require(Spec2Xtract)
    require(crew)
    ## The next option sets the number of parallel workers to use
    tar_option_set(
      controller = crew_controller_local(workers = 2)
    )
    list(
      temp_tar <- target_Spec2Xtract(
        ## files should be the path(s) to the '.raw' files
        files = get_sample_rawfile(),
        ## cpd should be the compound table to extract (see the example)
        cpd = Spec2Xtract:::example_cpdlist_realdt,
        firstevent = TRUE,
        prec_ppm = 10,
        minscan = 3,
        rt_limit = 2,
        ppm = 10,
        save_dir = "./report"
      )
    )
  },
  ask = FALSE
)
Finally, the pipeline can be run like this:
## Run the pipeline
targets::tar_make()
The results will be stored in the ./report folder.