precursorMZ missing #18

Open
chufz opened this issue Sep 21, 2022 · 13 comments

Comments

@chufz
Collaborator

chufz commented Sep 21, 2022

Hi Johannes,
I finally had another look at the package with some first data. Unfortunately, the precursor m/z is NA for MS2 spectra...

```
fl <- system.file("ddaPASEF.d", package = "MsBackendTimsTof")
sps <- Spectra(fl, source = MsBackendTimsTof())
ms2_sps <- Spectra::filterMsLevel(sps, 2)
ms2_sps[1]
```
MSn data (Spectra) with 1 spectra in a MsBackendTimsTof backend:
  msLevel precursorMz polarity
1       2          NA        1
... 35 more variables/columns.
Use 'spectraVariables' to list all of them.
Processing:

@jorainer
Member

Hm, that's interesting. Are you sure that the precursor m/z values are provided/stored in the file? We don't do anything magic, we just read the files using the opentimsr R package.

@chufz
Collaborator Author

chufz commented Sep 23, 2022

I finally found the precursor masses; however, opentimsr is currently not accessing them.
They are stored in the *.tdf file, which contains entries under PasefFrameMsMsInfo > IsolationMz (or PrmFrameMsMsInfo > IsolationMz), together with the frame ID and the assigned scan number range.
image
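
For reference, the *.tdf file is an SQLite database (`analysis.tdf` inside the `.d` folder), so the table can also be inspected without opentimsr. The following is only a minimal sketch: it builds a temporary SQLite file with made-up values as a stand-in for a real `analysis.tdf`, and it assumes the DBI and RSQLite packages are installed. The column names follow the ones mentioned in this thread.

```r
library(DBI)

# Stand-in for <run>.d/analysis.tdf: a temporary SQLite file with mock values.
tdf <- tempfile(fileext = ".tdf")
con <- dbConnect(RSQLite::SQLite(), tdf)
dbWriteTable(con, "PasefFrameMsMsInfo", data.frame(
    Frame = c(2L, 2L),
    ScanNumBegin = c(100L, 300L),
    ScanNumEnd = c(125L, 325L),
    IsolationMz = c(500.25, 622.31),
    IsolationWidth = c(2, 2),
    CollisionEnergy = c(30, 35)))

# Read the isolation information per frame and scan number range.
msms <- dbGetQuery(con, paste(
    "SELECT Frame, ScanNumBegin, ScanNumEnd,",
    "IsolationMz, IsolationWidth, CollisionEnergy",
    "FROM PasefFrameMsMsInfo"))
dbDisconnect(con)
msms
```

For real data, the connection would point to `file.path(fl, "analysis.tdf")` instead of the temporary file.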

@jorainer
Member

Could you maybe open an issue in the opentimsr repo? Would be better to fix things upstream before we hack a solution into our package.

@chufz
Collaborator Author

chufz commented Sep 26, 2022

michalsta/opentims#16

@RogerGinBer

Hi there! @chufz @jorainer
Today I was looking again into this open issue and I think we should be able to extract the information provided in the PasefFrameMsMsInfo table that you can get from the OpenTIMS object and fill it into the corresponding spectraData rows.

Here's a small example using the ddaPASEF.d test data:

library(Spectra)
library(MsBackendTimsTof)

fl <- system.file("ddaPASEF.d", package = "MsBackendTimsTof")
ot <- opentimsr::OpenTIMS(fl)
tbl <- opentimsr::table2df(ot, "PASEFFrameMsMsInfo")[[1]]
tbl

be <- backendInitialize(MsBackendTimsTof(), fl)
spd <- spectraData(be)
target_cols <- c("precursorMz", "isolationWindowTargetMz", "isolationWindowLowerMz",
                 "isolationWindowUpperMz", "collisionEnergy")

for (i in seq_len(nrow(tbl))) {
    row <- tbl[i, ]
    # Spectra in the same frame whose scan index falls in the isolation range
    target_rows <- which(spd$frameId == row$Frame &
                         MsCoreUtils::between(spd$scanIndex,
                                              c(row$ScanNumBegin, row$ScanNumEnd)))
    if (!length(target_rows)) next
    spd[target_rows, target_cols] <-
        t(replicate(length(target_rows),
                    list(row$IsolationMz, # Probably not entirely correct, we should use the Precursors table
                         row$IsolationMz,
                         # IsolationWidth is the full window width, so use half on each side
                         row$IsolationMz - row$IsolationWidth / 2,
                         row$IsolationMz + row$IsolationWidth / 2,
                         row$CollisionEnergy))
          )
}
spd[5452:5462, ] # Now it has the precursorMz, windows and energy

It's just a sketch, but I think we could implement something like this to incorporate these values into spectraData: what do you think? How should we go about it? 👍

@jorainer
Member

Hey @RogerGinBer, I think it would be great if you could implement something like this. The question is where and how it would make the most sense. One possibility would be to extract these data already during backendInitialize, another to extract them on-the-fly whenever they are requested (e.g. for spectraData or precursorMz etc).

I would suggest to implement a simple function that takes whatever data required (as low level as possible) and returns a data.frame with the additional information. PR welcome :)
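
Such a low-level function could look roughly like the sketch below. This is an illustration only: `precursor_info()` is a hypothetical name, the inputs are plain data.frames with mock values (no raw file is read), and it assumes that IsolationWidth is the full window width and that the scan number range is inclusive at both ends. Both assumptions should be checked against the TDF format documentation.

```r
# Hypothetical helper (not part of MsBackendTimsTof):
# `scans`: data.frame with columns frameId and scanIndex, one row per spectrum.
# `msms`:  data.frame shaped like the PasefFrameMsMsInfo table.
# Returns a data.frame with one row per row of `scans`; NA where no match.
precursor_info <- function(scans, msms) {
    na <- rep(NA_real_, nrow(scans))
    res <- data.frame(precursorMz = na,
                      isolationWindowTargetMz = na,
                      isolationWindowLowerMz = na,
                      isolationWindowUpperMz = na,
                      collisionEnergy = na)
    for (i in seq_len(nrow(msms))) {
        idx <- which(scans$frameId == msms$Frame[i] &
                     scans$scanIndex >= msms$ScanNumBegin[i] &
                     scans$scanIndex <= msms$ScanNumEnd[i])
        half <- msms$IsolationWidth[i] / 2   # assumption: full width stored
        res$precursorMz[idx] <- msms$IsolationMz[i]
        res$isolationWindowTargetMz[idx] <- msms$IsolationMz[i]
        res$isolationWindowLowerMz[idx] <- msms$IsolationMz[i] - half
        res$isolationWindowUpperMz[idx] <- msms$IsolationMz[i] + half
        res$collisionEnergy[idx] <- msms$CollisionEnergy[i]
    }
    res
}

# Mock input: one MS1 scan (frame 1) and three scans in frame 2.
scans <- data.frame(frameId = c(1L, 2L, 2L, 2L),
                    scanIndex = c(10L, 100L, 110L, 400L))
msms <- data.frame(Frame = 2L, ScanNumBegin = 100L, ScanNumEnd = 125L,
                   IsolationMz = 500.25, IsolationWidth = 2,
                   CollisionEnergy = 30)
res <- precursor_info(scans, msms)
res
```

Because the function only depends on two data.frames, it could be called either from backendInitialize or on-the-fly from the accessor methods.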

@jorainer
Member

Hm, maybe I then need to change the backend to extend MsBackendCached, to allow caching of some values within the backend instead of always reading everything from the raw data files...

@RogerGinBer

Sounds good, I'll create this function and add it to the .spectra_data internal function, as well as creating the methods for the individual variables (precursorMz, isolationWindowTargetMz, etc) 👍
Regarding caching, perhaps we should run some performance tests first to see whether it's actually worth it or not (balance speed vs complexity)
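
A toy sketch of how such a comparison could be set up is shown below. Nothing here touches the real backend: `read_precursor_mz()` is a made-up stand-in whose cost is simulated with `Sys.sleep()`, and the cache is a plain environment rather than MsBackendCached, so the timings only illustrate the method, not actual performance.

```r
# Mock stand-in for an expensive read from the raw data file.
read_precursor_mz <- function(n) {
    Sys.sleep(0.05)                 # simulated I/O cost, not a measurement
    seq(400, 800, length.out = n)
}

# Simple cache: the first request reads the values, later ones reuse them.
cache <- new.env()
cached_precursor_mz <- function(n) {
    key <- as.character(n)
    if (is.null(cache[[key]]))
        cache[[key]] <- read_precursor_mz(n)
    cache[[key]]
}

# Time five repeated requests with and without the cache.
t_uncached <- system.time(
    for (i in 1:5) read_precursor_mz(1000))[["elapsed"]]
t_cached <- system.time(
    for (i in 1:5) cached_precursor_mz(1000))[["elapsed"]]
c(uncached = t_uncached, cached = t_cached)
```

With real data the same pattern would wrap the actual accessor calls, which shows whether repeated access is frequent and slow enough to justify the added complexity.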

@chufz
Collaborator Author

chufz commented Apr 21, 2024

Hi, this was a while ago, but have the changes been implemented yet? So far I have worked with opentimsr directly, but it would be nice to use MsBackendTimsTof for some implementations.

@RogerGinBer

Hi @chufz, last year I started working on this but at some point I became busy with other work and left the PR unfinished.
If I'm not mistaken, most of the implementation was already done (at least for PASEF DDA data), so let me see if I can close that PR soon enough 👍

@chufz
Collaborator Author

chufz commented May 17, 2024

So I tried the code from your repository. While the test data works fine, for real-world data from my instrument I get the following error:

Error in reducer$value.cache[[as.character(idx)]] <- values :
  wrong args for environment subassignment
In addition: Warning message:
In parallel::mccollect(wait = FALSE, timeout = 1) :
  1 parallel job did not deliver a result

@RogerGinBer

That's interesting, looks like something went quite wrong there and the exception wasn't well captured, so the error message is rather generic

@chufz, could you provide an example of what you did there? Which function produced this error? Was it when initializing the MsBackendTimsTof object or when accessing all/some MS/MS info?

If it's possible for you, could you share the data or a subset of it with me (over the metaRbolomics Slack or any other medium)? I'd be happy to take a look and see what the problem might be

@chufz
Collaborator Author

chufz commented May 17, 2024

Just simply

sps <- Spectra(file, source = MsBackendTimsTof())

I will upload the data to a cloud server and send you the link over slack:)
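
One thing that might help narrow this down: the `mccollect` warning suggests that a forked worker died, and the `reducer` error appears to come from BiocParallel's result collection rather than from the code that actually failed. Re-running serially may surface the original error. This is only a sketch and assumes the backend picks up the default registered BiocParallel backend.

```r
library(BiocParallel)

# Make serial execution the default so that errors are raised in the main
# R session instead of being lost in a forked worker.
register(SerialParam())
bpparam()  # should now report a SerialParam

# Then re-run the failing call; `file` is the path to the .d folder.
# sps <- Spectra(file, source = MsBackendTimsTof())
# traceback()
```

If the serial run fails as well, `traceback()` should point to the function that raised the original error.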
