Dear all,
while I currently have other priorities in my life than R3BRoot, I could not help but overhear a few of my colleagues working with TofD Cal2Hit and I have a few suggestions to make.
Background: TofD is one of the detectors that are harder to calibrate. While the gamma calibration of CALIFA will fit comfortably on a beer mat, the TofD calibration involves a lot of different stages. This is just a fact of life: some detectors are complex to calibrate, some are easy.
R3BRoot code to obtain calibration parameters is -- in my experience -- not remotely as well structured as it should be.
For CALIFA, this is not a big problem, because the PhD student who has to calibrate it can either figure out how to run it (it is a single-stage process) or just rewrite it from scratch -- find two peaks, fit two Gaussians, write a linear interpolation (been there, done that).
For TofD, there is the fine time calibration, the time offset calibration (with or without LOS), determining v_eff (to get the position from time), the ToT calibration, determining the attenuation (to get the position from ToT) and finally doing the smiley correction to get the charge calibration correct. (Or so I figure, I am not a TofD expert.) In the best case, this will require four passes and three different runs, I guess.
I suggest that the collaboration decide to rewrite the code for obtaining the TofD calibration (details below). This has the potential to halve the time a new PhD student requires to obtain a calibration.
(From my understanding, both NeuLAND and TofD use TAMEX cards and their fine time calibrations are equivalent, so I am puzzled why we would have two Tasks to read them from UCESB and two Tasks to apply the fine time calibration, instead of just single classes TAMEXSource and TAMEXMapped2Cal -- but that is not my main point, which is the Cal2Hit.)
My proposal is the following:
The class R3BTofDHitData is expanded. It shall contain:
- all the data used for physics analysis,
- all the intermediate steps calculated during calibration, clearly labeled as such.
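A minimal sketch, in Python for brevity, of what such an expanded hit structure could look like. All member names below are invented for illustration, not the actual R3BTofDHitData fields:

```python
import math
from dataclasses import dataclass

# Hypothetical sketch: final physics quantities plus calibration
# intermediates, the latter clearly labeled by a cal_ prefix.
# Everything defaults to NaN, i.e. "not yet calibrated".
@dataclass
class TofDHitData:
    # data used for physics analysis
    x: float = math.nan
    y: float = math.nan
    time: float = math.nan
    charge: float = math.nan
    # intermediate calibration steps, labeled as such
    cal_raw_tdiff: float = math.nan      # raw top-bottom time difference
    cal_tdiff_comp: float = math.nan     # offset-compensated time difference
    cal_pos_from_time: float = math.nan  # position via effective speed of light
    cal_pos_from_tot: float = math.nan   # position via ToT attenuation

hit = TofDHitData(cal_raw_tdiff=1.25)
# As long as no calibration parameters exist, downstream values stay NaN:
assert math.isnan(hit.cal_pos_from_time)
```

The NaN defaults tie in with the calibration bootstrap below: uncalibrated quantities are visibly absent rather than silently wrong.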
One class (or a python script working on the TTree) will generate all the plots required to obtain a calibration [1]: starting from the raw time difference between top and bottom PMT, going on to the offset-compensated time difference (which should be zero for the y=0 sweep), and the position calculated from the effective speed of light. The same applies to all the steps of the ToT-based position.
To obtain the calibration, start with a calibration parameter set where all the parameters are set to NaN (Not A Number). This means that all the plots which actually use parameters will be empty.
A pyROOT script (or macro, if you really must) shall read the histograms produced from a y=0 sweep, obtain the relevant calibration data from them (such as the top-bottom time offset), and write ("encrypt") them to FairParams.
After the first stage of parameters has been obtained, they can be applied to a suitable run (perhaps a y-offset sweep), resulting in more histograms being filled.
The next pyROOT script shall read in these histograms and the previously obtained parameters, calculate the next parameters (v_eff), and write the combined parameter set out again.
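The staged bootstrap described above can be sketched in a few lines of plain Python. The parameter names (tdiff_offset, v_eff) and the JSON round-trip standing in for FairParams are my invention; the point is that NaN parameters make not-yet-available plots empty instead of wrong:

```python
import json
import math
import statistics

# Stage 0: a parameter set where everything is NaN ("not yet calibrated").
params = {"tdiff_offset": math.nan, "v_eff": math.nan}

def compensated_tdiff(raw_tdiff, p):
    return raw_tdiff - p["tdiff_offset"]  # NaN until stage 1 has run

def position(raw_tdiff, p):
    # NaN until stage 2 has determined v_eff
    return 0.5 * p["v_eff"] * compensated_tdiff(raw_tdiff, p)

# "y=0 sweep": raw time differences whose compensated mean should be zero.
y0_sweep = [2.02, 1.98, 2.01, 1.99]
assert math.isnan(compensated_tdiff(y0_sweep[0], params))  # plots stay empty

# Stage 1 script: fit the offset from the y=0 data, write parameters out.
params["tdiff_offset"] = statistics.mean(y0_sweep)
param_file = json.dumps(params)  # stand-in for the FairParams round trip

# Stage 2 script: read parameters back in; compensated plots now fill,
# while position plots remain empty until v_eff is determined.
params2 = json.loads(param_file)
assert abs(compensated_tdiff(2.00, params2)) < 0.1
assert math.isnan(position(2.00, params2))
```

Each stage is a small, separately runnable script that reads the previous parameter file and writes an extended one.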
As an alternative to one big piece of code generating all the histograms, suitable histograms could also be generated from the tree by various pyROOT scripts right before the fitting stages where they are required, instead of all in one place. Advantage: large hist-filling classes are commonly utterly unreadable. Disadvantages: more wrapper code duplication (though that could be outsourced to another file), and no access to histograms of previous stages during the run of the current stage.
Any replacement should accomplish at least the following:
Separation of histogramming and fitting. I have said that before, but let me reiterate: creating histograms costs CPU time, fitting (if done responsibly, i.e. checking chi-squared) costs brain time. If you put both in the same task, every change to the fitting will result in the computer re-histogramming everything, during which time the user will be idle.
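To make the split concrete, here is a toy sketch in plain Python (a dict of bin counts stands in for a ROOT TH1, and the "fit" is just a peak search): the expensive pass over all events happens once and is saved; the fit script can then be re-run as often as the chi-squared demands:

```python
import json

def fill_histogram(events, lo=0.0, hi=10.0, nbins=20):
    # The CPU-time part: one pass over all events, done exactly once.
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in events:
        i = int((v - lo) / width)
        if 0 <= i < nbins:
            counts[i] += 1
    return {"lo": lo, "hi": hi, "counts": counts}

def fit_peak(hist):
    # The brain-time part: re-runnable instantly from the saved histogram,
    # without touching the event data again.
    counts = hist["counts"]
    width = (hist["hi"] - hist["lo"]) / len(counts)
    imax = counts.index(max(counts))
    return hist["lo"] + (imax + 0.5) * width

events = [5.1, 4.9, 5.0, 5.2, 1.0, 9.0, 5.05]
saved = json.dumps(fill_histogram(events))  # "write histograms to file"
peak = fit_peak(json.loads(saved))          # iterate on this part only
assert abs(peak - 5.0) < 0.5
```

Changing the peak-finding logic now costs seconds, not another pass over the run.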
No code duplication for the calibration application itself. Currently, the calibration is implemented twice: once in Cal2Hit and once in Cal2HitPar. It is a lengthy procedure, and if the two implementations ever diverge, the user will have great fun (not) figuring out why. Instead, a partial application of the calibration should be used to obtain further calibration parameters.
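"Partial application" here can be as simple as running the one and only calibration function with the later-stage parameters still set to NaN. A sketch under that assumption (the attenuation formula and parameter names are purely illustrative, not the R3BRoot arithmetic):

```python
import math

def calibrate(raw_tdiff, raw_tot, p):
    # The single implementation of the calibration chain.
    tdiff = raw_tdiff - p["tdiff_offset"]
    pos_t = 0.5 * p["v_eff"] * tdiff
    tot = raw_tot * math.exp(p["atten"] * pos_t)  # toy attenuation model
    return {"tdiff": tdiff, "pos_t": pos_t, "tot": tot}

full = {"tdiff_offset": 2.0, "v_eff": 16.0, "atten": 0.01}
partial = {"tdiff_offset": 2.0, "v_eff": math.nan, "atten": math.nan}

hit = calibrate(2.5, 30.0, full)       # Cal2Hit: everything calibrated
stage1 = calibrate(2.5, 30.0, partial) # parameter finding: same code path
assert hit["tdiff"] == stage1["tdiff"] == 0.5
assert math.isnan(stage1["pos_t"])     # later stages simply stay NaN
```

There is then no second copy of the arithmetic that could silently drift apart from the first.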
Separation of the different parts of the calibration. From my understanding, @michael-heil was originally told by the maintainers that the calibration parameter finding should be accomplished by a single FairTask, and thus he introduced fParameter to track the stage the calibration is in. If instead we split the fitting into a lot of small scripts, this will result in much more readable code.
Outcome compatibility with Michael's version. He is our detector expert. If you want to do things differently from him, at least get his blessing.
No stage-dependent interpretation of histograms. Setting the time offset calibration parameters to zero to figure out what they should be feels like a neat hack, but it will be confusing. (Part of the problem is that generating a lot of histograms in ROOT is no fun, see footnote [1], so people are tempted to reuse the histograms they already have.)
A calibration example. "Here is my y=0 sweep, my offset sweep and the meander run (at cal level as a tree -- most people don't want to run unpackers again). Here are the command lines I executed, in order. Here is a run with a target in, this is how the different charges look."
Ideally, the various scripts should be written such that they do not have to be modified. Instead, the things that are likely to change (beam charge, file names of the various runs used for calibration, fit ranges, etc.) should be set in a configuration file (e.g. YAML or something else that is not Turing-complete!). This means that at the end of the day, the user can just run git diff, and the changes to the scripts will mostly be improvements which could be turned into a pull request. Of course, like any code doing heavy lifting, these scripts would belong in the main R3BRoot repository.
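A sketch of what that separation could look like. JSON is used here only as a stdlib stand-in for YAML (both are declarative, not Turing-complete); all keys and file names below are invented for illustration:

```python
import json

# Everything a user would otherwise edit inside the scripts lives in one
# declarative config file, versioned separately from the code.
config_text = """
{
  "beam_charge": 50,
  "runs": {
    "y0_sweep": "run_0123_cal.root",
    "offset_sweep": "run_0124_cal.root",
    "meander": "run_0125_cal.root"
  },
  "fit_ranges": {
    "tdiff": [-5.0, 5.0],
    "tot": [0.0, 400.0]
  }
}
"""

cfg = json.loads(config_text)
assert cfg["beam_charge"] == 50
assert cfg["fit_ranges"]["tdiff"] == [-5.0, 5.0]
# The scripts read cfg[...] instead of hard-coded values, so a git diff
# on the scripts themselves shows only genuine improvements.
```

With this split, adapting the calibration to a new experiment means editing one config file, not grepping through every script.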
CCFPED (Ceterum censeo FairParam esse delendam -- furthermore, I consider that FairParam must be abolished).
Footnotes
1. Filling a large number of different-quantity histograms is commonly messy. In place A, you declare all the pointers; in place B, you instantiate all the THxY classes in a mess of for loops and Format() macros; in place C, you actually fill data into the histograms; in place D, you write them all out to file; and (optionally, for all I care) in place E, you delete them again. I once had proof-of-concept code where you instantiated one wrapper class per histogram type, specifying which quantities you want to plot (via one to three variadic lambdas) and in what ranges. Some clever code would figure out over which sets of data to iterate based on the types of the Data pointers used in the lambda expressions for the various axes, and you could specify based on which values stacks of histograms should be created. As all the lambdas and ranges were passed as template parameters, they could be inlined. I think I accepted the cost of one virtual call per histogram type and event and put the histogram types in a vector rather than a stack of types [2]. Most fun I had with C++. Of course, after Remake of Califa Clustering #771, I saw that I had irreconcilable differences with the R3BRoot maintainers and transitioned to the main DAQ.
2. There would still be a cost for looping through the histograms once per histogram type, and for having to calculate quantities (such as WRTSCalifa-WRTSMain) repeatedly. Of course, runtime per event is rather irrelevant in any case: if a new PhD student has to calibrate the TofD, the computation costs are a rounding error compared to their salary.