Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PSA: Califa WRTS, Overflows #705

Open
klenze opened this issue Nov 28, 2022 · 13 comments
Open

PSA: Califa WRTS, Overflows #705

klenze opened this issue Nov 28, 2022 · 13 comments
Labels

Comments

@klenze
Copy link
Contributor

klenze commented Nov 28, 2022

Dear all,
it has come to my attention that the old unpacking code (FEBEX3_CALIFA_BASE() in the spec file) does not only not encode the individual hits WRTS, but also does not include the overflow flags. This will mean that overflows will not get invalidated in R3BCalifaMapped2CrystalCal (to ieee 754 NAN, possibly INFINITY in the future) but erroneously get propagated as valid data.

The new code can be detected in /u/land/fake_cvmfs/9.13/upexps:

$  grep ../califa/califa.spec */*.spec   | sed s@/.*@@ | uniq
202202_para
202204_tests
202205_s509
202205_s522
califa_only

The old version can be detected using:

$ grep ^FEBEX3_CALIFA_BASE'()' */*.spec  | sed s@/.*@@ | uniq
201810_s444
201902_s444
201902_s473
201904_s454
201911_eng
201911_eng2
202002_s444
202002_s467
202004_s444
202006_s444
202006_test
202103_s455
202104_s515
202105_s494
califa2021 # false positive
califaKrakow17
califalisbon16

To get the overflow flags, we should convert the unpackers to use the new ../califa/califa.spec instead. This will require new mapping files for all of them. (Technically, we could also fix the old version, but even that will require adding CALIFA_OV_%d to the mapping_CALIFA.hh)

Mapping files compatible with califa.spec can be generated from califa_mapping.txt by copying/cloning /u/land/klenze/r3bmap/ (or landgw01:git/r3bmap) and running the updated ./run/califa/mapping/ucesb_mapping_gen_CALIFA_TOT.py.

This does require the original califa_mapping.txt, which I do not generally have. (They are different for every experiment as they depend on the presumed gain settings.)

The replacement in the spec file should look like in s522:

...
#define CALIFA_DEFAULT_MAPPING 0
#include "../califa/califa.spec"
#include "mapping_CALIFA_s522.hh" // <--- this has to be generated per experiment
...
EVENT
{
...
        revisit califa_messel    = CALIFAM(type = 100, subtype = 10000, procid = 13, control = 90); // keep settings from old version
        revisit califa_wixhausen = CALIFAW(type = 100, subtype = 10000, procid = 13, control = 91); // same
...
}
...

For a version of the unpacker using califa.spec for s467, see /u/land/latar/upexps/202002_s467.

@inkdot7: Is there a way to query the h101 magic if a field which is supposed to be there was found in the data steam? In that case the FebexReader class could just refuse to work (unless manually overwritten) when the overflow field is missing).

@bl0x: Do you have any objections to updating the s467 unpacker to Leylas version?

@ryotani @jose-luis-rs @hapol @gabrigarjim @ajedele: Please propagate the information to your colleagues working with CALIFA data and coordinate on how to fix the unpackers which are still used for analysis.

Credits to Luke, Leyla and Tobias for helping with finding the issue and/or updating the unpacker.

Thanks,
Philipp

@inkdot7
Copy link
Contributor

inkdot7 commented Nov 28, 2022

Is there a way to query the h101 magic if a field which is supposed to be there was found in the data steam? In that case the FebexReader class could just refuse to work (unless manually overwritten) when the overflow field is missing).

That should be doable with ext_data_struct_info_map_success():

https://git.chalmers.se/expsubphys/ucesb/-/blob/master/hbook/ext_data_client.h#L233

Also note the function ext_data_struct_info_print_map_success() below, https://git.chalmers.se/expsubphys/ucesb/-/blob/master/hbook/ext_data_client.h#L262 which may be handy to use first to see if this at all seems to work.

@klenze
Copy link
Contributor Author

klenze commented Nov 29, 2022

@inkdot7, thanks. I first had some null pointer dereferences when trying to run this function immediately after running EXT_STR_h101_CALIFA_ITEMS_INFO, but eventually figured out that it has to be run later. Now I am checking in the first event if the relevant field was found and assert(0) otherwise.

For future generations, let me also state that I vehemently disagree with this issue being described as an enhancement. Using a different color palette is an enhancement. Not treating invalid data as valid data is not an optional nice-to-have thing, it should be rather essential for anyone who gives a damn about the validity of their results.

Otherwise, I am done with this bug. As of 2022-05, I am not employed by R3B, so I consider any effort I put into PRs or R3BRoot-related talks instead of my thesis to be charity work. And even for obvious charity cases like this, I will limit my involvement.

Feel free to close this bug if the PR is merged and any unpackers still in use are fixed, or leave it open and let the tumbleweeds gather for years.

I wash my hands of this. Merge the PR or not as thou wilt.

@inkdot7
Copy link
Contributor

inkdot7 commented Nov 30, 2022

Sorry about the NULL pointer access in ext_data_struct_info_map_success(). This was presumably a use before ext_data_setup() which should now be caught with a return -1 and EINVAL instead.

However, it is strange that this issue was not caught by the check after ext_data_setup() in

#ifdef EXT_DATA_ITEM_MAP_MATCH
/*
* It is not needed, that *all* items are matched.
* However, mapping should fail, if items are requested that don't exist
* on the server, or if items are requested with wrong parameters.
* See ucesb/hbook/ext_data_client.h for more information.
*/
uint32_t map_ok = EXT_DATA_ITEM_MAP_OK | EXT_DATA_ITEM_MAP_NO_DEST;
if (struct_map_success & ~(map_ok))
{
R3BLOG(WARNING, "ext_data_clnt::setup() failed");
ext_data_struct_info_print_map_success(fStructInfo, stderr, map_ok);
return kFALSE;
}
#endif

Due to the #ifdef EXT_DATA_ITEM_MAP_MATCH in this region of the code? Please remove all those #ifdef, the struct_map_success return value has been available in UCESB since May 2020, it is years overdue.

@jose-luis-rs Please reclassify as a bug. Data loss is a bug. That the defensive code in R3BUcesbSource.cxx has not caught it make it a critical bug.

@jose-luis-rs
Copy link
Contributor

@inkdot7 Done!

@klenze
Copy link
Contributor Author

klenze commented Dec 1, 2022

However, it is strange that this issue was not caught by the check after ext_data_setup() in

I just checked, the if is executed and the relevant items are listed as (0x04) not found.

The problem is more that this is one out of three warning messages (the others are irrelevant for me). That plus 3 error messages, 25 lines of info messages and another 25 of untagged lines means that it can be kind of lost in the noise.

And some items being missing are in fact totally okay. If the unpacker does not provide per hit califa (interpolated) wrts, one can still do a good analysis (unless beam conditions were challenging). The problem is that someone who is not a califa expert will not know that one of the lines in

CALIFA_WRTS_T1                           312044   (0x04) not found
CALIFA_OV                                273128   (0x04) not found

is harmless and the other one will make the QPID spectra look horrible.

@inkdot7
Copy link
Contributor

inkdot7 commented Dec 1, 2022

I just checked, the if is executed and the relevant items are listed as (0x04) not found.

The problem is more that this is one out of three warning messages (the others are irrelevant for me). That plus 3 error messages, 25 lines of info messages and another 25 of untagged lines means that it can be kind of lost in the noise.

What then is surprising is that the program at all continues after the return kFALSE;... Is the return from InitUnpackers() ignored?

Where is that called? I find no call to InitUnpackers() anywhere under R3BRoot/. Is that in some copy-pasted boilerplate?

Then either some nasty (uncatchable) exception need to be thrown, or just exit(1) comes to mind.

@klenze
Copy link
Contributor Author

klenze commented Dec 1, 2022

From gdb, it looks likt it is called here: https://github.com/FairRootGroup/FairRoot/blob/dev/online/steer/FairRunOnline.cxx#L192

(Don't ask me what onlineness has to do with it.)

My point was that different fields have different urgencies. Sometimes it is useful to be able to add new unpacker fields to the reader without breaking old unpackers, so dying if any mismatches are found is probably not feasible. (Also, array sizes differ between versions of the califa unpackers for some reason, I think.)

For CALIFA_OV, no data is not acceptable, so we should halt and catch fire in that case.

@inkdot7
Copy link
Contributor

inkdot7 commented Dec 1, 2022

The EXT_DATA_ITEM_MAP_NO_DEST flag for map_ok tells that it is ok to have additional fields delivered from the unpacker, which are not requested at all.

Requested fields should be mandatory, unless those that are explicitly considered optional.
One way of doing that would be for various detectors to add 'optional' variable names to some list. Then iterate with ext_data_struct_info_map_success() e.g. as in #709 and do the halt and catch fire for any unexpected mapping failure.

Until fairroot is repaired (see issue just above), I would suggest to use a firm exit() approach.

@inkdot7
Copy link
Contributor

inkdot7 commented Dec 12, 2022

As I am not using or running r3broot myself, could someone please check PR FairRootGroup/FairRoot#1307 which is a proposed solution to the ignored return issue FairRootGroup/FairRoot#1301 .

@klenze
Copy link
Contributor Author

klenze commented Feb 12, 2023

Fixed the califa specific issue in the R3Brc fork by cherry-picking #709.

Closing this issue here as WONTFIX. :-(

@klenze
Copy link
Contributor Author

klenze commented Feb 13, 2023

@jose-luis-rs: I closed the issue because from the lack of maintainer response, new mapping generated for the old unpackers and the refusal to even merge #709, which would at least make sure that users are aware of it I got the impression that "filtering out overflows during clustering" was the preferred method to work around this issue.

@klenze klenze reopened this Feb 13, 2023
@ryotani
Copy link
Contributor

ryotani commented Feb 13, 2023

@klenze Sorry for not explicitly replying to this issue, but as you may be aware at least the unpacker of 202002_s467 is (at least I recognise) working well to treat OV hits in CALIFA. I appreciate your support.

@jose-luis-rs
Copy link
Contributor

@klenze: Sorry, but I think that maintainers have more responsibilities, projects, teaching, ..., so please be patient while we finalize the tasks. Thanks in advance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants