
Add a way to validate output histograms against a trusted reference #136

Closed
eguiraud opened this issue May 16, 2023 · 1 comment · Fixed by #149

@eguiraud (Contributor):

To a first approximation, it would be enough to publish the reference histogram contents in any easily readable format (even JSON, or serialized dictionaries of NumPy arrays).
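As a minimal sketch of the "serialized dictionaries of NumPy arrays" option, the helpers below round-trip a dict of histogram names to bin values through JSON. The histogram name used here is hypothetical, purely for illustration:

```python
import json

import numpy as np


def histograms_to_json(hists: dict) -> str:
    """Serialize a dict of histogram name -> array of bin values to JSON."""
    return json.dumps({name: np.asarray(values).tolist() for name, values in hists.items()}, indent=2)


def histograms_from_json(text: str) -> dict:
    """Inverse: parse the JSON back into a dict of NumPy arrays."""
    return {name: np.asarray(values) for name, values in json.loads(text).items()}


# hypothetical histogram name and bin contents
reference = {"4j1b_ttbar": np.array([10.5, 22.0, 7.25])}
roundtrip = histograms_from_json(histograms_to_json(reference))
```

Plain JSON keeps the reference human-readable and diff-able in version control, at the cost of some file size compared to binary formats.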

A simple CLI tool (or similar) that compares two such JSON files could be provided on top.

Complications:

  • one systematic variation depends on random numbers, so its bin values won't be bit-for-bit stable (especially with multi-threaded, out-of-order execution): I think it's ok to clearly mark it as such and only expect very approximate equality there
  • bin values depend on how many files per process are used: I think we should provide reference values for at least 1 file per process (for quick validation) and for the full dataset (for full validation)
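One way the comparison tool could handle the complications above is to apply a loose relative tolerance only to histograms flagged as RNG-dependent and a tight one everywhere else. A sketch under those assumptions (the set of RNG-dependent names and the tolerance values are placeholders, not anything decided in this issue):

```python
import numpy as np

# hypothetical: names of histograms whose bin values depend on random numbers
RNG_DEPENDENT = {"weights_seed_variation"}


def compare(ref: dict, new: dict, rtol: float = 1e-6, loose_rtol: float = 0.1) -> list:
    """Compare two name -> bin-values mappings; return a list of mismatch
    descriptions (empty list means the histograms agree within tolerance)."""
    problems = []
    for name in sorted(set(ref) | set(new)):
        if name not in ref or name not in new:
            problems.append(f"{name}: present in only one file")
            continue
        tol = loose_rtol if name in RNG_DEPENDENT else rtol
        a, b = np.asarray(ref[name], dtype=float), np.asarray(new[name], dtype=float)
        if a.shape != b.shape or not np.allclose(a, b, rtol=tol):
            problems.append(f"{name}: bin values differ beyond rtol={tol}")
    return problems
```

Wiring `compare` to two `json.load`ed files behind an `argparse` entry point would give the simple CLI tool described above; exiting non-zero when the list is non-empty makes it usable in CI.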
@alexander-held (Member):

As different implementations may run into small discrepancies, I want to link #163 (comment) here for future reference. Events can migrate across bin edges, which (depending on the event weight) can result in a somewhat large absolute change in event yields. I don't think there is anything we can do about this, and at the level we observed it, it is not a concern for physics anyway.
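The bin-migration effect is easy to reproduce: a tiny numerical difference between implementations can push an event to the other side of a bin edge, and the per-bin yield then shifts by the event's full weight even though the total is unchanged. A minimal illustration with made-up numbers:

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0])
weight = 50.0

# the "same" event, reconstructed with a tiny numerical difference,
# lands on either side of the bin edge at 1.0
h_a, _ = np.histogram([1.0 - 1e-9], bins=edges, weights=[weight])
h_b, _ = np.histogram([1.0 + 1e-9], bins=edges, weights=[weight])

# per-bin yields differ by the full event weight; the sum is identical
```

This is why a per-bin comparison may need to tolerate occasional large absolute differences even when the implementations are both correct.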
