Potentially fixing bug in triggerTimeAdjuster affecting the trace_start_time #704

Closed
wants to merge 2 commits

Conversation

fschlueter (Collaborator)

Also slightly refactoring the module

I have not tested it yet, but just by looking at it, it feels odd to set the trace_start_time to the trigger time when the purpose of the module is to place the trigger somewhere within the trace.

Potentially fixing bug in triggerTimeAdjuster affecting the trace_start_time. Also slightly refactoring the module
  trace = trace[cut_samples_beginning:(number_of_samples + cut_samples_beginning)]
  channel.set_trace(trace, channel.get_sampling_rate())
- channel.set_trace_start_time(trigger_time)
+ channel.set_trace_start_time(trigger_time - pre_trigger_time)
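For context, a minimal sketch of the invariant this one-line change establishes (the numeric values are made up; only the relationship matters):

    # Hypothetical values, for illustration only.
    trigger_time = 1000.0     # ns: absolute time at which the trigger fired
    pre_trigger_time = 250.0  # ns: amount of trace kept before the trigger

    # The old assignment claimed the trace starts exactly at the trigger time,
    # although the trigger is placed somewhere inside the trace. The corrected
    # assignment, as in the diff above:
    trace_start_time = trigger_time - pre_trigger_time

    # The trigger now lies pre_trigger_time after the start of the trace:
    assert trace_start_time + pre_trigger_time == trigger_time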
Collaborator Author

that should be the relevant change!

Collaborator

At first glance this looks like the right thing to do here, yes... but it doesn't solve our main issue with the RNO-G sims, where the pulses being triggered on may jump around within the clipped traces. The channel.set_trace call is still writing the identical clipped trace.

raise KeyError

pre_trigger_time = self.__pre_trigger_time[trigger_name][channel_id]
Collaborator

Looks like this refactor will fail... pre_trigger_time can be a dict keyed by trigger name, but it can also be a dict keyed by channel number, i.e. not necessarily a nested dict indexed with [trigger_name][channel_id].

I just would not improvify this block
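For illustration, a sketch of a lookup that tolerates both layouts (this helper is hypothetical, not the module's actual code):

    def lookup_pre_trigger_time(pre_trigger_time, trigger_name, channel_id):
        # pre_trigger_time may be nested ({trigger_name: {channel_id: t}}) or
        # flat ({channel_id: t}); a blind [trigger_name][channel_id] lookup
        # raises a KeyError for the flat layout.
        if trigger_name in pre_trigger_time:
            value = pre_trigger_time[trigger_name]
            if isinstance(value, dict):  # nested: per-trigger, per-channel
                return value[channel_id]
            return value  # per-trigger scalar
        return pre_trigger_time[channel_id]  # flat: keyed by channel id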

Collaborator Author

Ouch, okay.

@shallmann (Collaborator) left a comment

Looks good now

@sjoerd-bouma (Collaborator) left a comment

See discussion in comment and in #526 previously. Also @cg-laser may want to weigh in.

Comment on lines -180 to +200
- channel.set_trace_start_time(channel.get_trace_start_time()-pre_trigger_time[channel.get_id()])
+ channel.set_trace_start_time(channel.get_trace_start_time() - pre_trigger_time[channel.get_id()])
Collaborator

See also the discussion in #526 - @cg-laser wanted the traces from simulation to all start at the same time, and store the pre_trigger_times in the Trigger object, so in an analysis one has to run the triggerTimeAdjuster with mode="data_to_sim" to get the correct (adjusted) trace start times, analogously to how the cableDelayAdder works.

I actually argued at the time for your implementation, but had no supporters - happy to reopen the argument, but if we stick with your implementation we should remove the "data_to_sim" option.

@fschlueter (Collaborator Author)

Hi @sjoerd-bouma, thanks for bringing that up (and apologies for not having weighed in earlier). So let me first understand:
Currently (i.e. without this PR) the correct sequence would be to run the module twice:

    # Adjust the trace length to a detector-specific number of samples, moving the
    # trigger time to the desired position t0 but keeping the trace_start_time at t_trig.
    triggerTimeAdjuster.run(evt, station, det)
    # Apply the correct value to trace_start_time
    triggerTimeAdjuster.run(evt, station, det, mode="sim_to_data")

Is that correct? I think @cg-laser summarised his intention with the following post:

> It boils down to the question of how raw-data-like we want to be able to simulate. This approach implies that we can only simulate "calibrated data" where trigger time offsets were already adjusted for.
> I think we should allow for the option to eventually simulate raw data. Then we need to store this information somewhere else and correct for it when reconstructing the data.

I would argue that one can still "simulate raw data" by omitting this module (or by modifying the config). After all, this module already imprints an arbitrary detector feature via the number of samples.

If I understand the module correctly, it leaves the simulation in an inconsistent (somewhat undefined) state, which is dangerous. Needing to run the module twice to achieve what is needed to simulate a normal detector is awkward and non-intuitive. I think the guideline should be to allow for a correct detector simulation by default and to accommodate further use cases via reconfiguration (for experts).

But maybe I misunderstood something?

@sjoerd-bouma (Collaborator)

> Currently (i.e. without this PR) the correct sequence would be to run the module twice: [...] Is that correct? [...] But maybe I misunderstood something?

If you omit this module, traces will have arbitrary lengths, and all start at the same time - not exactly similar to raw data. The 'correct' usage right now would be

### SIMULATION
(...) # run the simulation, triggers, etc.
triggerTimeAdjuster.run(event, station, det, pre_trigger_time) # where pre_trigger_time contains the readout delay settings for each trigger you want to simulate

This results in traces with equal start times and the correct trace length, reflecting the data as it would be recorded by a detector with the given cable delays and readout windows.

Then, in an analysis (e.g. reconstruction), you would use

### ANALYSIS
event = reader.get_event(...)
triggerTimeAdjuster.run(event, station, det, mode='data_to_sim')
cableDelayAdder.run(event, station, det, mode='subtract')

(...) # the analysis you want to run

This should work for both real and simulated data and adjust the trace_start_time appropriately.

For what it's worth, seeing as the readout delays are stored in the data anyway, I argued at the time that accounting for readout delays should just be done by the reader module. In that case you would go with your implementation and remove the need to run the triggerTimeAdjuster at the start of each analysis. But Christian objected to this (and no-one else voiced an opinion) so in the end we went with the current implementation.

@fschlueter (Collaborator Author)

Ah, by simulating "raw data" you are referring to simulating what a detector would actually record and write to disk? I agree that this is something we want! But our detector records traces (for a given trigger) at different points in time, i.e., we have (or can implement) readout delays. I thought this module was also supposed to accommodate/simulate this behaviour?

But still, the module right now does something wrong: it sets the trace start time to the trigger time while the trigger time is not at the beginning of the trace, right?

@sjoerd-bouma (Collaborator)

> Ah, by simulating "raw data" you are referring to simulating what a detector would actually record and write to disk? [...]
>
> But still, the module right now does something wrong: it sets the trace start time to the trigger time while the trigger time is not at the beginning of the trace, right?

I think you may be confusing two things here. The first is what is actually contained in the stored trace. This is adjusted for correctly by the triggerTimeAdjuster in the simulation (I hope - I have run a simulation with it before that seemed to work as expected, but it's been a while). So if you have different readout windows for different channels, the cut traces in the event after running the triggerTimeAdjuster should correspond to the correct windows. (There are a couple of subtleties with respect to how this is implemented in the real RNO-G detector versus in this module: first, in a real detector you implement readout delays, but here they are specified, for historic reasons, as 'pre_trigger_times' instead. Second, there is the actual trigger delay, i.e. the delay between a trigger condition being satisfied and the trigger decision being registered by the DAQ, which is not implemented in NuRadioMC.)

The second thing is the trace_start_time, i.e. the time at which the channel 'thinks' the trace was recorded. This is 'wrong' in the simulation on purpose; the idea is that 'raw data' is synchronous (all traces start at 0), and the true trace start times need to be corrected for readout windows and cable delays in analysis. But this is just bookkeeping - the voltage traces themselves are not rolled or adjusted in this step.
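To make the bookkeeping concrete, a minimal sketch (the helper and the per-channel pre_trigger_time layout are assumptions; the sign mirrors the diff hunk quoted above):

    def restore_trace_start_times(station, pre_trigger_time):
        # Bookkeeping only: shift each channel's trace_start_time by its
        # readout delay; the voltage samples are never rolled or re-cut.
        for channel in station.iter_channels():
            delay = pre_trigger_time[channel.get_id()]
            channel.set_trace_start_time(channel.get_trace_start_time() - delay)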

@fschlueter (Collaborator Author)

Okay, after a short call with Sjoerd I think I understand better now. The idea is that when we read out data that were recorded with readout delays, we have to subtract those delays to be able to reconstruct the events, and setting the trace_start_time to the trigger time means the same correction applies to simulations. However, I agree with Sjoerd that ideally this correction on data is done in an RNO-G reader, and then there is no need to perform this step for either simulations or data.

However, if we want to keep it that way, I would argue that we should set the trace_start_time to None, to avoid people missing this step.
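A sketch of that suggestion (hypothetical; passing None to set_trace_start_time is not current NuRadioReco behaviour):

    # Leave the start time explicitly undefined after cutting the trace:
    channel.set_trace_start_time(None)

    # Downstream code that forgets the adjustment step then fails loudly
    # instead of silently using a wrong time:
    t0 = channel.get_trace_start_time()
    if t0 is None:
        raise ValueError("trace_start_time is unset - run the adjustment step first")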

@fschlueter (Collaborator Author)

The argument for correcting the different readout windows in a reader (and for omitting this step explicitly for simulations) is that the information about the readout windows is always there and is (to first order) 100% correct.
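A sketch of that reader-side approach (the function name, the delay layout, and the get_first_trigger accessor are assumptions): the reader applies the stored delays once at read time, so analyses never need an extra module run.

    def apply_readout_delays_on_read(station, readout_delays):
        # Hypothetical reader-side correction: the readout-window information
        # is stored with the data, so subtract each channel's delay at read
        # time; which delays apply may depend on the trigger that fired.
        trigger_name = station.get_first_trigger().get_name()  # assumed accessor
        for channel in station.iter_channels():
            delay = readout_delays[trigger_name][channel.get_id()]
            channel.set_trace_start_time(channel.get_trace_start_time() - delay)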

@sjoerd-bouma (Collaborator)

> However, if we want to keep it that way, I would argue that we should set the trace_start_time to None, to avoid people missing this step.

🔥
I like this suggestion! It will definitely break a lot of things, but I think that's the best way to make sure people change their ways.

@fschlueter (Collaborator Author)

Update: the current RNO-G reader is setting the trace_start_time according to the trigger type, so it is already doing something (although we should review it and possibly extend it).

@cg-laser what is your take on my proposals?

@fschlueter (Collaborator Author)

@cg-laser can we conclude on this after yesterday's discussion?

@fschlueter (Collaborator Author)

This module is being deprecated with #763 and replaced by a module which sets the trace_start_time. Data I/O modules will need to set it as well.

@fschlueter fschlueter closed this Nov 26, 2024
@fschlueter fschlueter deleted the fix_triggerTimeAdjuster branch December 5, 2024 11:05