
Using vessel log data in downstream dataset #1344

Open
leewujung opened this issue Jun 19, 2024 · 7 comments

@leewujung
Member

leewujung commented Jun 19, 2024

With #1318, the vessel log data would be stored in the Vendor_specific group. However, right now these data are not propagated down to the Sv and MVBS datasets when we need to use them.

In theory one can just add the vessel log distance/lat/lon when needed, directly from the EchoData object, but that requires very good provenance tracking so one can recover the EchoData object that was used for calibration.

I wonder if it would make sense to add an optional argument include_idx_data=True/False in compute_Sv so that these variables can be added to the Sv dataset as the last step before returning.

The same thing can be done with MVBS, but for that perhaps we should just propagate whatever is in the Sv dataset by default, except for the variables that are bin-averaged.
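For concreteness, a minimal sketch of what that last step inside compute_Sv could do with the proposed flag. This is hypothetical: include_idx_data does not exist in echopype today, and the Vendor_specific variable names are my assumption.

```python
# Hypothetical sketch only: include_idx_data is a proposed flag, not an existing
# echopype argument, and the Vendor_specific variable names are assumptions.
import echopype as ep

ed = ep.open_raw("transect.raw", sonar_model="EK60")  # placeholder file
ds_Sv = ep.calibrate.compute_Sv(ed)

# What compute_Sv(..., include_idx_data=True) could do right before returning:
for var in ["vessel_distance", "timestamp", "file_offset"]:
    if var in ed["Vendor_specific"]:
        ds_Sv[var] = ed["Vendor_specific"][var]  # attached on its own time dimension
```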

@ctuguinay : thoughts?

@ctuguinay
Collaborator

ctuguinay commented Jun 19, 2024

At least for the compute_Sv side of things, I think this should also be its own separate function, just like ep.consolidate.add_depth and ep.consolidate.add_location are. There's some messiness with the interpolation when the time3 dim of the IDX variables doesn't have a 1-to-1 match with the ping_time of the Sv dataset.
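A rough sketch of what such a standalone helper could look like if we go the interpolation route (the function name is hypothetical; "vessel_distance" and the time3 dimension name are assumptions based on this thread):

```python
# Hypothetical helper mirroring the add_location pattern; not part of echopype.
import xarray as xr


def add_vessel_log(ds_Sv: xr.Dataset, echodata) -> xr.Dataset:
    """Interpolate the time3-indexed vessel log distance onto Sv ping_time."""
    vendor = echodata["Vendor_specific"]
    ds = ds_Sv.copy()
    ds["vessel_distance"] = (
        vendor["vessel_distance"]
        .interp(time3=ds["ping_time"], method="linear")
        .drop_vars("time3", errors="ignore")  # drop the leftover time3 coordinate
    )
    return ds
```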

@ctuguinay
Collaborator

And once it's in Sv, are you thinking of propagating it down to MVBS by also bin-averaging it (to match the new ping_time and depth/echo_range bins of the MVBS)?

@leewujung
Member Author

Oh, I see what you mean. I was actually thinking of not interpolating and simply plugging in the entire variables, because the lat/lon would duplicate what ep.consolidate.add_location already adds, and it is the same GPS.

Actually, maybe all we need is the distance, timestamp, and file_offset? I guess we should read the manual on that datagram association to see how the timing is tied to the file_offset parameter...

For MVBS, lat/lon are already bin-averaged if they are present in the Sv dataset, so again perhaps all we need is the distance, timestamp, and file_offset?
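For reference, the pipeline this would plug into, written with real echopype calls (placeholder file name and bin sizes; parameter names as in recent echopype versions). The latitude/longitude added by add_location already get bin-averaged into MVBS, so the vessel-log distance, timestamp, and file_offset would be the only new pieces to carry along:

```python
import echopype as ep

ed = ep.open_raw("transect.raw", sonar_model="EK60")   # placeholder file
ds_Sv = ep.calibrate.compute_Sv(ed)
ds_Sv = ep.consolidate.add_location(ds_Sv, ed)          # adds latitude/longitude on ping_time
ds_MVBS = ep.commongrid.compute_MVBS(ds_Sv, range_bin="1m", ping_time_bin="20s")
```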

@ctuguinay
Collaborator

ctuguinay commented Jun 19, 2024

To retain the same dimensions in ds_Sv, I still think vessel_distance would need to be interpolated if it is to match time-wise with ping_time, but I think ep.consolidate.add_location already does this interpolation when the times don't match. All we would need to do is add a line to the interpolations already done:

interp_ds["latitude"] = sel_interp("latitude", time_dim_name)
interp_ds["longitude"] = sel_interp("longitude", time_dim_name)
interp_ds["vessel_distance"] = sel_interp("vessel_distance", time_dim_name) # new line

In the case where no interpolation is done and vessel_distance (along with timestamp and file_offset) is added as-is, this would add a fourth dimension, time3, to ds_Sv.

Although, since these variables don't need to be stored along ping_time, range_sample, and channel, perhaps that is fine? I don't expect the entire dataset to expand much if adding them without interpolation introduces a new dimension, since it won't affect the big data variables like Sv, echo_range, and depth.
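A quick synthetic check of that size argument (dimension and variable names follow this thread; the shapes are arbitrary):

```python
import numpy as np
import xarray as xr

# Stand-in Sv dataset: 2 channels x 5000 pings x 500 samples of float64 (~40 MB).
ds_Sv = xr.Dataset(
    {"Sv": (("channel", "ping_time", "range_sample"), np.zeros((2, 5_000, 500)))}
)
# 500 vessel-log distance entries on their own time3 dimension (~4 kB).
ds_Sv["vessel_distance"] = xr.DataArray(np.linspace(0, 25.0, 500), dims="time3")

print(f"{ds_Sv['Sv'].nbytes / 1e6:.0f} MB vs {ds_Sv['vessel_distance'].nbytes / 1e3:.0f} kB")
```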

@leewujung
Member Author

Ah, I guess I didn't explain what I thought the next step would be. Since the way we use the vessel log is to use the distance as markers, I was thinking that we can find the closest distance and use the corresponding timestamp to slice the Sv or MVBS data based on ping_time. This is why I think we should look into what the manual says about file_offset, since it is related to this association. Simrad is very specific about this association, so I think we should understand it and then decide whether slicing or interpolation would be the better solution.
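A minimal sketch of that slicing idea, assuming the vessel-log variables keep the names used above (vessel_distance on a time3 dimension, taken from the Vendor_specific group) and ignoring file_offset for now:

```python
# Sketch of distance-marker slicing; variable/dimension names are assumptions
# from this thread, and `vendor` is the Vendor_specific group as an xarray Dataset.
import numpy as np
import xarray as xr


def slice_sv_by_distance(
    ds_Sv: xr.Dataset, vendor: xr.Dataset, d_start: float, d_end: float
) -> xr.Dataset:
    """Return the ping_time slice of ds_Sv between the vessel-log timestamps
    whose logged distances are closest to d_start and d_end."""
    dist = vendor["vessel_distance"]
    t_start = dist["time3"][int(np.abs(dist - d_start).argmin())].values
    t_end = dist["time3"][int(np.abs(dist - d_end).argmin())].values
    return ds_Sv.sel(ping_time=slice(t_start, t_end))
```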

@ctuguinay
Collaborator

Ah, I see now. I'll get to reading 🫡

@leewujung
Member Author

me too!
