Errors when converting NWFSC Hake Survey Data from 2017 and 2021 #1374

Open
ctuguinay opened this issue Aug 14, 2024 · 3 comments

Comments

@ctuguinay
Collaborator

ctuguinay commented Aug 14, 2024

Some errors I found while converting all of the 2017 and 2021 files with the latest Echopype main branch:

Error converting noaa-wcsd-pds/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170912-T194552.raw with Exception: Short read while getting trailing raw file datagram size for check 4 != 0 @ (19315440L, 32080)
Error converting noaa-wcsd-pds/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170718-T121359.raw with Exception: Short read while getting trailing raw file datagram size for check 4 != 0 @ (1311056L, 540)
Error converting noaa-wcsd-pds/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170819-T060438.raw with Exception: Short read while getting dgram size 4 != 0 @ (1449574268L, 1458)
Error converting noaa-wcsd-pds/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170807-T171736.raw with Exception: The DType <class 'numpy.dtypes.DateTime64DType'> could not be promoted by <class 'numpy.dtypes.Float64DType'>. This means that no common DType exists for the given inputs. For example they cannot be stored in a single array unless the dtype is `object`. The full list of DTypes is: (<class 'numpy.dtypes.DateTime64DType'>, <class 'numpy.dtypes.Float64DType'>)
Error converting noaa-wcsd-pds/data/raw/Bell_M._Shimada/SH2106/EK80/Hake-D20210913-T130612.raw with Exception: cannot reindex or align along dimension 'ping_time' because the (pandas) index has duplicate values
Error converting noaa-wcsd-pds/data/raw/Bell_M._Shimada/SH2106/EK80/Hake-D20210913-T225435.raw with Exception: cannot reindex or align along dimension 'ping_time' because the (pandas) index has duplicate values
@ctuguinay ctuguinay added bug Something isn't working data conversion labels Aug 14, 2024
@ctuguinay ctuguinay added this to the v0.9.1 milestone Aug 14, 2024
@ctuguinay ctuguinay self-assigned this Aug 14, 2024
@github-project-automation github-project-automation bot moved this to Todo in Echopype Aug 14, 2024
@ctuguinay
Collaborator Author

The short read errors are probably caused by the EK software ending the file write abruptly. Perhaps there is a way to recover the data that was written properly instead of losing the whole file to this error.
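A minimal sketch of that recovery idea, assuming each datagram in the raw file is framed by a 4-byte little-endian length before and after its payload (the layout suggested by the "trailing raw file datagram size" check in the error messages). The names `read_intact_datagrams` and `frame`, and the in-memory demo bytes, are hypothetical and not part of Echopype. The sketch keeps every datagram whose framing is intact and stops at the first truncated one instead of raising.

```python
import io
import struct


def read_intact_datagrams(stream):
    """Collect payloads of fully written datagrams; stop at the first short read."""
    payloads = []
    while True:
        header = stream.read(4)
        if len(header) < 4:
            break  # end of file, or a truncated leading size field
        (size,) = struct.unpack("<l", header)
        if size <= 0:
            break  # a non-positive size cannot describe a valid datagram
        payload = stream.read(size)
        trailer = stream.read(4)
        if len(payload) < size or len(trailer) < 4:
            break  # file ended mid-datagram: keep what was read so far
        if struct.unpack("<l", trailer)[0] != size:
            break  # leading and trailing sizes disagree: treat as corrupt
        payloads.append(payload)
    return payloads


def frame(payload):
    """Wrap a payload in leading and trailing length fields (demo helper only)."""
    size = struct.pack("<l", len(payload))
    return size + payload + size


# Two complete datagrams followed by one that lost its trailing size field.
truncated_file = (
    frame(b"CON0-config") + frame(b"RAW0-ping-1") + frame(b"RAW0-ping-2")[:-4]
)
recovered = read_intact_datagrams(io.BytesIO(truncated_file))
print(len(recovered))
```

Whether a partially recovered file should be converted silently or with a warning is a separate design choice; a warning that reports the byte offset of the truncation would probably be the safer default.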

@ctuguinay
Collaborator Author

The cannot reindex error stems from duplicate values in the ping times:

image

I think a simple drop_duplicates in the set groups stage would fix this problem: https://docs.xarray.dev/en/stable/generated/xarray.Dataset.drop_duplicates.html.
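A small sketch of what that call does, using a toy dataset rather than real converted data. The variable name `backscatter_r` and the timestamps are illustrative assumptions; only `xarray.Dataset.drop_duplicates` itself comes from the linked documentation.

```python
import numpy as np
import xarray as xr

# Toy ping_time coordinate with one repeated timestamp (illustrative values).
ping_time = np.array(
    [
        "2021-09-13T13:06:12",
        "2021-09-13T13:06:13",
        "2021-09-13T13:06:13",
        "2021-09-13T13:06:14",
    ],
    dtype="datetime64[ns]",
)
ds = xr.Dataset(
    {"backscatter_r": ("ping_time", np.arange(4.0))},
    coords={"ping_time": ping_time},
)

# Keep the first ping for each repeated timestamp and drop the later ones.
deduped = ds.drop_duplicates(dim="ping_time", keep="first")
print(deduped.sizes["ping_time"])
```

Note that `keep="first"` discards the later ping outright. If the duplicated pings carry different data, that choice may need more thought than a blanket drop.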

@ctuguinay
Collaborator Author

ctuguinay commented Aug 14, 2024

For the exception "The DType <class 'numpy.dtypes.DateTime64DType'> could not be promoted by <class 'numpy.dtypes.Float64DType'>. This means that no common DType exists for the given inputs. For example they cannot be stored in a single array unless the dtype is object. The full list of DTypes is: (<class 'numpy.dtypes.DateTime64DType'>, <class 'numpy.dtypes.Float64DType'>)", we have the following:

image

where we are missing channel-specific environmental variable information.

The set groups stage then errors out here:

image

because the last ds_env is empty:

image
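One plausible mechanism for this message, not confirmed in the thread, is that an empty array created without an explicit dtype defaults to float64, which NumPy cannot combine with datetime64 values. The sketch below reproduces that mismatch with plain NumPy arrays; the variable names are hypothetical and it does not use the actual `ds_env` contents.

```python
import numpy as np

# A time array from a channel that has data (illustrative timestamp).
ping_times = np.array(["2017-08-07T17:17:36"], dtype="datetime64[ns]")

# An empty array with no dtype given defaults to float64.
untyped_empty = np.array([])

try:
    np.concatenate([ping_times, untyped_empty])
    promotion_failed = False
except TypeError:
    # datetime64 and float64 have no common dtype, so NumPy refuses to combine them.
    promotion_failed = True

# Giving the empty array the matching dtype avoids the promotion problem.
typed_empty = np.array([], dtype="datetime64[ns]")
combined = np.concatenate([ping_times, typed_empty])
print(promotion_failed, combined.dtype)
```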

The empty dataset in ds_env can be broadcast so that it can be merged:

image

Edit: Another, simpler way to solve this is to remove any sorted channels for which the parser's power data is empty.
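A rough sketch of that filtering step, assuming the parsed power data can be viewed as a mapping from channel id to a list of per-ping arrays. The function name, the dictionary layout, and the channel ids below are all hypothetical stand-ins and may not match Echopype's internal structures.

```python
import numpy as np


def channels_with_power(sorted_channels, power_by_channel):
    """Return the channels, in their original order, that have at least one power record."""
    return [
        channel
        for channel in sorted_channels
        if len(power_by_channel.get(channel, [])) > 0
    ]


# Hypothetical parsed data: the 120 kHz channel recorded no power at all.
power_by_channel = {
    "ch_18kHz": [np.zeros(5), np.zeros(5)],
    "ch_38kHz": [np.zeros(5)],
    "ch_120kHz": [],
}
sorted_channels = sorted(power_by_channel)

kept_channels = channels_with_power(sorted_channels, power_by_channel)
print(kept_channels)
```

Dropping the channel before the groups are built sidesteps the empty-dataset merge entirely, at the cost of the output silently having fewer channels than the file declares.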
