Question about dataset/imgs.zarr. #8

Open
JoOkuma opened this issue Oct 19, 2021 · 1 comment

JoOkuma commented Oct 19, 2021

Hi,
Is there any restriction or standard for the dataset/imgs.zarr format (dtype, channel order, chunksize, etc.)?

I'm currently converting my data from zarr to the BigDataViewer format, and I would like to skip the step of converting it back to zarr for the machine learning and use my original data instead.

Does it work with any zarr array with axes T, Z, Y, X and integer or floating-point values?

ksugar (Member) commented Oct 20, 2021

Yes, ELEPHANT expects imgs.zarr to have a specific channel order and shape (T, Z, Y, X), while its dtype and chunksize can be flexible.
https://github.com/elephant-track/elephant-server/blob/v0.2.0/elephant-core/elephant/tool/dataset.py
By default, ELEPHANT creates imgs.zarr from .h5 with a dtype of uint8 or uint16, depending on the original data, and a shape of (T, Z, Y, X).
If you prepare imgs.zarr manually, please also prepare the other .zarr files (see below) for ELEPHANT in the format specified in the table. The existence of these files, as well as their dtypes and shapes, is checked before each command.

dataset
    ├── flow_hashes.zarr
    ├── flow_labels.zarr
    ├── flow_outputs.zarr
    ├── imgs.zarr
    ├── seg_labels_vis.zarr
    ├── seg_labels.zarr
    └── seg_outputs.zarr

| file | dtype | shape |
| --- | --- | --- |
| flow_hashes.zarr | S16 | (T - 1,) |
| flow_labels.zarr | f4 | (T - 1, 4, Z, Y, X) |
| flow_outputs.zarr | f2 | (T - 1, 3, Z, Y, X) |
| seg_labels_vis.zarr | u1 | (T, Z, Y, X, 3) |
| seg_labels.zarr | u1 | (T, Z, Y, X) |
| seg_outputs.zarr | f2 | (T, Z, Y, X, 3) |

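
For anyone preparing these files by hand, the table can be written down as a small lookup that derives every expected shape from the shape of imgs.zarr. This is only a sketch: the function names `expected_specs` and `find_mismatches` are hypothetical and are not part of ELEPHANT, and the actual check lives in the dataset.py linked above.

```python
def expected_specs(imgs_shape):
    """Return {file name: (dtype, shape)} for an imgs.zarr of shape (T, Z, Y, X).

    The dtype strings and shapes are copied from the table in this thread.
    """
    if len(imgs_shape) != 4:
        raise ValueError(f"imgs.zarr must be 4D (T, Z, Y, X), got {imgs_shape}")
    t, z, y, x = imgs_shape
    return {
        "flow_hashes.zarr": ("S16", (t - 1,)),
        "flow_labels.zarr": ("f4", (t - 1, 4, z, y, x)),
        "flow_outputs.zarr": ("f2", (t - 1, 3, z, y, x)),
        "seg_labels_vis.zarr": ("u1", (t, z, y, x, 3)),
        "seg_labels.zarr": ("u1", (t, z, y, x)),
        "seg_outputs.zarr": ("f2", (t, z, y, x, 3)),
    }


def find_mismatches(imgs_shape, actual):
    """Compare actual {file name: (dtype, shape)} against the expected specs.

    Returns a list of human-readable problems; an empty list means all match.
    """
    problems = []
    for name, (dtype, shape) in expected_specs(imgs_shape).items():
        if name not in actual:
            problems.append(f"{name}: missing")
            continue
        got_dtype, got_shape = actual[name]
        if got_dtype != dtype or tuple(got_shape) != shape:
            problems.append(
                f"{name}: expected {dtype} {shape}, got {got_dtype} {tuple(got_shape)}"
            )
    return problems
```

With zarr-python, the actual values could be read from each array's `.dtype` and `.shape` after opening it. Note that the dtype strings here are the short forms from the table, so a real comparison would probably go through `numpy.dtype` rather than plain string equality.
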
Notes about uint8 or uint16

The BigDataViewer .h5 files store image data as uint16. If the maximum value in the image data is smaller than 256, we use uint8 to save storage; otherwise we use uint16.
At runtime, image data stored in imgs.zarr is converted to float32 and normalized to the range [0, 1].
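
The storage rule above amounts to a single threshold on the maximum intensity. A minimal sketch, where the function name `pick_storage_dtype` is hypothetical and not taken from ELEPHANT:

```python
def pick_storage_dtype(max_value):
    """Choose the imgs.zarr dtype from the maximum intensity of the source data.

    Follows the rule stated in this thread: uint8 if the maximum is smaller
    than 256, otherwise uint16 (the dtype used by BigDataViewer .h5 files).
    """
    if not 0 <= max_value <= 65535:
        raise ValueError(f"value {max_value} does not fit in uint16")
    return "uint8" if max_value < 256 else "uint16"
```

Since the dtype of imgs.zarr is flexible and the data is converted to float32 at runtime, this choice only affects storage size.
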
