Added Dask loading of files to Live Viewer backend #2312
base: main
Some benchmarks:
Running with the Delayed Stack not being created with
I will also check the timings when simulating live data, but it would be useful to append to the existing Delayed Stack rather than creating and replacing already created
Using the code
We get:
I've benchmarked with the smaller and larger datasets and attempted to rechunk the Dask array away from its default, e.g. chunksize = (1, 512, 512) for a dataset of (512, 512) images. Setting dask.array.rechunk('auto') makes things slower because of the way the chunks are accessed when we read each data slice to compute its mean.
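To illustrate the chunking point, here is a minimal sketch on synthetic data (not the benchmark code from this PR). It assumes a small random array as a stand-in for an image stack and compares a one-chunk-per-image layout with the layout chosen by rechunk("auto"). The array size and variable names are illustrative only.

```python
import dask.array as da
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((8, 64, 64))  # small stand-in for a stack of 8 images

# One chunk per image, matching the (1, height, width) layout described above.
per_image = da.from_array(data, chunks=(1, 64, 64))

# Let Dask choose the chunk sizes; several images may now share one chunk.
auto = per_image.rechunk("auto")

print("per-image chunks:", per_image.chunksize, "blocks:", per_image.numblocks)
print("auto chunks:     ", auto.chunksize, "blocks:", auto.numblocks)

# Mean of each slice, computed one slice at a time. With one chunk per image
# only that image is needed; with merged chunks the chunk that contains the
# slice typically has to be produced before the slice can be taken.
means_per_image = [float(per_image[i].mean().compute()) for i in range(per_image.shape[0])]
means_auto = [float(auto[i].mean().compute()) for i in range(auto.shape[0])]
```

Both layouts give the same means; only the amount of data touched per slice differs, which is one plausible reason for the slowdown reported here.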
Issue
Closes #2311
Description
Dask is now used to load the files in the Live Viewer path and display them as normal. Dask allows us to have a delayed array of all image data in the directory without loading all of the data into memory. In order to display the images in the Live Viewer, the delayed array pointing to the image data is "computed" as needed but not kept permanently in memory.
This allows us to perform operations on the live data which would require the whole imagestack (mean, spectrum, etc), but without loading and storing the whole stack into memory at once. This PR acts as a proof of principle of the usefulness of Dask in Mantid Imaging, and gives a foundation of the structures needed to make Dask work.
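As a rough sketch of the idea (not the code in this PR), a delayed stack can be built by wrapping a per-file loader in dask.delayed and stacking the results. The load_frame function below is a hypothetical stand-in for reading one image file; it returns synthetic data and records which frames were actually read.

```python
import dask
import dask.array as da
import numpy as np

FRAME_SHAPE = (64, 64)
loaded = []  # records which frames were actually read


def load_frame(index):
    """Hypothetical stand-in for reading one .tif or .fits file from disk."""
    loaded.append(index)
    return np.full(FRAME_SHAPE, float(index))


def build_delayed_stack(n_frames):
    """Wrap each frame in dask.delayed and stack them into one lazy 3D array."""
    lazy_frames = [
        da.from_delayed(dask.delayed(load_frame)(i), shape=FRAME_SHAPE, dtype=float)
        for i in range(n_frames)
    ]
    return da.stack(lazy_frames, axis=0)


stack = build_delayed_stack(6)
reads_before_compute = list(loaded)  # building the stack reads nothing

frame = stack[3].compute()  # compute only the frame being displayed
reads_for_one_frame = list(loaded)

spectrum = stack.mean(axis=(1, 2)).compute()  # one mean value per frame
print(stack.shape, reads_for_one_frame, spectrum)
```

Nothing is read when the stack is built, and displaying one frame only runs the loader for that frame. A whole-stack reduction such as the per-frame mean visits every frame, but the frames do not all have to be held in memory at once.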
Compatibility has been added for both .tif and .fits files, but they are dealt with separately as .fits files are not natively supported by Dask and therefore the delayed arrays and computations have been done manually.
Testing
make check
Acceptance Criteria
Open MI, open the Live Viewer and point to a folder with data, e.g.
python -m mantidimaging -lv="C:\Users\ddb29996\Documents\MantidImaging Data\Large Dataset\Flower_WhiteBeam\Tomo"
It would be preferable to do this with a larger dataset to easily see the benefit of using Dask.
Check that the images load as normal and you can move between frames with no errors or appreciable slowdown.
Perform an "Operation" on the whole imagestack. While we do not yet implement these kinds of operations in the Live Viewer, you can paste the following code into line 346 of mantidimaging/gui/windows/live_viewer/model.py:
This will take the delayed imagestack and calculate a form of spectrum of all images in the Live Viewer folder.
As you open and initialise the Live Viewer, keep an eye on your RAM usage and check that the RAM usage does not increase by the size of the imagestack (this is easier to see with the Flower_WhiteBeam dataset as it is around 9GB).
Check that this calculated spectrum is what you would expect for the dataset. For example, for the Flower_WhiteBeam data, you should get this:
For the MantidImaging Data\Brass\Corrected_Sample_PH20 data, you should get:
Repeat this process with both .tif and .fits datasets to make sure both are functional.
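The RAM check in the steps above can also be approximated in a script. The sketch below is only an illustration on synthetic frames, not part of this PR: it uses the standard-library tracemalloc module to compare peak allocations when the whole stack is materialised with NumPy against a per-frame mean computed through a delayed Dask stack. The load_frame and peak_bytes helpers and the frame sizes are assumptions made for the example.

```python
import tracemalloc

import dask
import dask.array as da
import numpy as np

N_FRAMES, HEIGHT, WIDTH = 32, 256, 256  # about 16 MiB in total as float64


def load_frame(index):
    """Hypothetical loader standing in for reading one image file."""
    return np.full((HEIGHT, WIDTH), float(index))


def peak_bytes(func):
    """Run func and return (result, peak traced allocation in bytes)."""
    tracemalloc.start()
    try:
        result = func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def eager_spectrum():
    # Materialise the whole stack first, then reduce it.
    stack = np.stack([load_frame(i) for i in range(N_FRAMES)])
    return stack.mean(axis=(1, 2))


def lazy_spectrum():
    frames = [
        da.from_delayed(dask.delayed(load_frame)(i), shape=(HEIGHT, WIDTH), dtype=float)
        for i in range(N_FRAMES)
    ]
    # The single-threaded scheduler keeps the measurement repeatable.
    return da.stack(frames).mean(axis=(1, 2)).compute(scheduler="synchronous")


eager_result, eager_peak = peak_bytes(eager_spectrum)
lazy_result, lazy_peak = peak_bytes(lazy_spectrum)
print(f"eager peak: {eager_peak / 2**20:.1f} MiB, lazy peak: {lazy_peak / 2**20:.1f} MiB")
```

The two approaches should return the same spectrum, with the delayed version peaking well below the size of the full stack. Real figures will depend on the file format, the scheduler and the chunk layout, so treat this as a way to reason about the check rather than a replacement for watching RAM usage in the Live Viewer.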
Because some of the Live Viewer data structures and flows have changed, the Live Viewer tests have been altered to reflect this.
Documentation
Will add release note