On an HPC, saving to zarr is much faster than saving to binary #3630
Comments
Do you see utilization of the cores change depending on n_jobs on Eddie? Like in htop or something?
The difference in writing time is 6x, if I am reading correctly. My guess is that writing to an NFS drive is slow, so having less data to write could explain the difference. Could you run a basic test of the writing speed for this drive? Also, what is the size ratio between the 2 folders? Maybe, in the binary case, locking to write into the same file costs a lot on this filesystem.
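A basic write-speed test of the kind asked for above could look like the following. This is only a minimal sketch: the function name `measure_write_throughput` and its parameters are hypothetical, it uses only the Python standard library, and it measures sequential writes from a single process, which is not the same access pattern as several workers writing into one file.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_mb=256, chunk_mb=16):
    """Write about total_mb of data to a temporary file in `directory`.

    Returns the observed throughput in MB/s. Random bytes are used so that
    the filesystem cannot shortcut the write with sparse or compressed blocks.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    n_chunks = max(1, total_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bin")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the drive
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (n_chunks * chunk_mb) / elapsed


# Hypothetical usage: compare the network drive with local scratch space.
# print(measure_write_throughput("/path/to/nfs/folder"))
# print(measure_write_throughput(tempfile.gettempdir()))
```

Running it once on the shared drive and once on local storage would show whether the drive itself is the bottleneck.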
The I can track the
meaning that in 10 minutes of wall-clock time, we used 76 minutes of CPU time, which makes sense for an 8x parallelised piece of code. From this point on, the code is saving the binary file. About ten minutes later, we have
Meaning that only one CPU has been used in the last 10 minutes. So it seems to be a parallelisation thing, for sure! Very confusing.
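The reasoning above compares CPU time with wall-clock time: their ratio is roughly the average number of busy cores. A minimal sketch of that arithmetic, plus a way to take the same measurement from inside Python, is below. The helper names `effective_cores` and `run_and_measure` are hypothetical, and `os.times()` only counts child processes that have already finished and been waited for, so this is an approximation rather than a replacement for the scheduler's accounting.

```python
import os
import time


def effective_cores(cpu_seconds, wall_seconds):
    """Average number of cores kept busy over an interval."""
    return cpu_seconds / wall_seconds


def run_and_measure(func, *args, **kwargs):
    """Run func and return (result, cpu_seconds, wall_seconds).

    CPU time is user + system time of this process and of its
    finished child processes, as reported by os.times().
    """
    before = os.times()
    wall_start = time.perf_counter()
    result = func(*args, **kwargs)
    wall = time.perf_counter() - wall_start
    after = os.times()
    cpu = sum(after[:4]) - sum(before[:4])
    return result, cpu, wall


# 76 minutes of CPU time in 10 minutes of wall-clock time is about 7.6 cores,
# which is consistent with n_jobs=8.
cores_first_interval = effective_cores(76 * 60, 10 * 60)
```

A value close to 1 for the second interval would match the observation that only one CPU was in use while the binary file was being saved.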
Is it possible it's not the last ten minutes but the first?
For binary, we create the empty file first and then write into it, and that step does not use multiprocessing. Just a stab in the dark. Do you want to time how long it takes to run just this code on your HPC to see whether it contributes?
spikeinterface/src/spikeinterface/core/recording_tools.py, lines 137 to 142 in 8aeaf9b
@zm711: really good point.
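To time the empty-file step in isolation, something like the sketch below could be run on the HPC drive. It assumes the file is pre-allocated by truncating it to its final size; the referenced lines of recording_tools.py are not reproduced here and may do this differently, and `time_preallocation` is a hypothetical helper name.

```python
import os
import tempfile
import time


def time_preallocation(path, size_bytes):
    """Create a file of size_bytes at `path` and return the seconds taken."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        # Assumption: extending the file with truncate() stands in for
        # whatever the library does to create the empty binary file.
        f.truncate(size_bytes)
    return time.perf_counter() - start


# Hypothetical usage: a small file in the temp directory. On the HPC,
# point `path` at the target drive and use the real file size instead.
demo_path = os.path.join(tempfile.gettempdir(), "preallocation_test.raw")
demo_size = 10 * 1024 * 1024
demo_seconds = time_preallocation(demo_path, demo_size)
demo_actual_size = os.path.getsize(demo_path)
os.remove(demo_path)
```

If this takes minutes rather than a fraction of a second on the shared drive, the single-core period is more likely file creation than the parallel write itself.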
Hello,
I think that saving to binary isn't parallelising when I run it on the Uni of Edinburgh HPC, Eddie. Here's the code:
This takes in an ~hour long recording, and then saves it twice. The output is:
So write_binary_recording takes 8x as long, so it looks like it's not been parallelised. But weirdly, the code reports that it's using n_jobs=8 for both cases. Passing n_jobs directly to save_to_folder doesn't seem to help.
On my personal computer, there is no real difference in the timings between these methods.
Any ideas? Are the parallelisation schemes different for the different file formats?