
On an HPC, saving to zarr is much faster than saving to binary #3630

Open
chrishalcrow opened this issue Jan 20, 2025 · 5 comments
Labels
bug Something isn't working performance Performance issues/improvements

Comments

@chrishalcrow
Collaborator

Hello,

I think that saving to binary isn't parallelising when I run it on the Uni of Edinburgh HPC, Eddie. Here's the code:

import spikeinterface.full as si

# there's some code to get the recording paths, and `get_recording_from` returns an openephys recording object
recording = si.concatenate_recordings([get_recording_from(rec_path) for rec_path in rec_paths])
recording_filtered = si.common_reference(si.bandpass_filter(recording, freq_min=300, freq_max=6000))

si.set_global_job_kwargs(n_jobs=8)
recording_filtered.save_to_zarr(folder="/exports/eddie/scratch/chalcrow/harry_project/temp_zarr")
recording_filtered.save_to_folder(folder="/exports/eddie/scratch/chalcrow/harry_project/temp_binary")

This takes an ~hour-long recording and saves it twice. The output is:

write_zarr_recording
n_jobs=8 - samples_per_chunk=30,000 - chunk_memory=43.95 MiB - total_memory=351.56 MiB - chunk_duration=1.00s
write_zarr_recording: 100%|██████████| 4203/4203 [10:30<00:00,  6.67it/s]

write_binary_recording
n_jobs=8 - samples_per_chunk=30,000 - chunk_memory=43.95 MiB - total_memory=351.56 MiB - chunk_duration=1.00s
write_binary_recording: 100%|██████████| 4203/4203 [1:21:15<00:00,  1.16s/it]

So write_binary_recording takes about 8x as long, which suggests it is not being parallelised. But weirdly, the code reports that it's using n_jobs=8 in both cases. Passing n_jobs directly to save_to_folder doesn't seem to help.

On my personal computer, there is no real difference in the timings between these methods.

Any ideas? Are the parallelisation schemes different for the different file formats?

@chrishalcrow chrishalcrow added bug Something isn't working performance Performance issues/improvements labels Jan 20, 2025
@zm711
Collaborator

zm711 commented Jan 20, 2025

Do you see utilization of the cores change depending on n_jobs on Eddie? Like an htop or something?
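If htop isn't available on the compute node, a quick per-core snapshot from inside the job works too. A rough sketch, assuming psutil is available in the environment (the sampling interval and threshold are arbitrary):

```python
import time
import psutil

# Sample per-core utilisation every couple of seconds while the save is running
# (run this in a second terminal / job step on the same node).
for _ in range(10):
    per_core = psutil.cpu_percent(interval=2, percpu=True)
    busy = sum(1 for pct in per_core if pct > 50)
    print(f"{busy} cores busy: {per_core}")
    time.sleep(1)
```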

@samuelgarcia
Member

The writing is ~6x slower if I am reading correctly.

My guess is that writing on an NFS drive is slow, and having less to write could explain the diff.

Could you make a basic test of the writing speed for this drive?

Also, what is the size ratio between the 2 folders?
My guess is that the compression ratio is between 2.5 and 3.5.
That does not explain the 6x speed ratio on its own.

Maybe locking to write into the same file in the binary case costs a lot on this filesystem.
If so, could you use the threading implementation to check this?
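For the raw write-speed test, something like this rough sketch could work. The scratch path is the one from the original script; the file name, chunk size and total size are just examples:

```python
import os
import time
from pathlib import Path

# Rough sequential write benchmark for the scratch filesystem.
target = Path("/exports/eddie/scratch/chalcrow/harry_project/write_test.bin")
chunk = b"\0" * (64 * 1024 * 1024)  # 64 MiB per write
n_chunks = 32                        # ~2 GiB total

t0 = time.perf_counter()
with open(target, "wb") as f:
    for _ in range(n_chunks):
        f.write(chunk)
    f.flush()
    os.fsync(f.fileno())  # make sure the data actually hits the drive
elapsed = time.perf_counter() - t0

total_gb = len(chunk) * n_chunks / 1e9
print(f"Wrote {total_gb:.1f} GB in {elapsed:.1f} s ({total_gb / elapsed:.2f} GB/s)")
target.unlink()
```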

@chrishalcrow
Collaborator Author

The zarr save took 10.5 mins and the binary save took 1h21m15s = 81.25 mins, so the binary save took 7.74 times as long.

I can track the wallclock time and CPU time on the HPC. I am running both save methods in one script. At the end of the zarr save, we have

wallclock=00:10:13, cpu=01:16:51

meaning that in 10 mins of wallclock time, we used 76 mins of CPU time, which makes sense for an 8x-parallelised piece of code. From this point on, the code is saving the binary file. About ten minutes later, we have

wallclock=00:22:18, cpu=01:33:19

meaning that only about one CPU has been used in the last 10 mins.

So it seems to be a parallelisation thing, for sure! Very confusing.

@zm711
Collaborator

zm711 commented Jan 24, 2025

Is it possible it's not the last ten minutes but the first?

        with open(file_path, "wb+") as file:
            # The previous implementation `file.truncate(file_size_bytes)` was slow on Windows (#3408)
            file.seek(file_size_bytes - 1)
            file.write(b"\0")


        assert Path(file_path).is_file()

We create the empty file first in order to write into it for binary, and this part does not use multiprocessing. Just a stab in the dark. Do you want to time how long it takes to run just this code on your HPC to see if this might contribute? See the sketch below.
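A minimal timing sketch for just that pre-allocation step. The file name is hypothetical and the byte count is only an example; for the real test use n_samples * n_channels * dtype.itemsize of the actual recording:

```python
import time
from pathlib import Path

# Time only the empty-file pre-allocation done before write_binary_recording.
file_path = Path("/exports/eddie/scratch/chalcrow/harry_project/prealloc_test.raw")
file_size_bytes = 50 * 1024**3  # ~50 GiB, example only

t0 = time.perf_counter()
with open(file_path, "wb+") as file:
    file.seek(file_size_bytes - 1)
    file.write(b"\0")
print(f"pre-allocation took {time.perf_counter() - t0:.2f} s")
file_path.unlink()
```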

@samuelgarcia
Member

@zm711: really good point.
This lazy file creation could depend on the file system, and on HPC clusters they sometimes have more complex file systems that do not behave like ext4.
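One way to probe this would be to compare a few pre-allocation strategies directly on the scratch filesystem. A rough sketch, not spikeinterface code, with hypothetical file names and an arbitrary test size:

```python
import os
import time
from pathlib import Path

size = 20 * 1024**3  # 20 GiB, arbitrary test size
base = Path("/exports/eddie/scratch/chalcrow/harry_project")

def timed(label, fn):
    t0 = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - t0:.2f} s")

def seek_write():
    # current spikeinterface approach: seek to the end and write one byte
    with open(base / "t_seek.raw", "wb+") as f:
        f.seek(size - 1)
        f.write(b"\0")

def truncate():
    # previous approach (slow on Windows, see #3408)
    with open(base / "t_trunc.raw", "wb+") as f:
        f.truncate(size)

def fallocate():
    # explicit allocation, Linux/POSIX only
    with open(base / "t_falloc.raw", "wb+") as f:
        os.posix_fallocate(f.fileno(), 0, size)

timed("seek+write", seek_write)
timed("truncate", truncate)
timed("posix_fallocate", fallocate)
for name in ("t_seek.raw", "t_trunc.raw", "t_falloc.raw"):
    (base / name).unlink()
```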
