Document chunking in nccreate #87
> Compression of NetCDF files is only enabled if chunking is also enabled.
Thanks for the super quick reply @jarlela! Ah, I did not know about chunking (I don't think it is in the documentation). I tried a typical 4 MiB chunk size, but that crashed. I thought maybe it was because I'm saving a very long vector, so I'm now trying to save a 3D array instead. I magically found that it doesn't crash with:

```julia
using NetCDF

N = 256
A = rand(N, N, N)
for cl in 0:9
    tic = time_ns()
    filename = "compress" * string(cl) * ".nc"
    varname = "rands"
    attribs = Dict("units" => "m/s")
    nccreate(filename, varname,
             "x1", collect(1:N), Dict("units"=>"m"),
             "x2", collect(1:N), Dict("units"=>"m"),
             "x3", collect(1:N), Dict("units"=>"m"),
             atts=attribs, chunksize=(1,16,256), compress=cl)
    ncwrite(A, filename, varname)
    ncclose(filename)
    toc = time_ns()
    ts = prettytime(toc - tic)
    fs = datasize(filesize(filename); style=:bin, format="%.3f")
    println("Compression level $cl: $ts $fs")
end
```
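The snippet relies on `prettytime` and `datasize` helpers that are not defined in the thread and are not part of NetCDF.jl. Hypothetical minimal stand-ins, so the benchmark loop runs as written, might look like this (the names and signatures are assumptions matching how they are called above):

```julia
using Printf

# Hypothetical stand-in for `prettytime`: format a duration given in
# nanoseconds as seconds with three decimal digits.
prettytime(ns) = @sprintf("%.3f s", ns / 1e9)

# Hypothetical stand-in for `datasize`: human-readable file size, binary
# (1024-based) when style=:bin, decimal (1000-based) otherwise.
function datasize(bytes; style=:bin, format="%.3f")
    base  = style == :bin ? 1024.0 : 1000.0
    units = style == :bin ? ("B", "KiB", "MiB", "GiB", "TiB") :
                            ("B", "kB", "MB", "GB", "TB")
    x, i = float(bytes), 1
    while x >= base && i < length(units)
        x /= base
        i += 1
    end
    string(Printf.format(Printf.Format(format), x), " ", units[i])
end
```

For example, `datasize(2048)` gives `"2.000 KiB"` and `prettytime(1.5e9)` gives `"1.500 s"`.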
Thanks for reporting, could you try branch #88? On a side note: when running your example, you should provide the chunksize in Julia-ordered dimensions, so you probably wanted the order reversed.
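For context on "Julia-ordered dimensions": Julia arrays are column-major, so the first index varies fastest in memory. A quick pure-Julia check (no NetCDF required):

```julia
# A small 3D array backed by contiguous memory.
A = reshape(collect(1:24), 4, 3, 2)

# Column-major: walking memory order (vec) first traverses the 1st dimension,
# so the first four elements in memory are exactly the first column.
@assert vec(A)[1:4] == A[:, 1, 1]

# The memory stride grows with dimension index: 1, then 4, then 4*3 elements.
@assert strides(A) == (1, 4, 12)
```

This is presumably why the suggestion is to put the largest chunk extent on the first axis: a chunk like `(256,16,1)` then covers runs that are contiguous in memory, unlike `(1,16,256)`.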
Hey @meggart, thanks for looking into this. I tried #88 and it's working as expected now!

```julia
using NetCDF

N = 256
A = Float64.(rand(1:10, N, N, N))
for cl in 0:9
    tic = time_ns()
    filename = "compress" * string(cl) * ".nc"
    varname = "rands"
    attribs = Dict("units" => "m/s")
    nccreate(filename, varname,
             "x1", collect(1:N), Dict("units"=>"m"),
             "x2", collect(1:N), Dict("units"=>"m"),
             "x3", collect(1:N), Dict("units"=>"m"),
             atts=attribs, chunksize=(256,16,1), compress=cl)
    ncwrite(A, filename, varname)
    ncclose(filename)
    toc = time_ns()
    ts = prettytime(toc - tic)
    fs = datasize(filesize(filename); style=:bin, format="%.3f")
    println("Compression level $cl: $ts $fs")
end
```
+1 for the suggestion above to document chunking. It's still not mentioned anywhere in docs/ nor in the docstrings (e.g. the `nccreate` docstring).
Also, @meggart, can you please elaborate on what you meant by that? What aspect of performance is improved if the chunk is bigger in the first axis: compression ratio, read time, something else? Thanks!
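Not an authoritative answer, but the access-pattern side of the question can be sketched with plain arithmetic, using the array and chunk shapes from the examples above (actual timings also depend on the compressor and on HDF5's chunk cache):

```julia
# Number of chunks touched when reading a full first-axis column A[:, j, k]
# of a (256, 256, 256) array, for a given chunk layout.
chunks_per_column(chunk, dims=(256, 256, 256)) = cld(dims[1], chunk[1])

# chunksize=(256,16,1): the whole column lives inside a single chunk.
@assert chunks_per_column((256, 16, 1)) == 1

# chunksize=(1,16,256): the same column is spread over 256 chunks, each of
# which must be located (and decompressed) separately.
@assert chunks_per_column((1, 16, 256)) == 256

# Either way, one chunk holds 256*16*1 Float64s = 32 KiB.
@assert prod((256, 16, 1)) * 8 == 32768
```

So for reads along the first (fastest-varying) axis, a chunk that is large in that axis touches far fewer chunks for the same amount of data.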
Ok, I have re-opened and changed the title of the issue.
Please let me know if I'm doing this wrong, but I was trying to find a nice balance between compression time and file size by benchmarking compression levels 0-9. Instead, I find that the compression level has no effect.