First, I wanted to say thanks for maintaining this package; it's great to have Parquet support in Julia.
I noticed that there seems to be a memory leak when I read parquet files that use ZSTD compression. The easiest way for me to reproduce the issue was to create a parquet file and then repeatedly read it in Julia while monitoring memory usage.
Creating the file in Python with pyarrow (I wasn't sure how to create a similar file in Julia):
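A minimal sketch of that script (the column name, row count, and values here are placeholders, not the exact data from the original):

import pyarrow as pa
import pyarrow.parquet as pq

# Placeholder table; the exact schema and size shouldn't matter for the leak.
table = pa.table({"x": list(range(1_000_000))})
pq.write_table(table, "/tmp/mem.parquet", compression="zstd")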
Reading the same file a few times and monitoring memory usage:
using DataFrames
using Parquet

function BuildDF()::Int64
    # Read the file into a DataFrame; the result goes out of scope on return.
    df = DataFrame(read_parquet("/tmp/mem.parquet"))
    return 1
end

for i in 1:10
    BuildDF()
    GC.gc()                             # force a garbage collection
    run(`ps -p $(getpid()) -h -o rss`)  # print this process's resident set size
end
The rss reported by ps grows with each iteration. Changing the compression type of the parquet file to Snappy or uncompressed doesn't show the same memory growth; GZip compression shows some growth, but not as large as with ZSTD.
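To compare codecs, the same placeholder table can be written once per compression setting (again a sketch; pyarrow's write_table takes the codec name as its compression argument):

import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({"x": list(range(1_000_000))})  # same placeholder data as above
for codec in ("zstd", "snappy", "gzip", "none"):
    # one file per codec, e.g. /tmp/mem_zstd.parquet
    pq.write_table(table, f"/tmp/mem_{codec}.parquet", compression=codec)

The Julia loop above can then be pointed at each file in turn.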
I wasn't able to dig deeper to see where the memory usage may be coming from. Any ideas?