Modify to use the .pkl.gz extension #31

Open
wants to merge 3 commits into
base: master

yuzie007
Contributor

Based on #30

Summary

This PR proposes using the .pkl.gz extension instead of the current .pckl.gzip extension for dataset files.

Motivation

There are several advantages to doing so:

  • .gz is the standard extension for Gzip files (see https://www.gnu.org/software/gzip/manual/gzip.html and https://en.wikipedia.org/wiki/Gzip).
  • With the .gz extension, df.to_pickle and pd.read_pickle infer Gzip compression/decompression from the file name, so the compression argument no longer has to be specified explicitly.
  • Thanks to the above, we can remove the compression argument. This also lets users choose their preferred compression format, such as .bz2, simply through the file extension.
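
The inference described above can be checked with a minimal sketch like the following. The DataFrame contents and the file names are hypothetical, chosen only for illustration; the sketch assumes pandas is installed and relies on the default compression="infer" of df.to_pickle and pd.read_pickle.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical toy dataset, for illustration only.
df = pd.DataFrame({"energy": [-1.0, -2.5], "n_atoms": [2, 4]})

with tempfile.TemporaryDirectory() as tmp:
    # New-style name: ".gz" is recognized, so no compression argument is needed.
    new_path = Path(tmp) / "dataset.pkl.gz"
    df.to_pickle(new_path)
    with open(new_path, "rb") as f:
        is_gzip = f.read(2) == b"\x1f\x8b"  # Gzip magic bytes
    restored = pd.read_pickle(new_path)

    # Old-style name: ".gzip" is not a recognized extension, so the
    # compression argument has to be passed explicitly on both sides.
    old_path = Path(tmp) / "dataset.pckl.gzip"
    df.to_pickle(old_path, compression="gzip")
    restored_old = pd.read_pickle(old_path, compression="gzip")
```

Both round trips return the same DataFrame; only the second one needs the explicit compression argument.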

It would also be nicer to use .pkl (or .pickle) rather than .pckl because

  • Pandas uses the .pkl extension in their examples (see df.to_pickle and pd.read_pickle).
  • .pkl.gz is far more common on the web than .pckl.gz, so people can more easily recognize that the file is in the pickle format.

Existing .pckl.gzip files can simply be renamed.
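
Renaming existing files could be scripted along these lines. This is only a sketch: the helper name rename_legacy_datasets and the directory layout are assumptions for illustration, not part of this PR.

```python
from __future__ import annotations

from pathlib import Path

OLD_SUFFIX = ".pckl.gzip"
NEW_SUFFIX = ".pkl.gz"


def rename_legacy_datasets(root: Path) -> list[Path]:
    """Rename every *.pckl.gzip file under root to *.pkl.gz and return the new paths."""
    renamed = []
    # sorted() materializes the matches before any file is renamed.
    for old in sorted(root.rglob("*" + OLD_SUFFIX)):
        new = old.with_name(old.name[: -len(OLD_SUFFIX)] + NEW_SUFFIX)
        if new.exists():
            continue  # never overwrite an existing file
        old.rename(new)
        renamed.append(new)
    return renamed
```

For example, rename_legacy_datasets(Path("data")) would turn a hypothetical data/train.pckl.gzip into data/train.pkl.gz. The file contents are untouched, since only the name changes.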

Other changes

From an example in docs/pacemaker/quickstart.md, I removed the protocol argument from pd.read_pickle (this argument exists for df.to_pickle but not for pd.read_pickle).
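
The asymmetry between the two functions can be seen in a short sketch like the one below. The DataFrame and the file name are hypothetical; the protocol value 4 is just an example, and the snippet is not the actual code from quickstart.md.

```python
import inspect
import tempfile
from pathlib import Path

import pandas as pd

# Inspect the public signatures of the writer and the reader.
to_params = inspect.signature(pd.DataFrame.to_pickle).parameters
read_params = inspect.signature(pd.read_pickle).parameters

# Hypothetical toy dataset, for illustration only.
df = pd.DataFrame({"energy": [-1.0, -2.5]})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dataset.pkl.gz"
    df.to_pickle(path, protocol=4)   # protocol is a writer-side option
    restored = pd.read_pickle(path)  # the reader takes no protocol argument
```

Here "protocol" appears in to_params but not in read_params, which matches the change made to the example.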
