feat: add raw media type for model weights files #35

Merged Feb 26, 2025 (1 commit)

@aftersnow aftersnow commented Feb 20, 2025

This commit introduces an application/vnd.cnai.model.weight.v1.raw media type. It is aimed at large language models, whose weight files are very large: individual files can reach 10GB and the full set can approach 1TB. The raw media type:

  • Eliminates compression and archiving overhead, which speeds up build and startup
  • Reduces the storage footprint, since no archiving or unarchiving step is needed

Similar PR in oci-spec: https://github.com/opencontainers/image-spec/pull/1197/files
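
For illustration, here is a minimal sketch of how a tool might describe a weight file under the proposed media type. It assumes an OCI-style descriptor (mediaType, digest, size); the function name `describe_raw_weight` and the chunk size are hypothetical and not part of the spec or this PR. The point it shows is that the blob bytes are simply the file bytes, so the only work is a streaming hash.

```python
import hashlib
import json
import os
import tempfile

# Media type proposed in this PR.
RAW_WEIGHT_MEDIA_TYPE = "application/vnd.cnai.model.weight.v1.raw"


def describe_raw_weight(path, chunk_size=1 << 20):
    """Build an OCI-style descriptor for a weight file stored as-is.

    Hypothetical helper: the file is streamed in chunks so that
    multi-gigabyte weights are hashed without being loaded into memory.
    No tar or compression step runs; the blob is the file itself.
    """
    sha256 = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            sha256.update(chunk)
            size += len(chunk)
    return {
        "mediaType": RAW_WEIGHT_MEDIA_TYPE,
        "digest": "sha256:" + sha256.hexdigest(),
        "size": size,
    }


if __name__ == "__main__":
    # Small stand-in for a real weight file (placeholder content).
    with tempfile.NamedTemporaryFile(delete=False, suffix=".safetensors") as tmp:
        tmp.write(b"\x00" * 4096)
    try:
        print(json.dumps(describe_raw_weight(tmp.name), indent=2))
    finally:
        os.remove(tmp.name)
```

Because the digest is computed over the unmodified file, it should match a plain `sha256sum` of the weight file on disk, which makes the layer easy to verify or deduplicate.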

@amisevsk amisevsk left a comment

I'm not against adding something like this to the spec. Maybe we should consider using a more common media-type to represent that it's just a binary blob, though -- e.g. application/vnd.cnai.model.weight.v1.octet-stream

@aftersnow (Contributor Author)

> I'm not against adding something like this to the spec. Maybe we should consider using a more common media-type to represent that it's just a binary blob, though -- e.g. application/vnd.cnai.model.weight.v1.octet-stream

Thanks for the suggestion. My intention with "raw" was to convey that the file has not been compressed or tarred. "Octet-stream" also makes sense, but "raw" seems slightly easier to understand. Does that sound OK, or am I missing something?
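
To make the "not compressed or tarred" distinction concrete, here is a rough sketch of how a consumer might branch on the media type. The `.tar` media type string, the `materialize_layer` function, and the `filename` parameter are assumptions made for contrast only and are not taken from this PR. Blobs are passed as in-memory bytes for brevity; a real tool would stream them.

```python
import io
import os
import tarfile

RAW = "application/vnd.cnai.model.weight.v1.raw"
# Assumed archived counterpart, included only for contrast.
TAR = "application/vnd.cnai.model.weight.v1.tar"


def materialize_layer(media_type, blob, dest_dir, filename):
    """Write a layer blob into dest_dir and return the file names written.

    Hypothetical helper. `filename` is only used for raw layers, where
    the name has to come from somewhere outside the blob.
    """
    os.makedirs(dest_dir, exist_ok=True)
    if media_type == RAW:
        # Raw: the blob bytes are the file bytes, so this is a plain write.
        with open(os.path.join(dest_dir, filename), "wb") as out:
            out.write(blob)
        return [filename]
    if media_type == TAR:
        # Archived: an extra unpack pass over the whole blob is required.
        written = []
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r:") as archive:
            for member in archive.getmembers():
                if not member.isfile():
                    continue
                # Flatten to the base name to avoid path traversal.
                name = os.path.basename(member.name)
                with open(os.path.join(dest_dir, name), "wb") as out:
                    out.write(archive.extractfile(member).read())
                written.append(name)
        return written
    raise ValueError("unsupported media type: " + media_type)
```

The raw branch has no unpack step at all, which is the overhead the PR description says it wants to avoid.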


@gaius-qi gaius-qi left a comment

LGTM

@chlins chlins left a comment

lgtm

@chlins chlins merged commit c2ed250 into main Feb 26, 2025
2 checks passed
@chlins chlins deleted the add-raw-media-type branch February 26, 2025 12:42
4 participants