WIP: Granite model updates, and questions about best use of the MOF #66
Hi 👋🏻 - thanks for all the work that's gone into this. There's clearly been a lot of thought about what makes a model open.
I'm trying to update the information regarding the IBM Granite models, and along the way I've collected several questions about how the MOT can most effectively represent the considerable work that we and others have put into openness.
This is a WIP PR, not intended to be committed as-is; I do not think it currently conforms to the schema.
(Let me know if this should instead be raised as an issue or in the discussion forum before creating the PR. I thought it was best to share the questions alongside my attempts to answer them within the framework.)
High-level queries
Many of the entries, such as those related to evaluation and datasets, have partial-to-full answers posted on the HuggingFace model card. In order to simplify updates, would it be possible to scrape this information from the HF model card? Training datasets and evaluation metrics are represented in the YAML at the top of the Granite HF README.md.
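As a sketch of what that scraping might look like (the sample text and helper function below are hypothetical, not the MOT's actual tooling), the YAML front matter could be split off the README before handing it to a YAML parser:

```python
# Hypothetical sketch: isolate the YAML front matter of a Hugging Face
# model-card README.md so fields such as `datasets` or `license` could
# feed the MOT. The sample text is illustrative, not the real Granite card.

def extract_front_matter(readme_text: str) -> str:
    """Return the YAML block between the leading '---' delimiters, or ''."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i])
    return ""

sample = """---
license: apache-2.0
datasets:
  - example-dataset
---
# Model card body
"""

front_matter = extract_front_matter(sample)
print(front_matter)
```

In practice the README could be fetched with the `huggingface_hub` library and the extracted block parsed with a YAML library; the point here is only that these fields are already machine-readable.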
How does the MOT want us to handle versioning? I see the existing Granite files are split by parameter count, but there are many models to date, differing in architectural version, fine-tuning target, modality, and number of parameters.
For the Granite family of models at least, the most appropriate distinction would be architectural version, fine-tuning target, and modality; openness is unlikely to vary with the number of parameters.
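To make the versioning question concrete, one hypothetical arrangement would key each file on architecture version, fine-tuning target, and modality, with parameter counts listed as a field inside it (the field names below are illustrative only, not the MOT schema):

```yaml
# Hypothetical layout sketch, not the MOT's actual schema:
# one file per architecture version / fine-tune / modality,
# with parameter counts inside rather than in the file name.
model_family: granite
architecture_version: "3.1"
variant: instruct
modality: text
parameter_counts: ["2b", "8b"]
```

Under a layout like this, the openness assessment would be shared across parameter counts, which matches how the Granite releases are actually documented.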
Several OSS frameworks exist to perform inference on models with open weights, and Granite is compatible with many of them. How can we best represent that information?
How can we represent ongoing updates to transparency in the form of further research papers, technical reports, presentations, etc?
More specific questions are included in the PR itself.
Thanks! Excited to see this project continue.