Support building graphs from MLTensor containing constants #760
Comments
@a-sully @RafaelCintron @fdwr @huningxin appreciate any feedback
I'd need more time to think for meaningful feedback, but it may be rather confusing having this list of methods o_o:
Thanks for the quick feedback. I simplified the proposal even further via initializer + new usage bit.
Definitely interested in this from the point of view of caching models as well (especially weights that might be used by both WebGPU and WebNN implementations).
Revised the API design based on PoC feedback.
Overall approach mostly LGTM. Feedback here is mostly cosmetic except for #2 (and some of which I've already mentioned in comments on your prototype CL) 1.
| MLConstantTensor | Multi-graph build() |
|---|---|
| ❌ Weight lifecycle management is exposed to developers | ✅ All graphs using these weights are known upfront, so no need for explicit create/destroy API surfaces |
| ✅ More naturally matches how ML frameworks generally load weights | ❌ May be harder for web ML frameworks to utilize |
| ✅ Graph compilation costs may be spread out as the developer sees fit | ❌ All graphs sharing weights are compiled at once, which may exacerbate issues of slow compilation |
| ✅ If graph compilation fails, weights do not need to be re-uploaded | ❌ Not obvious how the API should behave if some fraction of graphs fail to compile |
| ✅ Logical decoupling of weights from the model architecture | ❌ API suggests weights are tightly coupled to a model (which they might be under the hood, but that's an implementation detail) |
As listed here, the left column does seem superior :) Just wanted to note this to be explicit about the tradeoffs we're making
Thanks for the summary @a-sully.
SGTM.
We need a way to communicate to web developers that 'constant' tensors are read-only and non-dispatchable tensors. The WebGPU approach would be to introduce a new
Demonstrate how MLTensor can be used to help web developers manage constant data (e.g., trained weights) on-device.
Dependent PRs
MLConstantOperand: Do we need an MLConstantOperand? #668 (comment)
MLTensor: Add MLTensor explainer #754
Motivation
Design
An MLTensor containing constant data will be associated with an MLOperand when that operand is created. At build(), the constant data will be forwarded to the device. The original constant data (i.e., the ArrayBuffer input or the uploaded device data held by the MLTensor) can be discarded immediately after createConstant() succeeds.
Example JS
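The lifecycle described in the Design section can be modeled with a small self-contained sketch. Everything below is a toy model under stated assumptions: `ToyTensor` and `ToyBuilder` are hypothetical stand-ins written for illustration, not the WebNN API, and copying the data inside `createConstant()` is only one possible way to let the caller discard the original. It shows the three steps in order: associate constant data with an operand, discard the original, then forward the captured data to a simulated device at `build()`.

```javascript
// Toy model only: these classes are hypothetical stand-ins, NOT the WebNN API.

class ToyTensor {
  constructor(data) {
    this.data = data; // constant data, e.g. trained weights
    this.destroyed = false;
  }
  // Release the original constant data held by this tensor.
  destroy() {
    this.data = null;
    this.destroyed = true;
  }
}

class ToyBuilder {
  constructor() {
    this.constants = []; // data captured by createConstant()
  }
  // Associate the tensor's constant data with a new operand.
  // Assumption: the data is copied, so the caller may discard the tensor.
  createConstant(tensor) {
    if (tensor.destroyed) {
      throw new Error("tensor was already destroyed");
    }
    const operand = { id: this.constants.length };
    this.constants.push(Float32Array.from(tensor.data));
    return operand;
  }
  // Forward every captured constant to the simulated device.
  build() {
    const deviceMemory = this.constants.map((c) => Float32Array.from(c));
    return { deviceMemory };
  }
}

const weights = new ToyTensor(new Float32Array([0.5, -1.0, 2.0]));
const builder = new ToyBuilder();
const operand = builder.createConstant(weights);
weights.destroy(); // safe in this model once createConstant() has succeeded
const graph = builder.build();
```

In this model the built graph still holds the weight values after `weights.destroy()`, which is the property the design relies on. Whether a real implementation copies, moves, or references the data is left open here.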
Proposed IDL
Edits:
MLOperandDescriptor as required by MLOperand