[Inference snippets] Templated snippets for inference snippet generation #1255

Open · wants to merge 25 commits into main · Changes from 20 commits
5 changes: 4 additions & 1 deletion .vscode/settings.json
@@ -17,5 +17,8 @@
"search.exclude": {
"**/dist": true
},
"typescript.tsdk": "node_modules/typescript/lib"
"typescript.tsdk": "node_modules/typescript/lib",
"[handlebars]": {
"editor.defaultFormatter": "mfeckies.handlebars-formatter"
}
}
4 changes: 3 additions & 1 deletion packages/inference/package.json
@@ -52,9 +52,11 @@
"check": "tsc"
},
"dependencies": {
"@huggingface/tasks": "workspace:^"
"@huggingface/tasks": "workspace:^",
"handlebars": "^4.7.8"
@julien-c (Member), Mar 6, 2025:

cc @xenova

Contributor:

That would be awesome! 🤩 Although the library was originally designed for ChatML templates, the set of available features should be large enough for these templates.

Maybe @Wauplin can explain what set of features would be required? 👀

Member:

Basically just ifs and variable replacement, from what I've seen.
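That feature set is tiny. As a hypothetical illustration only (not the engine the PR uses), variable replacement plus if-blocks can be sketched in a few lines of TypeScript:

```typescript
// Hypothetical sketch of the minimal feature set described in the thread:
// {% if name %}...{% endif %} conditionals and {{ name }} substitution.
// Not the actual template engine used by the PR.
function render(template: string, vars: Record<string, unknown>): string {
  // Conditional blocks: keep the body only when the variable is truthy.
  const withIfs = template.replace(
    /\{%\s*if\s+(\w+)\s*%\}([\s\S]*?)\{%\s*endif\s*%\}/g,
    (_, name, body) => (vars[name] ? body : "")
  );
  // Variable replacement: substitute {{ name }} with its value.
  return withIfs.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, name) => String(vars[name] ?? ""));
}
```

For example, `render("{% if token %}Bearer {{ token }}{% endif %}", { token: "hf_xxx" })` yields `"Bearer hf_xxx"`, while an empty context drops the whole block.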

Member:

handlebars has pretty much a feature set of 0.00

Member:

Yeah, I'm really wary of adding more deps; if we can use jinja, that would be great (and we could use jinja for more things).

Contributor (Author):

Will update my PR tomorrow in that direction. As Julien said, I'm not using anything fancy at all, so jinja will be more than enough for the job.

Contributor:

Sounds good! Let me know if I can help in any way 🫡

Contributor (Author):

I replaced the handlebars dependency with @huggingface/jinja and it works like a charm! 86e787a Thanks for the package @xenova! 🤗

Contributor (Author):

Well, I now get the error Couldn't find package "@huggingface/jinja@^0.3.3" required by "@huggingface/inference@*" on the "npm" registry. in CI, even though jinja 0.3.3 is available on https://www.npmjs.com/package/@huggingface/jinja 🤔

},
"devDependencies": {
"@types/handlebars": "^4.1.0",
"@types/node": "18.13.0"
},
"resolutions": {}
45 changes: 45 additions & 0 deletions packages/inference/pnpm-lock.yaml

(Generated lockfile; diff not rendered.)

24 changes: 14 additions & 10 deletions packages/inference/src/lib/makeRequestOptions.ts
@@ -56,13 +56,15 @@ export async function makeRequestOptions(
/** In most cases (unless we pass an endpointUrl) we know the task */
task?: InferenceTask;
chatCompletion?: boolean;
/* Used internally to generate inference snippets (in which case model mapping is done separately) */
__skipModelIdResolution?: boolean;
}
): Promise<{ url: string; info: RequestInit }> {
const { accessToken, endpointUrl, provider: maybeProvider, model: maybeModel, ...remainingArgs } = args;
const provider = maybeProvider ?? "hf-inference";
const providerConfig = providerConfigs[provider];

const { includeCredentials, task, chatCompletion, signal } = options ?? {};
const { includeCredentials, task, chatCompletion, signal, __skipModelIdResolution } = options ?? {};

if (endpointUrl && provider !== "hf-inference") {
throw new Error(`Cannot use endpointUrl with a third-party provider.`);
@@ -81,15 +83,17 @@
}
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
const hfModel = maybeModel ?? (await loadDefaultModel(task!));
const model = providerConfig.clientSideRoutingOnly
? // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
removeProviderPrefix(maybeModel!, provider)
: // For closed-models API providers, one needs to pass the model ID directly (e.g. "gpt-3.5-turbo")
await getProviderModelId({ model: hfModel, provider }, args, {
task,
chatCompletion,
fetch: options?.fetch,
});
const model = __skipModelIdResolution
? hfModel
: providerConfig.clientSideRoutingOnly
? // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
removeProviderPrefix(maybeModel!, provider)
: // For closed-models API providers, one needs to pass the model ID directly (e.g. "gpt-3.5-turbo")
await getProviderModelId({ model: hfModel, provider }, args, {
task,
chatCompletion,
fetch: options?.fetch,
});

const authMethod = (() => {
if (providerConfig.clientSideRoutingOnly) {
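The resolution logic added in this hunk can be sketched in isolation. The sketch below is a simplified stand-in (the Provider union and the body of getProviderModelId are invented), showing how the __skipModelIdResolution flag short-circuits provider model mapping:

```typescript
// Simplified sketch of the PR's model-resolution change; not library code.
type Provider = "hf-inference" | "replicate" | "together";

interface ResolveOptions {
  /* Used internally to generate inference snippets (model mapping is done separately) */
  __skipModelIdResolution?: boolean;
}

// Stand-in for the lookup that normally resolves a Hub model ID to a
// provider-specific ID (in the real code, this may hit an API).
async function getProviderModelId(model: string, provider: Provider): Promise<string> {
  return provider === "hf-inference" ? model : `${provider}-mapped/${model}`;
}

async function resolveModel(
  hfModel: string,
  provider: Provider,
  options?: ResolveOptions
): Promise<string> {
  // Snippet generation already knows the model ID, so it bypasses
  // the per-provider resolution step entirely.
  return options?.__skipModelIdResolution ? hfModel : getProviderModelId(hfModel, provider);
}
```

This keeps snippet generation free of network calls while leaving the normal inference path unchanged.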