[Roadmap] Alibaba Cloud (Qwen) support #759
Comments
@KenMacD Alibaba Cloud Qwen models are fully supported. Enumeration is automatic: you only need to enter the API key, and you can optionally change the host. It uses the OpenAI compatibility service. This includes cost computation, etc. (Note: I found out that there's caching behind the scenes, but there's no discounted pricing for caching for now.) Please give it a deep test and let me know if all is working to specification.
Wow, that was quick. Thanks! Initial testing around prompts and files has all worked perfectly.
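Since the integration goes through Alibaba Cloud's OpenAI compatibility service (per the comment above), a chat request is just a standard OpenAI-style POST against the DashScope compatible-mode endpoint. The sketch below builds such a request without sending it; the model id `qwen-plus` and the placeholder API key are illustrative assumptions, not values from this thread.

```python
# Minimal sketch (stdlib only): build an OpenAI-style chat completion request
# against Alibaba Cloud's DashScope OpenAI-compatible endpoint.
# "qwen-plus" and the "sk-example" key are placeholders for illustration.
import json
import urllib.request

COMPAT_BASE = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Builds (but does not send) a chat completion request in the OpenAI wire format."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{COMPAT_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("sk-example", "qwen-plus", "Hello")
```

Sending `req` with `urllib.request.urlopen(req)` (with a real API key) would return the usual OpenAI-shaped completion JSON.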
* Together: update models
* Together: update models
* OpenRouter: update visibility
* OpenRouter: support reasoning sideband
* Deepseek: better namings
* Deepseek: fix assistant message alternation
* Relax status check for Azure Openai. Fixes enricoros#744
* OpenRouter: extract models functions
* Together: note
* Azure: move models function
* OpenPipe: extract models file
* Ollama: add description
* Ollama: update models
* Ollama: match vision support
* Autocomplete the tags
* Mistral: update models
* Mistral: hide symlinks
* Add Mistral-3 (24B)
* Fix Autocomplete issue
* 1.92.0-RC1
* /tools folder
* Optima: optimize, add 'gone' functionality
* Fix Mobile Open Pane unnecessary padding
* Docs: add a Data Ownership guide
* Ctrl+L: attach web link
* OpenAI o3: models update
* OpenAI: sorted models
* OpenAI: models sorting
* OpenAI: models change visibility
* OpenAI o3: strip images
* OpenAI o3: max_completion_tokens and developer message
* OpenAI o3: namings. Support complete.
* DeepSeek: reasoning hint
* Beam: brain-ready
* Perplexity: add Sonar Reasoning
* Gemini: undocumented safety
* Thinking: auto-detect blocks
* Composer: fix dependency
* Update README.md
* Update README.md
* Mo ar re al
* LocalAI: improve naming, interfaces
* Update README.md
* Quick update
* Create help-faq.md
* Update help-faq.md
* Link FAQs
* Fix link
* ChatDrawer: sync once a minute so we don't get unexpected regroup flashes
* OpenAI: chatgpt-4o-latest doesn't support tools
* Ollama: JSON mode is dangerous, say it. Fixes enricoros#749
* Attach: auto-detect URLs
* Attach: auto-detect simplify (one button instead of N)
* OpenAI: restore markdown even of missing developer messages
* LLMs: OpenAI: decouple reasoning effort an restore markdown
* Small ux hint
* Gemini: update models
* link ssl3 for builder
* Models config: small ux fix
* Models config: improve costs display
* Models config: improve costs display again
* Models config: improve add service ux
* LocalAI: mark one more
* Modal: add darken bottom
* Models list: verbiage
* Models modal: simplify (disable the 'all services' button)
* Add icon
* Push down: cml background
* Anthropic: less intrusive fallback message
* Wizard: Models
* Reconfigure All Models on hash changes
* Wizard: improve selectors
* StorageUtils: improve display
* Anthropic: minor status message update
* Mistral: improve
* Ollma: improve type
* LocalAI: large UI improvement
* Wizard: improve first time experience
* Wizard: support Local vendors
* LocalAI: fix a p > div
* Wizard: support 'defaults'
* LLM Select: ensure a min width of 96px, and break words if push comes to shove
* Chat AI: change utility model
* AiFn: disabled summarize
* Update MCT
* LLMs: extract assignments slice
* LLMs: rename .service.types
* ModelAux: disable button (prob no effect)
* Models: update benchmark scores
* LLMs: per-domain configuration
* LLMs: port select and options
* LLMs: port the llm dropdown
* LLMs: update the select
* LLMs: roll models
* LLMs: ModelsList for domains
* LLMs: port useFormRadioLlmType
* LLMs: bits
* LLMs: remove useChatLLM for good
* LLMs: adapt PersonaSelector
* LLMs: improve autoconfig
* Improve multichat icon
* chat-store: merge (not replace) conversations from storage
* Groq: update models
* Improve multichat on mobile
* Add Toggle
* o1: re-enable streaming now that OAI supports it
* Stores: cleanup
* Pane Manager: cleanup
* Panes: add an empty split when not branching
* Panes: Zero notices
* Panel: Zero improvement
* Update text
* Space between radios
* Diagram - improve title
* AIX: capitalize dialect in exceptions
* Azure: rename to Azure OpenAI. enricoros#757
* Azure: add note about AI Foundry. enricoros#757
* Types: immutable (deeply)
* roll packages
* roll residuals
* Dockerfile: new env=value format
* Dockerfile: build information
* Dockerfile: deployment type
* GA: remove @next/third-parties/google
* GA: application build stats
* Notice on approximate tokenizer
* FireworksAI: support via custom OpenAI on https://api.fireworks.ai/inference
* FireworksAI: small doc change
* Empty Inline Links renderer
* Shortcuts: fix jumpiness
* xAI: update models
* Block Editor: set FORCE_ENTER_IS_NEWLINE=undefined in the code to disable Shift+Enter to save, and follow the App preferences instead. Fixes enricoros#760.
* Alibaba Cloud support, incl Qwen Max, Plus, Turbo. Fixes enricoros#759
* Alibaba: fix pricing
* Deepseek: update prices
* Groq: update models pricing
* OpenAI: small text updates
* Ollama: update models
* Perplexity: update models
* Move GA
* Fw compat key name
* Rename TenantSlug
* Remove App.pl
* Settings: update
* Nav: breadcrumbs
* Nav: strings
* Shortcuts: Esc comes first
* Mic: disable focus on the Composer Textarea while active
* Mic: Enter/Ctrl+Enter interceptors to Send/Beam
* Revert "Mic: Enter/Ctrl+Enter interceptors to Send/Beam" This reverts commit 93f2cf4.
* LLMs: get from domain
* Code model editing.
* Gemini: thinking models do not do FC
* autoChatFollowUps: code model only
* FormLabelStart: support warnings
* Advanced AI settings: improve all settings
* RenderCode: fix fullscreen
* Gemini: fix model capabilities
* LLM Attachments: stay in tooltip
* Reconfigure Code/Fast if not present after a full reconfig.
* LLM domain autoconfiguration includes the function calling detection
* LLM domain capabilities checking: warn about proceeding with a LLM without requirements, but don't bail
* Show last used chat mode in dev settings.
* Fix max/fullscreen icons
* Optima: Side Paneling
* Optima: large UI cleanups
* Big-AGI logos
* Backport smallie
* Misc simplify
* Auto-scale side menu
* Fix port
* Code Icon
* Remove unused
* Beam: don't re-run when ctrl+enter when editing
* Phosphor: add settings
* Optima: export dropdown slotProps
* Add FormChipControl: swappable for the Radio Control
* T2I settings: use chips for the active service
* T2I settings: remove popup, overflows on mobile
* Draw/Provider: rename
* Draw/Provider: share style
* roll packages
* AppChat: Draw: inline enhancements
* AppChat: Draw: suspend other elements
* AppChat: Draw: support N images
* AppChat: Draw: "draw options" on desktop
* Imagine: fix prompt and algo
* Nav: disable incomplete
* Fix latext/markdown rendering: preserve leading space when re-encoding for 'remark-math'. Fixes enricoros#763
* LLMs: fix 'buttons can wrap'
* Fragments: support placeholders with purpose
* Uniform model icons
* No tips on draw
* Draw: image settings
* Draw: improve #
* Draw: fix
* BeamView: comment for LLMs
* ERC: fix overlapping menus and non-closing menus on rmb click
* CloseablePopup: memo
* LLM Options: just slight better display
* Anthropic: update models
* Anthropic: update 3.7 output size
* Anthropic: auto-created-date
* Anthropic: 3.7 dMessageUtils
* Anthropic: improve flags composition
* LLMs: enable model variants
* LLMs: define, edit, and optionally spec the vendor model parameter 'Anthropic thinking budget'
* LLMs: Anthropic: add the Thinking variant
* AIX: improve user-visible message
* AIX: Anthropic: adapter misc (Documents, unused for now) This pairs with the Citations mechanism, that's not yet added to the wires.
* AIX: Anthropic: framework support for Thinking Budget (nullable number)
* AIX: Anthropic: adapter support for the Thinking Budget
* AIX: Anthropic: wire Request: Thinking blocks
* AIX: Anthropic: wire Response: Thinking/RedactedThinking blocks - NOT matched by AixWire_Particles AND NOR AixWire_Parts
* LLMs: don't control temperature when controlling Anthropic's Thinking Budget (temp=1)
* Chat AI settings: categories
* Chat AI settings: renames
* Chat AI: keep last Thinking block only (default)
* LLMs: document interfaces
* Fragments: small fix
* Fragments: finalize the Aux fragment
* AIX: mirror the Aux fragment
* AIX: TRR particle transmitter/reassembler
* AIX: Anthropic: parser: exhaustive checks
* AIX: Anthropic: parser: S/NS TRR particles
* Render Block parts
* AIX: TRR particle reassembler fix
* Fragments: have to deal with this string[]
* AIX: Dispatch/CGR: adapters for Thinking Blocks (only Anthropic is implemented) Note: the ModelAux/reasoning block is only sent if there's a signature or there is redacted data. We could even further reduce its sending to only Anthropic llms in CGR.
* Fragments: fix types
* LLM Params Editor: support simplify
* FormLabelStart: optimize
* Incognito: improve appearance
* React: fix useRef for React 19
* roll: Types for React 19
* roll: Lock NextJS to 15.1
* roll: misc deep
* AppChat: improve borders
* Optima Dropdown: faster, better style
* LLM types: small sort
* OpenAI: official 4.5 support
* Metrics: store dtStart and vTOutInner where available
* Metrics: render tok/s and wait
* Metrics: improve render
* Metrics: hmm
* Metrics: compensate reasoning tokens
* Metrics: require at least 40 tokens to compute speed (and it's a very low bar
* Metrics: show the speed section also if the wait exceeded 10 seconds
* OpenAI: fix model order

---------

Co-authored-by: Enrico Ros <[email protected]>
Co-authored-by: Jay Chen <[email protected]>
Why
Allow users to use the Qwen models available directly through Alibaba Cloud.
Description
Alibaba is the world's third-largest cloud provider. The group's AI models are available through both an OpenAI-compatible endpoint and their own SDK.
At the moment, the models listed by
https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models
follow, with the context sizes and parameters available in the docs.
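The `/models` endpoint above returns the standard OpenAI-style list format (`{"object": "list", "data": [...]}`), so extracting the model ids for enumeration is a one-liner. The sketch below parses a sample response; the sample ids mirror the Qwen Max/Plus/Turbo tiers mentioned in this thread, while the exact full listing is what the live endpoint returns.

```python
# Sketch: parse an OpenAI-style /models listing into model ids.
# The sample payload below is illustrative (ids from the Qwen tiers named in
# this issue), not a captured response from the live endpoint.
import json

sample_response = json.dumps({
    "object": "list",
    "data": [
        {"id": "qwen-max", "object": "model"},
        {"id": "qwen-plus", "object": "model"},
        {"id": "qwen-turbo", "object": "model"},
    ],
})

def model_ids(raw: str) -> list[str]:
    """Returns the ids of all models in an OpenAI-format list response."""
    return [m["id"] for m in json.loads(raw)["data"]]
```

Automatic enumeration then amounts to GETting the endpoint with a `Bearer` key and feeding the body to `model_ids`.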