Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate Vision Support and Image Upload Handling Using Cloudflare R2 #51

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

jonathanfan-ee
Copy link

This PR enhances the existing models and backend logic, focusing mainly on adding visual capabilities and resolving image upload issues.

Summary of Changes:

  1. Visual Attributes of OpenAI Models: The initial goal was to enable the visual capabilities of OpenAI models by adding visual attributes. This update allows users to send image-based inputs to the model, facilitating the use of vision-based functionalities.

  2. File Uploads Processed through Cloudflare R2:

    • The /api/v1/ai/files endpoint now supports proxying file uploads to Cloudflare R2 instead of using the official AWS S3. This change is designed to allow users to configure independently (Cloudflare R2 setup is relatively simple) and take advantage of Cloudflare's generous free storage quota.
    • The request and response formats have been adjusted to match the official Raycast API's handling of image uploads, ensuring compatibility.
  3. Handling Attachments in chat_completions:

    • The message-building logic in the __build_openai_messages function has been modified. Now, when an attachment ID is received, the corresponding file URL is generated using the public Cloudflare R2 bucket, ensuring the image is available for model processing.
    • This update ensures that images are correctly passed to the OpenAI API in a supported format (image_url) as part of the user request.
  4. Environment Variables for Public Access and Cloudflare R2 Configuration:

    • Several environment variables have been added to support Cloudflare R2 file storage and public access URL configuration:
      • PUBLIC_ACCESS_URL: Stores the public domain name used to access uploaded images. This URL dynamically appends the file key to generate public links.
      • CLOUDFLARE_R2_ACCESS_KEY_ID
      • CLOUDFLARE_R2_SECRET_ACCESS_KEY
      • CLOUDFLARE_R2_BUCKET_NAME
      • CLOUDFLARE_R2_ACCOUNT_ID

该 PR 对现有模型和后端逻辑进行了增强,主要集中在添加视觉能力和解决图像上传问题。

更改摘要:

  1. OpenAI 模型的视觉属性: 最初的目标是通过添加视觉属性来启用 OpenAI 模型的视觉能力。此更新允许用户将基于图像的输入发送到模型,从而促进基于视觉的功能的使用。
  2. 通过 Cloudflare R2 处理文件上传:
    • /api/v1/ai/files 端点现在支持将文件上传代理到 Cloudflare R2,而不是使用官方的 AWS S3。 此变更旨在方便用户独立配置(cloudflare R2 开通较为简单)并利用 Cloudflare 慷慨的免费存储配额。
    • 请求和响应格式已调整,以匹配官方 Raycast API 对图像上传的处理,确保兼容性。
  3. chat_completions 中处理附件:
    • 修改了 __build_openai_messages 函数中的消息构建逻辑。现在,当接收到附件 ID 时,会使用公共 Cloudflare R2 存储桶生成相应的文件 URL,确保图像可供模型处理。
    • 此更新确保图像以支持的格式(image_url)正确传递给 OpenAI API,作为用户请求的一部分。
  4. 环境变量用于公开访问和 Cloudflare R2 配置:
    • 添加了几个环境变量来支持 Cloudflare R2 文件存储和公开访问 URL 配置:
      • PUBLIC_ACCESS_URL: 存储用于访问上传图像的公共域名。此 URL 会动态附加文件密钥以生成公开链接。
      • CLOUDFLARE_R2_ACCESS_KEY_ID
      • CLOUDFLARE_R2_SECRET_ACCESS_KEY
      • CLOUDFLARE_R2_BUCKET_NAME
      • CLOUDFLARE_R2_ACCOUNT_ID

…ta, add Vision support in capabilities, and include o1 series models

对照官方原版信息,更新 `model.py` 中的 OpenAI 模型信息和上下文窗口大小,添加能力中的Vision 支持,并增加 o1 系列模型
… a model info dictionary for OpenAI models

重构 `_get_model_extra_info` 函数,通过使用模型信息字典减少重复代码,并提高代码可维护性。
…nt, update dependencies with imgurpython and aiofiles

为 /api/v1/ai/files 端点添加 imgur 图片上传功能,更新依赖aiofiles
…ponse for image upload initialization

修改 /api/v1/ai/files 端点,返回模拟的成功响应以初始化图像上传
…ndpoint, supporting client file uploads to Cloudflare R2.

模拟官方 /api/v1/ai/files 端点的响应格式,支持客户端文件上传到 Cloudflare R2
…ils.py`, modify the OpenAI message construction function to match the file URL based on the attachment ID, and ensure that image attachments can be uploaded properly.

将图片附件处理整合到现有的 utils.py 中,修改 OpenAI 消息构建函数,根据附件 ID 匹配文件 URL,确保图片附件能够正常上传。
…struct the public access URL for the file, Adjust the model attribute list.

使用环境变量中的公共域名来构建文件的公开访问 URL, 调整模型列表。
尝试将环境变量max_tokens读取值转换为整数,如果失败则使用默认值 1024。如果转换失败,记录警告并使用默认值 1024。
@yufeikang
Copy link
Owner

非常感谢你的贡献。我会在这周内尽快review你的代码。谢谢

@yufeikang yufeikang self-assigned this Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants