WIP - OpenAI Assistants Agent #4131

Draft · wants to merge 8 commits into base: main
Conversation

lspinheiro
Collaborator

Why are these changes needed?

Related issue number

Checks

@lspinheiro
Collaborator Author

@ekzhu @jackgerrits, this is a very early draft; I have some questions before proceeding further.

  1. What to do w.r.t. the model client? The chat completion client abstraction doesn't seem to fit well: it makes assumptions about how messages are handled in the interface, while the Assistants API takes a very different approach built around threads (I have spent a lot of time trying to adapt it without success). I'm also not sure we can define a general interface for agent-like APIs. Should I create a specific one in autogen_ext to abstract away the OpenAI SDK? I'm not sure what the value would be, but it also feels like I'm adding an implementation without a proper standard/abstraction.

  2. How do we want to handle file search, especially ingestion? That also seems like something we don't have a strong abstraction for. I'm not sure whether it should fit into how we will integrate RAG. I also don't know whether file ingestion should be part of the agent interaction API through on_messages, or a separate method used during agent setup before running the chat.

  3. The next step for me is to map tool calling to the autogen core framework. It looks like the Azure OpenAI API integrates with Logic Apps to actually call functions as tools. Should this be a future feature in autogen_ext?

@ekzhu
Collaborator

ekzhu commented Nov 11, 2024

Thanks. I think we can follow the design in the Core cookbook for the OpenAI assistant agent: https://microsoft.github.io/autogen/dev/user-guide/core-user-guide/cookbook/openai-assistant-agent.html. The API should stay simple, without introducing additional abstractions on our side.

class OpenAIAssistantAgent:
    name: str
    description: str
    client: openai.AsyncClient
    assistant_id: str
    thread_id: str
    tools: List[Tool] | None = None
    code_interpreter: ... | None = None  # configuration class from OpenAI client
    file_search: ... | None = None  # configuration class from OpenAI client

We don't need to introduce additional abstractions because OpenAI Assistant is specific to OpenAI and Azure OpenAI services -- we should stick with the official clients they provide. Furthermore, we shouldn't expect the agent to be the only interface to assistant features such as file search, since the application may also perform other functions such as file upload and thread management.
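Under that constraint, the agent class could be little more than glue around the official SDK. A minimal sketch, assuming the v1.x `openai` package's beta Assistants endpoints (`on_message` and the wrapper shape are illustrative, not a final autogen interface):

```python
# Sketch only: assumes the official `openai` Python SDK (v1.x) and its beta
# Assistants endpoints (`client.beta.threads...`); `on_message` is an
# illustrative name, not an autogen interface.
from typing import TYPE_CHECKING, Dict, List, Optional

if TYPE_CHECKING:
    import openai  # only needed for type checking, not at runtime


class OpenAIAssistantAgent:
    """Thin agent wrapper around an existing OpenAI assistant and thread."""

    def __init__(
        self,
        name: str,
        description: str,
        client: "openai.AsyncClient",
        assistant_id: str,
        thread_id: str,
        tools: Optional[List[Dict]] = None,
    ) -> None:
        self.name = name
        self.description = description
        self._client = client
        self._assistant_id = assistant_id
        self._thread_id = thread_id
        self._tools = tools or []

    async def on_message(self, content: str) -> str:
        # Append the user message to the existing thread.
        await self._client.beta.threads.messages.create(
            thread_id=self._thread_id, role="user", content=content
        )
        # Run the assistant on the thread and wait until the run finishes.
        run = await self._client.beta.threads.runs.create_and_poll(
            thread_id=self._thread_id, assistant_id=self._assistant_id
        )
        if run.status != "completed":
            raise RuntimeError(f"Run ended with status: {run.status}")
        # Fetch the newest message, which is the assistant's reply.
        messages = await self._client.beta.threads.messages.list(
            thread_id=self._thread_id, order="desc", limit=1
        )
        return messages.data[0].content[0].text.value
```

Note the application keeps ownership of assistant and thread creation; the agent only appends messages to and runs an existing thread.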

  1. What to do w.r.t. the model client? The chat completion client abstraction doesn't seem to fit well: it makes assumptions about how messages are handled in the interface, while the Assistants API takes a very different approach built around threads (I have spent a lot of time trying to adapt it without success). I'm also not sure we can define a general interface for agent-like APIs. Should I create a specific one in autogen_ext to abstract away the OpenAI SDK? I'm not sure what the value would be, but it also feels like I'm adding an implementation without a proper standard/abstraction.

Use the official OpenAI client and do not introduce new abstractions on our side besides the new agent class.

  2. How do we want to handle file search, especially ingestion? That also seems like something we don't have a strong abstraction for. I'm not sure whether it should fit into how we will integrate RAG. I also don't know whether file ingestion should be part of the agent interaction API through on_messages, or a separate method used during agent setup before running the chat.

This should be mostly done using the official OpenAI client in the user's application. We can potentially add new assistant tools that use the client.
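As a concrete example of keeping ingestion in user code, a hedged sketch against the beta vector-store endpoints (these have shifted between SDK releases, so treat the exact paths as assumptions):

```python
# Sketch only: file ingestion for assistant file search, done entirely in
# user code with the official `openai` SDK's beta vector store endpoints.
from pathlib import Path
from typing import TYPE_CHECKING, List

if TYPE_CHECKING:
    import openai  # only needed for type checking, not at runtime


async def ingest_files(
    client: "openai.AsyncClient", assistant_id: str, paths: List[Path]
) -> str:
    """Upload local files into a vector store and attach it to the assistant."""
    store = await client.beta.vector_stores.create(name="assistant-files")
    for path in paths:
        # Upload each file with the "assistants" purpose, then add it to the store.
        with path.open("rb") as f:
            uploaded = await client.files.create(file=f, purpose="assistants")
        await client.beta.vector_stores.files.create(
            vector_store_id=store.id, file_id=uploaded.id
        )
    # Point the assistant's file_search tool at the populated vector store.
    await client.beta.assistants.update(
        assistant_id,
        tool_resources={"file_search": {"vector_store_ids": [store.id]}},
    )
    return store.id
```

Since this lives outside the agent, the same client can handle thread management and cleanup without going through the agent interface.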

  3. The next step for me is to map tool calling to the autogen core framework. It looks like the Azure OpenAI API integrates with Logic Apps to actually call functions as tools. Should this be a future feature in autogen_ext?

We should make sure we can use our Tool class tools in this new agent.
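One way to satisfy that is a small translation layer from a tool's JSON schema to the Assistants function-tool format. A sketch, where the input shape (`name`/`description`/`parameters`) is an assumption for illustration rather than the actual autogen_core `ToolSchema`:

```python
# Sketch only: convert an autogen-style tool schema into the OpenAI
# Assistants "function" tool definition. The input dict shape is an
# assumption, not the real autogen_core ToolSchema.
from typing import Any, Dict


def to_assistant_tool(tool_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a tool schema (name/description/parameters) as an Assistants tool."""
    return {
        "type": "function",
        "function": {
            "name": tool_schema["name"],
            "description": tool_schema.get("description", ""),
            "parameters": tool_schema.get(
                "parameters", {"type": "object", "properties": {}}
            ),
        },
    }


# Example: a hypothetical weather tool schema.
weather_schema = {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
assistant_tool = to_assistant_tool(weather_schema)
```

The resulting dicts can be passed straight to the assistant's `tools` list; executing the tool calls back through autogen's `Tool.run` would be the remaining half of the mapping.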

Overall, the goal is to bring OpenAI assistant agents into our ecosystem, not to build a new wrapper around the Assistants API.
