Chatbot that remembers conversation history #64

Closed
rchan26 opened this issue Aug 31, 2023 · 3 comments

rchan26 commented Aug 31, 2023

In the current llama-index setup, the conversation history is not tracked, so each question independently queries the database for an answer. It would be interesting to investigate how we can have a conversation with the data (multiple back-and-forth exchanges instead of a single question and answer).

Looking at the llama-index documentation, it seems to have some ability to do this: https://gpt-index.readthedocs.io/en/latest/core_modules/query_modules/chat_engines/root.html

Would need to replace the query_engine calls with chat_engine (a sketch of this swap is below). Would also need to play around with something like the ReAct Agent (llama-index has a few implemented), which decides how the chatbot will interact with the database during the conversation.
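A minimal sketch of what the swap could look like, assuming documents have already been loaded into a vector index (API as in llama-index ~0.8, per the docs linked above; the `"data"` path and the question strings are placeholders):

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Build an index over our documents ("data" is a placeholder path).
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Current approach: every call is an independent query against the index.
query_engine = index.as_query_engine()
print(query_engine.query("What is the project about?"))

# Proposed approach: the chat engine keeps the conversation history,
# so follow-up questions can build on earlier turns.
chat_engine = index.as_chat_engine()
print(chat_engine.chat("What is the project about?"))
print(chat_engine.chat("Can you expand on that?"))  # follow-up uses history
```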


rwood-97 commented Sep 1, 2023

See here for examples of using the chat engine.
The 'condense_question' and 'context' modes seem to be the ones which force llama2 to use the query engine (i.e. answer from the database data rather than just pre-trained/pre-existing knowledge); see the sketch below.
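For reference, a minimal sketch of selecting these two modes, reusing the `index` from the snippet above (the question string is a placeholder):

```python
# Both modes ground answers in the index rather than in the LLM's
# pre-trained knowledge, but they use the index differently.
condense_engine = index.as_chat_engine(chat_mode="condense_question")
context_engine = index.as_chat_engine(chat_mode="context")

print(condense_engine.chat("What data sources does the project use?"))
print(context_engine.chat("What data sources does the project use?"))
```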

@rwood-97
Copy link
Contributor

rwood-97 commented Sep 1, 2023

Have played around with this a bit more in a new notebook here.

I think 'context' basically retrieves a load of context info from our database and then uses that to answer the question (i.e. the model is called once per 'chat', and it's essentially "here's a load of context, can you answer this").
'condense_question' seems to be more like just using the query engine (i.e. the model is called multiple times, once for each piece of context). For the first query I think it is basically the same as the query engine. But if you follow up, it 'condenses' your follow-up question with the chat history and then uses that as the new query for the query engine.

Overall, I think 'context' mode seems better.
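A small sketch of what a multi-turn conversation in 'context' mode looks like (again reusing `index` from the earlier snippets; the questions are placeholders):

```python
# Each turn retrieves context from the index, and the engine keeps the
# running chat history so follow-ups can refer back to earlier answers.
chat_engine = index.as_chat_engine(chat_mode="context")

print(chat_engine.chat("Summarise the documentation in the database."))
print(chat_engine.chat("Which part of that summary covers installation?"))
chat_engine.reset()  # start a fresh conversation, discarding the history
```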


rchan26 commented Sep 11, 2023

Some examples of using the chat engine from #66. Will continue using the "context" engine as it seems the most consistent.

ReAct seems to be quite volatile and doesn't always make the best decisions about whether or not to use the query engine. But it is noted here that this really depends on the quality of the LLM. We do get better performance using 13B models over 7B (quantized) ones, so perhaps it could be better in the future if we have access to higher-quality quantized LLMs. A sketch of inspecting its decisions is below.
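A minimal sketch for watching what the ReAct engine decides to do, assuming the same `index` as in the earlier snippets:

```python
# verbose=True prints the agent's reasoning steps, which makes it easy
# to see whether it actually chose to call the query engine on a turn.
react_engine = index.as_chat_engine(chat_mode="react", verbose=True)
print(react_engine.chat("What does the documentation say about setup?"))
```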

While working on this, we noticed an issue with the prompt creation in the chat engine. This has been fixed in this PR by @rwood-97 and me.
