feat: support query rewrite for first sentence #436

NingLu · 2024-11-15T03:30:41Z

🤖 AI-Generated PR Description (Powered by Amazon Bedrock)

Description

This pull request includes modifications to various files in the common_logic and langchain_integration directories. The changes primarily involve refactoring and optimizing the existing codebase, with a focus on improving code readability, maintainability, and performance.

Several utility functions and classes have been updated to enhance their functionality and address potential issues or inefficiencies. Additionally, some of the LangChain integration components, such as chains and chat models, have been modified to align with the latest best practices and requirements.

One file, source/lambda/online/common_logic/langchain_integration/chains/stepback_chain.py, encountered an error during these changes. This must be resolved before the pull request is merged.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

File Stats Summary

Number of files involved in this PR: 29.

The file changes summary is as follows:

Files	Changes	Change Summary
source/lambda/online/common_logic/common_utils/logger_utils.py	0 added, 1 removed	The code removes an unnecessary blank line at the end of the function definition.
source/lambda/online/common_logic/langchain_integration/chains/__llm_chain_base.py	0 added, 1 removed	The code change removes an empty line at the end of the `get_chain` function, which is a minor formatting adjustment.
source/lambda/online/common_logic/langchain_integration/chains/query_rewrite_chain.py	4 added, 2 removed	This code change modifies the create_chain method by adding line breaks for better readability and applying a lambda function to the chain for post-processing the query rewrite.
source/lambda/online/common_logic/common_utils/s3_utils.py	12 added, 3 removed	The code changes include formatting improvements through better indentation and line breaks, as well as adding empty lines between function definitions for improved readability.
source/lambda/online/common_logic/common_utils/langchain_utils.py	1 added, 1 removed	The code change modifies the type annotation for the 'keys' key in the 'NestUpdateState' TypedDict, using the 'update_nest_dict' type annotation.
source/lambda/online/common_logic/common_utils/time_utils.py	4 added, 3 removed	The code changes involve formatting modifications: splitting a long line into multiple lines for better readability and fixing indentation for the timezone declaration.
source/infrastructure/lib/knowledge-base/knowledge-base-stack.ts	4 added, 4 removed	The code changes involve creating a Glue job for knowledge base processing, including deploying Python dependencies to an S3 bucket and configuring an IAM role for the Glue job.
source/lambda/online/common_logic/langchain_integration/chains/marketing_chains/mkt_conversation_summary.py	11 added, 5 removed	The code changes introduce a new class `Internlm2Chat7BMKTConversationSummaryChain` and `Claude2MKTConversationSummaryChain` for summarizing conversations, along with modifications to the prompt creation and chain creation methods.
source/lambda/online/common_logic/common_utils/monitor_utils.py	2 added, 2 removed	The code changes modify the formatting of the QQ match result and RAG data in Markdown tables, separating the QQ match result into a distinct section with additional columns for Source File Name and Source URI.
source/lambda/online/common_logic/common_utils/python_utils.py	2 added, 1 removed	The code change adds type hints for function parameters and imports the collections.abc module for checking if a value is a mapping (dictionary-like object).
source/lambda/online/common_logic/langchain_integration/chains/retail_chains/auto_evaluation_chain.py	14 added, 16 removed	The code changes involve importing necessary modules, defining a new class `Claude2AutoEvaluationChain` that inherits from `Claude2ChatChain`, and creating subclasses for different Claude model variants to perform auto-evaluation tasks.
source/lambda/online/common_logic/common_utils/lambda_invoke_utils.py	13 added, 11 removed	The code changes involve formatting updates, such as adding spaces around operators, consistent use of whitespace, and wrapping long lines to improve code readability and maintainability.
source/lambda/online/common_logic/langchain_integration/chains/marketing_chains/mkt_rag_chain.py	8 added, 6 removed	The code changes involve registering prompt templates for the InternLM2 Chat 7B and 20B models for the MTK_RAG task type, defining a custom prompt template for an AWS customer service assistant, creating an Internlm2Chat7BKnowledgeQaChain class that inherits from Internlm2Chat7BChatChain, and defining an Internlm2Chat20BKnowledgeQaChain class that inherits from Internlm2Chat7BKnowledgeQaChain with a different model ID.
source/lambda/online/common_logic/common_utils/constant.py	2 added, 4 removed	This code change introduces constants and enumerations for various purposes, such as LLM task types, message types, LLM model types, embedding model types, index tags, knowledge base types, thresholds for question answering and knowledge retrieval, and error messages related to guide intentions and index descriptions.
source/lambda/online/common_logic/langchain_integration/chains/translate_chain.py	2 added, 1 removed	This code change modifies the create_chain method to apply a postprocessing step that removes leading and trailing double quotes from the output of the LLM chain.
source/lambda/online/common_logic/langchain_integration/chains/chat_chain.py	60 added, 46 removed	The code changes involve various updates to the ChatChain classes for different language models, including adding support for system prompts, chat history handling, and customizing prompt templates.
source/lambda/online/common_logic/langchain_integration/chat_models/bedrock_models.py	9 added, 9 removed	The code changes include minor formatting updates, adding new LLM model classes (MistralLarge2407, Llama3d1Instruct70B, Llama3d2Instruct90B, CohereCommandRPlus) with specific configurations, and decorating the converse and converse_stream methods with llm_messages_print_decorator.
source/lambda/online/common_logic/common_utils/pydantic_models.py	12 added, 9 removed	The code changes introduce a new `QueryRewriteConfig` class that inherits from `LLMConfig` and adds a `rewrite_first_message` boolean field. The `QueryProcessConfig` class now uses `QueryRewriteConfig` instead of `LLMConfig` for the `conversation_query_rewrite_config` field. Additionally, logging statements are added to the `update_retrievers` method, and input validation is added for the `task_name` parameter.
source/lambda/online/common_logic/common_utils/prompt_utils.py	6 added, 8 removed	The code changes introduce a new prompt template called "CQR_SYSTEM_PROMPT" based on a paper, and register it for certain LLM models for the task type "CONVERSATION_SUMMARY_TYPE". Additionally, it removes some commented-out code related to an XML agent prompt.
source/lambda/online/common_logic/langchain_integration/chains/tool_calling_chain_api.py	42 added, 39 removed	The code changes involve refactoring and minor improvements to the Claude2ToolCallingChain class, which is responsible for binding tools to a language model and creating a chain for tool-calling. The changes include formatting adjustments, type annotations, and minor logic tweaks.
source/lambda/online/common_logic/common_utils/response_utils.py	39 added, 37 removed	The code changes include: adding a new exception class WebsocketClientError; fixing indentation in the write_chat_history_to_ddb function; formatting the return dictionaries in the api_response and stream_response functions; adding type hints for function parameters; and handling a potential KeyError by using event_body.get("stream", True) in process_response.
source/lambda/online/common_logic/langchain_integration/chains/retail_chains/retail_conversation_summary_chain.py	28 added, 24 removed	This code defines several chains for summarizing retail conversations using different language models. It imports necessary modules, defines templates for prompts, and creates custom chains that inherit from base LLMChain classes. The chains are designed to process chat histories and generate summaries tailored for retail scenarios.
source/lambda/online/common_logic/langchain_integration/chains/rag_chain.py	24 added, 24 removed	This code change introduces several new classes for different language models and RAG (Retrieval-Augmented Generation) chains, including Baichuan2Chat13B4BitsChatChain, Qwen2Instruct7BChatChain, GLM4Chat9BChatChain, and their respective RAG chain implementations. It also adds new Claude and Llama model variants for specific tasks like sonnet generation and instruction following.
source/lambda/online/common_logic/langchain_integration/chains/retail_chains/retail_tool_calling_chain_claude_xml.py	79 added, 83 removed	The code changes appear to be related to formatting, imports, and modifications to the Claude2RetailToolCallingChain and Mixtral8x7bRetailToolCallingChain classes. Specifically: formatting changes to imports and string literals; modifications to the parse_function_calls_from_ai_message method in Mixtral8x7bRetailToolCallingChain; addition of chat_history_to_string and generate_chat_history methods in Mixtral8x7bRetailToolCallingChain; and changes to default model parameters for Mixtral8x7bRetailToolCallingChain.
source/lambda/online/common_logic/langchain_integration/chains/tool_calling_chain_claude_xml.py	59 added, 56 removed	The code changes include: minor formatting and import changes; updating the SYSTEM_MESSAGE_PROMPT string to use f-strings and improve readability; adding type hints to function parameters and return values; refactoring the format_fewshot_examples function for better code structure; and adding docstrings to some functions for better documentation.
source/lambda/online/common_logic/langchain_integration/chains/intention_chain.py	6 added, 4 removed	The code changes involve importing LLMModelType and LLMTaskType from common_logic.common_utils.constant, fixing a typo in the variable name exmaple_template, and formatting the string interpolation in create_few_shot_example_string to improve readability.
source/lambda/online/common_logic/langchain_integration/chains/conversation_summary_chain.py	45 added, 43 removed	The code changes involve importing necessary modules, defining classes for conversation summary chains with different language models (e.g., InternLM2Chat, Claude, Cohere, Qwen2Instruct, GLM), and creating methods for generating prompts, formatting conversations, and creating message chains.
source/lambda/online/common_logic/common_utils/websocket_utils.py	3 added, 2 removed	The code changes involve adding a blank line for readability, reformatting the boto3 client initialization for better code style, and removing an unnecessary blank line at the end of the file.
source/lambda/online/common_logic/langchain_integration/chains/retail_chains/retail_tool_calling_chain_json.py	101 added, 104 removed	The code changes appear to be related to creating a tool-calling chain for a retail chatbot assistant. The main modifications include: importing a new chat chain class Qwen2Instruct7BChatChain; adding new classes GLM4Chat9BRetailToolCallingChain, Qwen2Instruct72BRetailToolCallingChain, Qwen2Instruct7BRetailToolCallingChain, and Qwen15Instruct32BRetailToolCallingChain for different language models; defining methods to convert OpenAI function tools to GLM format, format few-shot examples, create system prompts, and create chat histories; and implementing logic to parse function calls from AI messages and create chains with specific model configurations.
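
For reviewers, the pydantic_models.py row above (the new `QueryRewriteConfig` with its `rewrite_first_message` flag) can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the code in this PR: it uses stdlib dataclasses instead of the project's pydantic models, the `model_id` field and the `False` default are placeholders, and `should_rewrite_query` is a hypothetical helper showing one plausible way the flag could gate rewriting of the first message.

```python
from dataclasses import dataclass, field


@dataclass
class LLMConfig:
    # Hypothetical stand-in: the real LLMConfig fields are not shown in this summary.
    model_id: str = "example-model"


@dataclass
class QueryRewriteConfig(LLMConfig):
    # Field name comes from the summary; the default of False is an assumption.
    rewrite_first_message: bool = False


@dataclass
class QueryProcessConfig:
    # Per the summary, this field now uses QueryRewriteConfig instead of LLMConfig.
    conversation_query_rewrite_config: QueryRewriteConfig = field(
        default_factory=QueryRewriteConfig
    )


def should_rewrite_query(config: QueryRewriteConfig, chat_history: list) -> bool:
    """Hypothetical helper: decide whether the query-rewrite step should run.

    Assumed behavior: with prior turns, always rewrite; with no prior turns
    (the first message), rewrite only when the flag opts in.
    """
    if chat_history:
        return True
    return config.rewrite_first_message
```

Under these assumptions, `should_rewrite_query(QueryRewriteConfig(rewrite_first_message=True), [])` opts the first message into rewriting, while the default configuration skips it.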

NingLu added 2 commits November 15, 2024 03:28

feat: support query rewrite for first sentence

ff5268b

Merge branch 'dev' into lvn

945806a

NingLu merged commit c3d2aa9 into dev Nov 15, 2024
6 checks passed
