feat: support multimodal in docx #510

IcyKallen · 2025-01-14T10:02:54Z

Fixes #

🤖 AI-Generated PR Description (Powered by Amazon Bedrock)

Description

This pull request includes changes related to the knowledge base infrastructure, Lambda job dependencies, and language model integration. The main modifications are:

Updates to the knowledge base stack in the infrastructure layer.
Addition of a new Python package llm_bot_dep for the Lambda job, including new files for figure classification, Mermaid diagram generation, and loader updates.
Removal of the requirements.txt file and updates to the setup.py file in the dep folder.
Changes to the language model integration module in the Lambda online layer.
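
For the loader updates, images first have to be pulled out of the Word document before they can be classified or described. The sketch below illustrates that step under stated assumptions: it relies only on the fact that a .docx file is a ZIP archive with embedded media stored under word/media/, and it uses the standard library rather than PIL. The function name `extract_docx_images` and its parameters are hypothetical and are not taken from this PR.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_docx_images(docx_path, out_dir=None):
    """Copy every file under word/media/ in a .docx into out_dir.

    Hypothetical helper for illustration only. Returns the saved paths,
    sorted by their name inside the archive.
    """
    # Fall back to a fresh temporary directory when no target is given.
    out = Path(out_dir) if out_dir else Path(tempfile.mkdtemp(prefix="docx_images_"))
    out.mkdir(parents=True, exist_ok=True)

    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in sorted(archive.namelist()):
            # Embedded pictures live under word/media/; skip directory entries.
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(archive.read(name))
                saved.append(target)
    return saved
```

Calling `extract_docx_images("report.docx")` would return a list of image paths in a temporary directory. Format conversion (for example with Pillow) could then be applied to each path if the downstream model needs a specific image type.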

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

File Stats Summary

Number of files involved in this PR: 12.

The file changes summary is as follows:

Files	Changes	Change Summary
source/lambda/job/dep/requirements.txt	0 added, 20 removed	This file is removed in this PR
source/lambda/job/dep/MANIFEST.in	4 added, 0 removed	This change includes text and JSON data files from the llm_bot_dep directory and its subdirectories in the package distribution.
source/infrastructure/package.json	2 added, 2 removed	The code change modifies the "deploy" script to include the "--all" flag when running "npx cdk deploy", likely deploying all AWS CloudFormation stacks.
source/lambda/job/dep/setup.py	9 added, 2 removed	This code change updates the package dependencies, includes additional data files (.txt and .json) in the package distribution, and adds the Pillow library for image processing.
source/lambda/job/dep/llm_bot_dep/prompt/mermaid_template.txt	27 added, 0 removed	The code changes introduce a task to analyze a workflow diagram, extract objects and relationships, transform the extracted workflow into Mermaid chart code, and provide a detailed description structured with the workflow description and corresponding Mermaid code.
source/infrastructure/lib/knowledge-base/knowledge-base-stack.ts	1 added, 1 removed	The code change updates the version of the requests-aws4auth and boto3 Python packages and adds the pillow package to the list of additional Python modules installed in the Lambda function environment.
source/lambda/job/dep/llm_bot_dep/loaders/html.py	13 added, 5 removed	The code changes involve importing additional modules (base64, os, pathlib, BeautifulSoup), processing markdown images with an LLM, passing bucket and file names to the load method, and modifying the process_html function to handle a portal bucket name and file name.
source/lambda/online/common_logic/langchain_integration/chat_models/__init__.py	36 added, 27 removed	The code changes introduce improvements to the model creation and loading process, including dynamic module loading, more robust model ID parsing, and support for additional model providers and types.
source/lambda/job/dep/llm_bot_dep/prompt/figure_classification.txt	113 added, 0 removed	The code changes introduce an XML structure containing descriptions of various types of diagrams and images, including flowcharts, sequence diagrams, timelines, class diagrams, state diagrams, Gantt charts, entity-relationship diagrams, XY charts, pie charts, quadrant charts, and non-chart images.
source/lambda/job/dep/llm_bot_dep/loaders/docx.py	44 added, 9 removed	The code changes include importing additional modules (os, sys, pathlib, PIL), adding functionality to convert and save images from the Word document to a temporary directory, modifying the load method to accept bucket_name and file_name parameters, and updating the process_doc function to handle a portal_bucket_name parameter and call the load method with the appropriate arguments.
source/lambda/job/dep/llm_bot_dep/prompt/mermaid.json	1 added, 0 removed	This code change adds support for various types of diagrams and visualizations using the Mermaid syntax, including flowcharts, sequence diagrams, timelines, class diagrams, state diagrams, Gantt charts, entity relationship diagrams, xy charts, pie charts, and quadrant charts.
source/lambda/job/dep/llm_bot_dep/figure_llm.py	236 added, 0 removed	This code change adds functionality to process images in Markdown content using a language model (LLM). It can convert local images to base64 strings, upload them to an S3 bucket, and generate descriptions and visualizations (charts, diagrams) for the images using the LLM. The processed Markdown content includes the image descriptions and visualizations, with the original images replaced by their S3 links.
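
The Markdown image flow summarized for figure_llm.py (encode a local image as base64, get a description from an LLM, upload the image, and swap the local path for the uploaded link) can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `process_markdown_images`, `encode_image`, and the injected `describe` and `upload` callables are hypothetical names, with the callables standing in for the LLM call and the S3 upload.

```python
import base64
import re
from pathlib import Path

# Matches Markdown images such as ![alt text](path/to/image.png)
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def encode_image(path):
    """Return the file's bytes as a base64-encoded ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def process_markdown_images(markdown, base_dir, describe, upload):
    """Replace each local image with its uploaded link plus a description.

    describe(image_b64) -> str and upload(path) -> str are hypothetical
    callables standing in for the LLM request and the S3 upload.
    """
    def replace(match):
        alt, src = match.group(1), match.group(2)
        local = Path(base_dir) / src
        # Leave remote images and missing files exactly as they were.
        if src.startswith(("http://", "https://")) or not local.is_file():
            return match.group(0)
        description = describe(encode_image(local))
        return f"![{alt}]({upload(local)})\n\n{description}"

    return IMAGE_PATTERN.sub(replace, markdown)
```

Passing the LLM and storage steps in as callables keeps the sketch runnable without AWS credentials. In a real job they would presumably wrap a Bedrock model invocation and an S3 upload.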

IcyKallen added 5 commits January 14, 2025 07:13

feat: support image retrieval from docx and html

b56d7c5

feat: update package for etl

02031e3

Merge remote-tracking branch 'origin/dev' into xuhan-dev

5af7630

Merge remote-tracking branch 'origin/dev' into xuhan-dev

812212d

fix: fix chat and optimize deploy

0462682

IcyKallen changed the title ~~feat: support multimodel in docx~~ feat: support multimodal in docx Jan 14, 2025

IcyKallen merged commit 98450fe into dev Jan 14, 2025
11 checks passed

IcyKallen deleted the xuhan-dev branch January 16, 2025 02:28
