feat: merge dev to main branch #63

Merged 26 commits on Nov 7, 2023
Changes from all commits
c90da6f
Merge pull request #56 from aws-samples/main
yike5460 Oct 25, 2023
56ebc15
refactor: add lazy load method to save memory
yike5460 Oct 27, 2023
508f859
Merge branch 'dev' of https://github.com/aws-samples/llm-bot into dev
yike5460 Oct 27, 2023
24e50d7
feat: nougat loader and splitter class
yike5460 Oct 27, 2023
ea2680e
feat: add CSV loader to save the content in markdown format
NingLu Oct 29, 2023
45cb325
feat: judge the token before query enhancement
yike5460 Oct 30, 2023
d119f5d
chore: search function in aos & todo items
yike5460 Oct 30, 2023
1e3785d
chore: metadata template with parse logic
yike5460 Oct 31, 2023
28b995e
fix: add openai module
yike5460 Oct 31, 2023
4a2085c
Fix embedding model inference code
IcyKallen Nov 1, 2023
ea2aa28
feat: markdown split based on title & sub-title
yike5460 Nov 1, 2023
9cdcd34
Merge branch 'dev' of https://github.com/aws-samples/llm-bot into dev
yike5460 Nov 1, 2023
6062c28
chore: 1.update whl to dist directly; 2. add table split in splitter
yike5460 Nov 2, 2023
a00a7d9
feat: invoke csv load in glue script
NingLu Nov 2, 2023
95adbc8
chore: update glue job timeout and format the code
NingLu Nov 2, 2023
728cf21
feat: 1.update pdf process in glue; 2.update aos api schema and backe…
yike5460 Nov 2, 2023
0cab8fb
Merge branch 'dev' of https://github.com/aws-samples/llm-bot into dev
yike5460 Nov 2, 2023
9299aa3
fix: fix glue No space left on device issue
yike5460 Nov 3, 2023
3a4a2ea
chore: reorganize loader utils.
IcyKallen Nov 4, 2023
46eff33
chore: update new dep package
IcyKallen Nov 4, 2023
169b0d4
feat: remove blocker to allow pdf be normally processed in glue job
yike5460 Nov 5, 2023
694ae2c
Merge branch 'dev' of https://github.com/aws-samples/llm-bot into dev
yike5460 Nov 5, 2023
1cfc600
chore: update aos api function
yike5460 Nov 5, 2023
7027c51
chore: 1. add full file path in metadata; 2. adjust sfn timeout and e…
yike5460 Nov 6, 2023
1c0eef8
feat: add qa enhance along with para adjustment in glue
yike5460 Nov 6, 2023
ad8b435
chore: add retry to inject aos
yike5460 Nov 7, 2023
98 changes: 68 additions & 30 deletions README.md
@@ -44,80 +44,85 @@

Use Postman or cURL to test the API connection. The API endpoint is the output of the CloudFormation stack with the prefix 'embedding' or 'llm'; a sample URL looks like "https://xxxx.execute-api.us-east-1.amazonaws.com/v1/embedding". The API request bodies are as follows:

**Offline process to pre-process files specified in the S3 bucket and prefix, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/etl**
```bash
BODY
{
"s3Bucket": "<Your S3 bucket>",
"s3Prefix": "<Your S3 prefix>",
"offline": "true"
}
```
You should see output like this:
```bash
"Step Function triggered, Step Function ARN: arn:aws:states:us-east-1:xxxx:execution:xx-xxx:xx-xx-xx-xx-xx, Input Payload: {\"s3Bucket\": \"<Your S3 bucket>\", \"s3Prefix\": \"<Your S3 prefix>\", \"offline\": \"true\"}"
```
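For scripting rather than Postman, the same ETL request can be built in Python. This is a minimal sketch: `build_etl_payload` is a hypothetical helper, the endpoint URL is a placeholder for your stack output, and the actual HTTP call (commented out) assumes the third-party `requests` package.

```python
import json

# Placeholder -- substitute the 'etl' API endpoint from your CloudFormation stack output
ETL_ENDPOINT = "https://xxxx.execute-api.us-east-1.amazonaws.com/v1/etl"

def build_etl_payload(bucket: str, prefix: str, offline: bool = True) -> str:
    """Serialize the request body expected by the /etl endpoint."""
    return json.dumps({
        "s3Bucket": bucket,
        "s3Prefix": prefix,
        # the API expects the flag as a string, per the sample body above
        "offline": "true" if offline else "false",
    })

payload = build_etl_payload("my-doc-bucket", "docs/")
print(payload)

# To actually trigger the Step Function (requires the `requests` package):
# import requests
# resp = requests.post(ETL_ENDPOINT, data=payload)
# print(resp.text)  # "Step Function triggered, Step Function ARN: ..."
```

The response is the Step Function execution ARN shown above, which you can use to track the offline job in the Step Functions console.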

**Embedding uploaded file into AOS, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/embedding, will be deprecated in the future**
```bash
BODY
{
"document_prefix": "<Your S3 bucket prefix>",
"aos_index": "chatbot-index"
}
```
You should see output like this:
```bash
{
"created": xx.xx,
"model": "embedding-endpoint"
}
```

**Then you can query embeddings in AOS, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/embedding**; other operations, including index, delete, and query, are also provided for debugging purposes.
```bash
BODY
{
"aos_index": "chatbot-index",
"operation": "match_all",
"body": ""
}
```
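The same body shape covers every debugging operation on this endpoint. As a sketch (the helper name and the exact set of allowed operations are assumptions inferred from the sample bodies in this README), a small builder keeps the payloads consistent:

```python
import json

def build_embedding_request(index: str, operation: str, body="") -> str:
    """Assemble a request body for the /v1/embedding endpoint.

    The allowed-operation set below is an assumption based on the
    sample bodies shown in this README.
    """
    allowed = {"match_all", "query", "create", "delete"}
    if operation not in allowed:
        raise ValueError(f"Invalid query operation: {operation}")
    return json.dumps({"aos_index": index, "operation": operation, "body": body})

print(build_embedding_request("chatbot-index", "match_all"))
```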

You should see output like this:
```bash
{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 4,
        "successful": 4,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 256,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "chatbot-index",
                "_id": "035e8439-c683-4278-97f3-151f8cd4cdb6",
                "_score": 1.0,
                "_source": {
                    "vector_field": [
                        -0.03106689453125,
                        -0.00798797607421875,
                        ...
                    ],
                    "text": "## 1 Introduction\n\nDeep generative models of all kinds have recently exhibited high quality samples in a wide variety of data modalities. Generative adversarial networks (GANs), autoregressive models, flows, and variational autoencoders (VAEs) have synthesized striking image and audio samples [14; 27; 3; 58; 38; 25; 10; 32; 44; 57; 26; 33; 45], and there have been remarkable advances in energy-based modeling and score matching that have produced images comparable to those of GANs [11; 55].",
                    "metadata": {
                        "content_type": "paragraph",
                        "heading_hierarchy": {
                            "1 Introduction": {}
                        },
                        "figure_list": [],
                        "chunk_id": "$2",
                        "file_path": "Denoising Diffusion Probabilistic Models.pdf",
                        "keywords": [],
                        "summary": ""
                    }
                }
            },
            ...
}
```
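When consuming this response programmatically, only a few fields of each hit usually matter. A small sketch (the helper name is hypothetical; the field paths follow the sample output above) pulls out the chunk text, source file, and score:

```python
def extract_chunks(search_response):
    """Collect the text, source file, and score from each hit of an
    AOS search response shaped like the sample output above."""
    chunks = []
    for hit in search_response["hits"]["hits"]:
        source = hit["_source"]
        chunks.append({
            "text": source["text"],
            "file_path": source["metadata"]["file_path"],
            "score": hit["_score"],
        })
    return chunks

# Trimmed-down version of the response above
sample = {"hits": {"hits": [{"_score": 1.0, "_source": {
    "text": "## 1 Introduction\n\nDeep generative models ...",
    "metadata": {"file_path": "Denoising Diffusion Probabilistic Models.pdf"}}}]}}
print(extract_chunks(sample))
```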

**Delete initial index in AOS, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/embedding, for debugging purposes**
```bash
{
"aos_index": "chatbot-index",
"operation": "delete",
"body": ""
}
```

**Create initial index in AOS, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/embedding, for debugging purposes**
```bash
{
    "aos_index": "chatbot-index",
    "operation": "create",
    "body": {
        "settings": {
            "index": {
                "number_of_shards": 2,
                "number_of_replicas": 1
            }
        },
        "mappings": {
            "properties": {
                "vector_field": {
                    "type": "knn_vector",
                    "dimension": 1024
                }
            }
        }
    }
}
```
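If you recreate the index with different embedding models, the only parts of this body that usually change are the vector dimension and the shard layout. A hedged sketch (the helper name is hypothetical; the `vector_field` name and k-NN mapping mirror the sample above) parameterizes the body:

```python
import json

def build_knn_index_body(dimension: int = 1024, shards: int = 2, replicas: int = 1) -> dict:
    """Build the OpenSearch index body used in the 'create' sample above.

    The dimension must match your embedding model's output size; 1024
    matches the sample in this README.
    """
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        },
        "mappings": {
            "properties": {
                "vector_field": {
                    "type": "knn_vector",
                    "dimension": dimension,
                }
            }
        },
    }

print(json.dumps(build_knn_index_body(), indent=2))
```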

**Invoke LLM with context, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/llm**
```bash
BODY
...
]
}
```
1. Launch dashboard to check and debug the ETL & QA process

```bash
cd /src/panel
```
76 changes: 47 additions & 29 deletions src/etl-stack.ts
@@ -36,58 +36,73 @@ export class EtlStack extends NestedStack {
type: glue.ConnectionType.NETWORK,
subnet: props._subnets[0],
securityGroups: [props._securityGroups],
});

const _S3Bucket = new s3.Bucket(this, 'llm-bot-glue-lib', {
bucketName: `llm-bot-glue-lib-${Aws.ACCOUNT_ID}-${Aws.REGION}`,
blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
});

const extraPythonFiles = new s3deploy.BucketDeployment(this, 'extraPythonFiles', {
sources: [s3deploy.Source.asset('src/scripts/dep/dist')],
destinationBucket: _S3Bucket,
});

// Assemble the extra python files list using _S3Bucket.s3UrlForObject('llm_bot_dep-0.1.0-py3-none-any.whl') and _S3Bucket.s3UrlForObject('nougat_ocr-0.1.17-py3-none-any.whl') and convert to string
const extraPythonFilesList = [_S3Bucket.s3UrlForObject('llm_bot_dep-0.1.0-py3-none-any.whl')].join(',');

const glueRole = new iam.Role(this, 'ETLGlueJobRole', {
assumedBy: new iam.ServicePrincipal('glue.amazonaws.com'),
// the role is used by the glue job to access AOS and by default it has 1 hour session duration which is not enough for the glue job to finish the embedding injection
maxSessionDuration: Duration.hours(12),
});
glueRole.addToPrincipalPolicy(
new iam.PolicyStatement({
actions: [
"sagemaker:InvokeEndpointAsync",
"sagemaker:InvokeEndpoint",
"s3:*",
"es:*",
"glue:*",
"ec2:*",
// cloudwatch logs
"logs:*",
],
effect: iam.Effect.ALLOW,
resources: ['*'],
})
)

// Create a Glue job to process files specified in the S3 bucket and prefix
const glueJob = new glue.Job(this, 'PythonShellJob', {
executable: glue.JobExecutable.pythonShell({
glueVersion: glue.GlueVersion.V3_0,
pythonVersion: glue.PythonVersion.THREE_NINE,
script: glue.Code.fromAsset(path.join(__dirname, 'scripts/glue-job-script.py')),
}),
// Worker type is not supported for pythonshell job commands, and workerType and workerCount must be set together
// workerType: glue.WorkerType.G_2X,
// workerCount: 2,
maxConcurrentRuns: 200,
maxRetries: 1,
connections: [connection],
maxCapacity: 1,
role: glueRole,
defaultArguments: {
'--S3_BUCKET.$': sfn.JsonPath.stringAt('$.s3Bucket'),
'--S3_PREFIX.$': sfn.JsonPath.stringAt('$.s3Prefix'),
'--QA_ENHANCEMENT.$': sfn.JsonPath.stringAt('$.qaEnhance'),
'--AOS_ENDPOINT': props._domainEndpoint,
'--REGION': props._region,
'--EMBEDDING_MODEL_ENDPOINT': props._embeddingEndpoint,
'--DOC_INDEX_TABLE': 'chatbot-index',
'--additional-python-modules': 'langchain==0.0.312,beautifulsoup4==4.12.2,requests-aws4auth==1.2.3,boto3==1.28.69,openai==0.28.1,nougat-ocr==0.1.17,pyOpenSSL==23.3.0,tenacity==8.2.3',
// add multiple extra python files
'--extra-py-files': extraPythonFilesList
}
});
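On the Python side, glue-job-script.py receives each entry in `defaultArguments` as a `--NAME value` pair on the job's command line (Glue scripts typically read them with `awsglue.utils.getResolvedOptions`). As an illustration only, a minimal stand-in parser shows the mechanism; `resolve_options` and the simulated argv are hypothetical, not part of the repo:

```python
def resolve_options(argv, option_names):
    """Minimal stand-in for awsglue.utils.getResolvedOptions:
    Glue passes each default argument as '--NAME value'."""
    args = {}
    for name in option_names:
        flag = f"--{name}"
        if flag in argv:
            args[name] = argv[argv.index(flag) + 1]
    return args

# Simulated command line matching the defaultArguments wired up above
simulated_argv = [
    "glue-job-script.py",
    "--S3_BUCKET", "my-doc-bucket",
    "--S3_PREFIX", "docs/",
    "--QA_ENHANCEMENT", "false",
]
opts = resolve_options(simulated_argv, ["S3_BUCKET", "S3_PREFIX", "QA_ENHANCEMENT"])
print(opts)  # {'S3_BUCKET': 'my-doc-bucket', 'S3_PREFIX': 'docs/', 'QA_ENHANCEMENT': 'false'}
```

Note that the `.$`-suffixed keys (e.g. `--S3_BUCKET.$`) are resolved by Step Functions from the execution input before the job starts, so the script sees plain `--S3_BUCKET` values.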

// Create SNS topic and subscription to notify when glue job is completed
const topic = new sns.Topic(this, 'etl-topic', {
displayName: 'etl-topic',
@@ -111,6 +126,7 @@
'--EMBEDDING_MODEL_ENDPOINT': props._embeddingEndpoint,
'--REGION': props._region,
'--OFFLINE': 'true',
'--QA_ENHANCEMENT.$': '$.qaEnhance',
}),
});

@@ -127,6 +143,7 @@
'--EMBEDDING_MODEL_ENDPOINT': props._embeddingEndpoint,
'--REGION': props._region,
'--OFFLINE': 'false',
'--QA_ENHANCEMENT.$': '$.qaEnhance',
}),
});

@@ -145,7 +162,8 @@
const sfnStateMachine = new sfn.StateMachine(this, 'ETLState', {
definitionBody: sfn.DefinitionBody.fromChainable(sfnDefinition),
stateMachineType: sfn.StateMachineType.STANDARD,
// Align with the glue job timeout
timeout: Duration.minutes(2880),
});

// Export the Step function to be used in API Gateway
29 changes: 20 additions & 9 deletions src/lambda/embedding/main.py
@@ -124,19 +124,30 @@ def lambda_handler(event, context):

    # parse arguments from event
    index_name = json.loads(event['body'])['aos_index']
    operation = json.loads(event['body'])['operation']
    body = json.loads(event['body'])['body']
    aos_client = OpenSearchClient(_opensearch_cluster_domain)
    # re-route GET request to a separate processing branch
    if event['httpMethod'] == 'GET':
        # check whether the operation is a query or match_all against OpenSearch
        if operation == 'query':
            response = aos_client.query(index_name, json.dumps(body))
        elif operation == 'match_all':
            response = aos_client.match_all(index_name)
        else:
            raise Exception(f'Invalid query operation: {operation}')
        return {
            'statusCode': 200,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps(response)
        }
    elif event['httpMethod'] == 'POST':
        if operation == 'delete':
            response = aos_client.delete_index(index_name)
        elif operation == 'create':
            logger.info(f'create index with query: {json.dumps(body)}')
            response = aos_client.create_index(index_name, json.dumps(body))
        else:
            raise Exception(f'Invalid query operation: {operation}')
        return {
            'statusCode': 200,
            'headers': {'Content-Type': 'application/json'},
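The routing introduced in this handler (GET for reads, POST for index management) is easy to exercise without an OpenSearch cluster. A sketch with a stub client, assuming the handler shape in the diff above (`route` and `StubAOSClient` are hypothetical test helpers, not repo code), demonstrates the dispatch:

```python
import json

class StubAOSClient:
    """Stand-in for the OpenSearchClient used by the Lambda handler."""
    def match_all(self, index):
        return {"hits": {"total": {"value": 0}}}
    def delete_index(self, index):
        return {"acknowledged": True}

def route(event, client):
    """Condensed version of the handler's method/operation dispatch."""
    payload = json.loads(event["body"])
    operation = payload["operation"]
    index_name = payload["aos_index"]
    if event["httpMethod"] == "GET":
        if operation == "match_all":
            return client.match_all(index_name)
        raise Exception(f"Invalid query operation: {operation}")
    elif event["httpMethod"] == "POST":
        if operation == "delete":
            return client.delete_index(index_name)
        raise Exception(f"Invalid query operation: {operation}")

event = {"httpMethod": "GET",
         "body": json.dumps({"aos_index": "chatbot-index",
                             "operation": "match_all", "body": ""})}
print(route(event, StubAOSClient()))
```

Keeping the dispatch separable from the OpenSearch client is what makes this kind of offline unit test possible.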