Merge pull request #99 from aws-samples/dev

feat: sync from dev to main branch

yike5460 authored Jan 15, 2024
2 parents f224be8 + 8bfaad7 commit aff033a
Showing 86 changed files with 5,524 additions and 2,244 deletions.
1 change: 1 addition & 0 deletions .gitignore


44 changes: 29 additions & 15 deletions README.md
@@ -19,9 +19,10 @@ Make sure Python installed properly. Usage: ./prepare_model.sh -s S3_BUCKET_NAME
./prepare_model.sh -s <Your S3 Bucket Name>

cd source/model/etl/code
sh model.sh ./Dockerfile <EtlImageName> <AWS_REGION>
# Use ./Dockerfile for standard regions and ./DockerfileCN for GCR (Greater China) regions
sh model.sh <./Dockerfile>|<./DockerfileCN> <EtlImageName> <AWS_REGION> <EtlImageTag>
```
The ETL image will be pushed to your ECR repo with the image name you specified when executing the command sh model.sh ./Dockerfile <EtlImageName> <AWS_REGION>, AWS_REGION is like us-east-1, us-west-2, etc.
The ETL image will be pushed to your ECR repo with the image name you specified when executing the command sh model.sh ./Dockerfile <EtlImageName> <AWS_REGION> <EtlImageTag>; AWS_REGION is a region code such as us-east-1 or us-west-2. Dockerfile is for deployment in standard regions, while DockerfileCN is for the GCR (Greater China) regions. For example, to deploy in a GCR region, the command is: sh model.sh ./DockerfileCN llm-bot-cn cn-northwest-1 latest
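
If you are deploying in a standard region, a comparable invocation might look like the sketch below (the image name, region, and tag are illustrative placeholders):
```bash
# Build and push the ETL image for a standard region (values are examples only)
sh model.sh ./Dockerfile etl-model us-east-1 latest
```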


2. Deploy the CDK template (add sudo if you are using Linux); make sure Docker is installed properly
@@ -37,7 +38,7 @@ npx cdk deploy
cd source/infrastructure
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
npm install
npx cdk deploy --rollback false --parameters S3ModelAssets=<Your S3 Bucket Name> --parameters SubEmail=<Your email address> --parameters OpenSearchIndex=<Your OpenSearch Index Name> --parameters EtlImageName=<Your ETL model name>
npx cdk deploy --rollback false --parameters S3ModelAssets=<Your S3 Bucket Name> --parameters SubEmail=<Your email address> --parameters OpenSearchIndex=<Your OpenSearch Index Name> --parameters EtlImageName=<Your ETL model name> --parameters ETLTag=<Your ETL tag name>
```

**Deployment parameters**
@@ -48,6 +49,7 @@
| SubEmail | Your email address to receive notifications |
| OpenSearchIndex | OpenSearch index name to store the knowledge; if the index does not exist, the solution will create one |
| EtlImageName | ETL image name, e.g. etl-model; it is set when you execute the source/model/etl/code/model.sh script |
| EtlTag | ETL tag, e.g. latest, v1.0, v2.0 (the default value is latest); it is set when you execute the source/model/etl/code/model.sh script |
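
As a concrete illustration, a full deployment command with the parameters filled in might look like the sketch below (all values are placeholders to replace with your own):
```bash
npx cdk deploy --rollback false \
  --parameters S3ModelAssets=my-model-assets-bucket \
  --parameters SubEmail=you@example.com \
  --parameters OpenSearchIndex=chatbot-index \
  --parameters EtlImageName=etl-model \
  --parameters ETLTag=latest
```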

You can change us-east-1 to any other available region according to your needs. You will get output similar to the following:
```
@@ -67,27 +69,39 @@ arn:aws:cloudformation:us-east-1:<Your account id>:stack/llm-bot-dev/xx

3. Test the API connection

Use Postman/cURL to test the API connection, the API endpoint is the output of CloudFormation Stack with prefix 'embedding' or 'llm', the sample URL will be like "https://xxxx.execute-api.us-east-1.amazonaws.com/v1/embedding", the API request body is as follows:
Use Postman/cURL to test the API connection. The API endpoint is in the CloudFormation stack outputs, with methods 'extract', 'etl', or 'llm'; a sample URL looks like "https://xxxx.execute-api.us-east-1.amazonaws.com/v1/<method>". The API request body is as follows:

**Offline process to pre-process file specified in S3 bucket and prefix, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/etl**
**Quick Start Tutorial**

**Extract documents from the specified S3 bucket and prefix, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/extract; use the need_split flag to configure whether the extracted document should be split semantically or kept as the original content**
```bash
BODY
{
"s3Bucket": "<Your S3 bucket>", eg. "llm-bot-resource"
"s3Prefix": "<Your S3 prefix>", eg. "input_samples/"
"need_split": true
}
```
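
A minimal cURL sketch of this call, assuming the placeholder endpoint and sample values above (your deployment may additionally require authentication headers):
```bash
curl -X POST "https://xxxx.execute-api.us-east-1.amazonaws.com/v1/extract" \
  -H "Content-Type: application/json" \
  -d '{
        "s3Bucket": "llm-bot-resource",
        "s3Prefix": "input_samples/",
        "need_split": true
      }'
```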

**Offline (asynchronous) process to batch process documents specified by S3 bucket and prefix; the process includes extracting and splitting document content, converting it into vector representations, and injecting it into Amazon OpenSearch (AOS). POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/etl**
```bash
BODY
{
"s3Bucket": "<Your S3 bucket>", eg. "llm-bot-resource"
"s3Prefix": "<Your S3 prefix>", eg. "input_samples/"
"offline": "true",
"qaEnhance": "false"
"qaEnhance": "false",
"aosIndex": "<Your OpenSearch index>", eg. "dev"
}

```
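
A comparable cURL sketch for triggering the offline ETL process, reusing the sample values above (endpoint and values are placeholders):
```bash
curl -X POST "https://xxxx.execute-api.us-east-1.amazonaws.com/v1/etl" \
  -H "Content-Type: application/json" \
  -d '{
        "s3Bucket": "llm-bot-resource",
        "s3Prefix": "input_samples/",
        "offline": "true",
        "qaEnhance": "false",
        "aosIndex": "dev"
      }'
```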


You should see output like this:
You should see output similar to this:
```bash
"Step Function triggered, Step Function ARN: arn:aws:states:us-east-1:xxxx:execution:xx-xxx:xx-xx-xx-xx-xx, Input Payload: {\"s3Bucket\": \"<Your S3 bucket>\", \"s3Prefix\": \"<Your S3 prefix>\", \"offline\": \"true\"}"
```
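
Since the response contains the Step Functions execution ARN, you can poll the execution status with the AWS CLI, for example (substitute the ARN returned above):
```bash
aws stepfunctions describe-execution \
  --execution-arn arn:aws:states:us-east-1:xxxx:execution:xx-xxx:xx-xx-xx-xx-xx \
  --query 'status'
```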

**Then you can query embeddings in AOS, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos**, other operation including index, delete, query are also provided for debugging purpose.
**You can query the embeddings injected into AOS after the ETL process completes; note that the execution time largely depends on the size and number of files, with an estimated 3~5 minutes per document. POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos**; other operations including index, delete, and query are also provided for debugging purposes.
```bash
BODY
{
@@ -145,7 +159,7 @@ You should see output like this:
}
```
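
The AOS operations in this section all go through the same endpoint; a generic cURL sketch is shown below, where body.json is a hypothetical file holding whichever request body you are using (use the HTTP method given in each operation's heading):
```bash
curl -X POST "https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos" \
  -H "Content-Type: application/json" \
  -d @body.json
```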

**Delete initial index in AOS, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos for debugging purpose**
**Delete the initial index in AOS if you want to set up a customized one instead of the built-in index, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos for debugging purposes**
```bash
{
"aos_index": "chatbot-index",
@@ -161,7 +175,7 @@ You should see output like this:
}
```

**Create other index in AOS, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos for debugging purpose, note the index "chatbot-index" will create by default to use directly**
**Create a new index in AOS, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos for debugging purposes; note that the index "chatbot-index" is created by default and can be used directly**
```bash
{
"aos_index": "llm-bot-index",
@@ -179,7 +193,7 @@ You should see output like this:
}
```

**Online process to embedding & inject document directly, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos**
**Ad-hoc process to embed & inject a document directly instead of running the full ETL process, POST https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos**
```bash
BODY
{
@@ -277,7 +291,7 @@ You should see output like this, the metadata.file_path field is matched with th
}
}
```
**Query the embedding with KNN, GET https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos**
**Query the embedding with KNN; note that the floating-point numbers are the vector representation of the query, e.g. "How are you?", GET https://xxxx.execute-api.us-east-1.amazonaws.com/v1/aos**
```bash
{
"aos_index": "llm-bot-index",
@@ -461,7 +475,7 @@ You should see output like this, the index mapping configuration is returned:
There are other operations, including 'bulk', 'delete_index', 'delete_document', etc., for debugging purposes; the sample bodies will be updated soon. Users do not need a proxy instance to access the AOS inside the VPC, since the API Gateway with Lambda proxy integration is wrapped to access the AOS directly.


4. [Optional] Launch dashboard to check and debug the ETL & QA process
1. [Optional] Launch dashboard to check and debug the ETL & QA process

```bash
cd /source/panel
@@ -472,7 +486,7 @@ python -m streamlit run app.py --server.runOnSave true --server.port 8088 --brow
```
Log in at IP/localhost:8088, and you should see the dashboard to operate.

5. [Optional] Upload embedding file to S3 bucket created in the previous step, the format is like below:
2. [Optional] Upload embedding files to the S3 bucket created in the previous step; the command format is as below:
```bash
aws s3 cp <Your documents> s3://llm-bot-documents-<Your account id>-<region>/<Your S3 bucket prefix>/
```
36 changes: 34 additions & 2 deletions source/infrastructure/bin/main.ts
@@ -2,12 +2,14 @@ import { App, CfnOutput, CfnParameter, Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as dotenv from "dotenv";
import { LLMApiStack } from '../lib/api/api-stack';
import { DynamoDBStack } from '../lib/ddb-stack';
import { DynamoDBStack } from '../lib/ddb/ddb-stack';
import { EtlStack } from '../lib/etl/etl-stack';
import { AssetsStack } from '../lib/model/assets-stack';
import { LLMStack } from '../lib/model/llm-stack';
import { VpcStack } from '../lib/shared/vpc-stack';
import { OpenSearchStack } from '../lib/vector-store/os-stack';
import { ConnectorStack } from '../lib/connector/connector-stack';

dotenv.config();

export class RootStack extends Stack {
@@ -33,12 +35,19 @@ export class RootStack extends Stack {
default: 'chatbot-index',
});

const _etlTag = new CfnParameter(this, 'ETLTag', {
type: 'String',
description: 'ETL image tag, the default is latest',
default: 'latest',
});

let _OpenSearchIndexDictDefaultValue: string|undefined;


if (process.env.AOSDictValue !== undefined) {
_OpenSearchIndexDictDefaultValue = process.env.AOSDictValue
} else {
_OpenSearchIndexDictDefaultValue = '{"aos_index_mkt_qd":"aws-cn-mkt-knowledge","aos_index_mkt_qq":"gcr-mkt-qq","aos_index_dgr_qd":"ug-index-3","aos_index_dgr_qq":"faq-index-2"}';
_OpenSearchIndexDictDefaultValue = '{"aos_index_mkt_qd":"aws-cn-mkt-knowledge","aos_index_mkt_qq":"gcr-mkt-qq","aos_index_dgr_qd":"ug-index-20240108","aos_index_dgr_qq":"gcr-dgr-qq", "aos_index_dgr_faq_qd":"faq-index-20240110", "dummpy_key":"dummpy_value"}';
}

const _OpenSearchIndexDict = new CfnParameter(this, 'OpenSearchIndexDict', {
@@ -92,11 +101,25 @@ export class RootStack extends Stack {
_s3ModelAssets:_S3ModelAssets.valueAsString,
_OpenSearchIndex: _OpenSearchIndex.valueAsString,
_imageName: _imageName.valueAsString,
_etlTag: _etlTag.valueAsString,
});
_EtlStack.addDependency(_VpcStack);
_EtlStack.addDependency(_OsStack);
_EtlStack.addDependency(_LLMStack);

const _ConnectorStack = new ConnectorStack(this, 'connector-stack', {
_vpc:_VpcStack._vpc,
_securityGroup:_VpcStack._securityGroup,
_domainEndpoint:_OsStack._domainEndpoint,
_embeddingEndPoints:_LLMStack._embeddingEndPoints || '',
_OpenSearchIndex: _OpenSearchIndex.valueAsString,
_OpenSearchIndexDict: _OpenSearchIndexDict.valueAsString,
env:process.env
});
_ConnectorStack.addDependency(_VpcStack);
_ConnectorStack.addDependency(_OsStack);
_ConnectorStack.addDependency(_LLMStack);

const _ApiStack = new LLMApiStack(this, 'api-stack', {
_vpc:_VpcStack._vpc,
_securityGroup:_VpcStack._securityGroup,
@@ -108,19 +131,28 @@ export class RootStack extends Stack {
_sfnOutput: _EtlStack._sfnOutput,
_OpenSearchIndex: _OpenSearchIndex.valueAsString,
_OpenSearchIndexDict: _OpenSearchIndexDict.valueAsString,
_jobName: _ConnectorStack._jobName,
_jobQueueArn: _ConnectorStack._jobQueueArn,
_jobDefinitionArn: _ConnectorStack._jobDefinitionArn,
_etlEndpoint: _EtlStack._etlEndpoint,
_resBucketName: _EtlStack._resBucketName,
env:process.env
});
_ApiStack.addDependency(_VpcStack);
_ApiStack.addDependency(_OsStack);
_ApiStack.addDependency(_LLMStack);
_ApiStack.addDependency(_DynamoDBStack);
_ApiStack.addDependency(_ConnectorStack);
_ApiStack.addDependency(_DynamoDBStack);
_ApiStack.addDependency(_EtlStack);

new CfnOutput(this, 'VPC', {value:_VpcStack._vpc.vpcId});
new CfnOutput(this, 'OpenSearch Endpoint', {value:_OsStack._domainEndpoint});
new CfnOutput(this, 'Document Bucket', {value:_ApiStack._documentBucket});
// deprecated for now since a proxy in an EC2 instance is not allowed according to policy
// new CfnOutput(this, 'OpenSearch Dashboard', {value:`${_Ec2Stack._publicIP}:8081/_dashboards`});
new CfnOutput(this, 'API Endpoint Address', {value:_ApiStack._apiEndpoint});
new CfnOutput(this, 'WebSocket Endpoint Address', {value:_ApiStack._wsEndpoint});
new CfnOutput(this, 'Glue Job Name', {value:_EtlStack._jobName});
new CfnOutput(this, 'Cross Model Endpoint', {value:_LLMStack._rerankEndPoint || 'No Cross Endpoint Created'});
new CfnOutput(this, 'Embedding Model Endpoint', {value:_LLMStack._embeddingEndPoints[0] || 'No Embedding Endpoint Created'});
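
As the main.ts diff above shows, the root stack only falls back to the built-in OpenSearchIndexDict default when the AOSDictValue environment variable is unset (dotenv is loaded, so a .env entry works as well). A minimal sketch of overriding it before deployment, with purely illustrative index names:
```bash
# Hypothetical override of the OpenSearch index dictionary read by bin/main.ts
export AOSDictValue='{"aos_index_mkt_qd":"my-mkt-qd","aos_index_mkt_qq":"my-mkt-qq","aos_index_dgr_qd":"my-dgr-qd","aos_index_dgr_qq":"my-dgr-qq"}'
npx cdk deploy --rollback false --parameters S3ModelAssets=<Your S3 Bucket Name> --parameters SubEmail=<Your email address> --parameters OpenSearchIndex=<Your OpenSearch Index Name> --parameters EtlImageName=<Your ETL model name> --parameters ETLTag=<Your ETL tag name>
```
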
61 changes: 61 additions & 0 deletions source/infrastructure/lib/api/api-queue.ts
@@ -0,0 +1,61 @@
import { Duration, StackProps } from 'aws-cdk-lib';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as sqs from 'aws-cdk-lib/aws-sqs';
import { Construct } from 'constructs';


export class ApiQueueStack extends Construct {

public readonly sqsStatement: iam.PolicyStatement;
public readonly messageQueue: sqs.Queue;
public readonly dlq: sqs.Queue;

constructor(scope: Construct, id: string) {
super(scope, id);

const dlq = new sqs.Queue(this, 'LLMApiDLQ', {
encryption: sqs.QueueEncryption.KMS_MANAGED,
retentionPeriod: Duration.days(14),
visibilityTimeout: Duration.hours(10),
});

const messageQueue = new sqs.Queue(this, 'LLMApiQueue', {
encryption: sqs.QueueEncryption.KMS_MANAGED,
visibilityTimeout: Duration.hours(3),
deadLetterQueue: {
queue: dlq,
maxReceiveCount: 50,
},
});
messageQueue.addToResourcePolicy(
new iam.PolicyStatement({
effect: iam.Effect.DENY,
principals: [new iam.AnyPrincipal()],
actions: ["sqs:*"],
resources: ["*"],
conditions: {
Bool: { "aws:SecureTransport": "false" }
}
})
);

const sqsStatement = new iam.PolicyStatement({
effect: iam.Effect.ALLOW,
resources: [messageQueue.queueArn],
actions: [
"sqs:DeleteMessage",
"sqs:GetQueueUrl",
"sqs:ChangeMessageVisibility",
"sqs:PurgeQueue",
"sqs:ReceiveMessage",
"sqs:SendMessage",
"sqs:GetQueueAttributes",
"sqs:SetQueueAttributes",
],
});

this.sqsStatement = sqsStatement;
this.messageQueue = messageQueue;
this.dlq = dlq;
}
}
