
# How to reduce costs and improve performance of your Machine Learning (ML) workloads?

In this repo you'll learn how to use AWS Trainium and AWS Inferentia with Amazon SageMaker and Hugging Face Optimum Neuron to optimize your ML workloads. You'll find workshops, tutorials, blog post content, and more that you can use to learn from and to inspire your own solutions.

The content here focuses on particular use cases. If you're looking for standalone model samples for inference and training, please check this other repo: https://github.com/aws-neuron/aws-neuron-samples.
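To give a concrete sense of what Optimum Neuron looks like in practice, here is a minimal sketch of compiling and running a text classifier on Inferentia. The model id, input shapes, and example sentence are illustrative assumptions, not taken from this repo's workshops:

```python
# Minimal sketch: compiling a Hugging Face model for Inferentia with Optimum Neuron.
# Assumptions: optimum-neuron is installed on a Neuron-enabled instance (e.g. inf2);
# the model id and the static input shapes below are illustrative choices.
from optimum.neuron import NeuronModelForSequenceClassification
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # hypothetical model

# export=True runs the Neuron compiler; Neuron requires static input shapes,
# which are passed here as batch_size / sequence_length.
model = NeuronModelForSequenceClassification.from_pretrained(
    model_id,
    export=True,
    batch_size=1,
    sequence_length=128,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Pad to the compiled sequence length so inputs match the static shape.
inputs = tokenizer(
    "This repo made Inferentia easy to try out!",
    return_tensors="pt",
    padding="max_length",
    max_length=128,
    truncation=True,
)
print(model(**inputs).logits)
```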

## Workshops

| Title | Description |
| --- | --- |
| Fine-tune and deploy an LLM from Hugging Face on AWS Trainium and AWS Inferentia | Learn how to create a spam classifier that can be easily integrated into your own application |
| Adapting LLMs for domain-aware applications with AWS Trainium post-training | Learn how to adapt a pre-trained model to your own business needs and add a conversational interface your customers can interact with |

These workshops are supported by AWS Workshop Studio.

## Tutorials

| Accelerator | Description |
| --- | --- |
| inf1 | Extract embeddings from raw text |
| inf1 | Track objects in streaming video using computer vision (CV) |
| inf1 | Create a closed-question Q&A model |
| inf2 | Generate images using Stable Diffusion (SD) |
| inf1 | Answer questions given a context |
| trn1 | Fine-tune an LLM using distributed training (see the sketch after this table) |
| inf2 | Deploy an LLM to Hugging Face Text Generation Inference (TGI) |
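The trn1 tutorial above fine-tunes across NeuronCores with distributed training. As a rough orientation, here is a minimal sketch of what a Trainium fine-tuning script can look like with Optimum Neuron's drop-in Trainer classes; the base model, dataset, and hyperparameters are illustrative assumptions, not the tutorial's actual code:

```python
# train.py -- minimal sketch of fine-tuning on Trainium with optimum-neuron.
# Assumptions: a trn1 instance with the Neuron SDK installed; the model, dataset,
# and hyperparameters below are illustrative, not taken from the tutorial itself.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.neuron import NeuronTrainer, NeuronTrainingArguments

model_id = "bert-base-uncased"  # hypothetical base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Tiny slice of a public dataset, just to keep the sketch self-contained.
ds = load_dataset("imdb", split="train[:1%]")
ds = ds.map(
    lambda batch: tokenizer(batch["text"], padding="max_length",
                            truncation=True, max_length=128),
    batched=True,
)

args = NeuronTrainingArguments(
    output_dir="./out",
    per_device_train_batch_size=8,
    num_train_epochs=1,
    bf16=True,  # bf16 is the usual training precision on Trainium
)

# NeuronTrainer mirrors transformers.Trainer but handles Neuron compilation
# and distributed training across NeuronCores.
NeuronTrainer(model=model, args=args, train_dataset=ds).train()
```

Launched with `torchrun --nproc_per_node=2 train.py` on a trn1.2xlarge, a script like this runs data-parallel across the instance's two NeuronCores.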

## Blog posts

| Description |
| --- |
| Llama3-8B deployment on AWS Inferentia 2 with Amazon EKS and vLLM |
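For a feel of the serving side of that post, here is a minimal sketch of offline inference with vLLM's Neuron backend. The model id, parallelism degree, and shape limits are illustrative assumptions, not the blog post's exact setup, and a vLLM build with Neuron support is required:

```python
# Minimal sketch: running an LLM on Inferentia 2 through vLLM's Neuron backend.
# Assumptions: vLLM built with Neuron support on an inf2 instance; the model id
# and the limits below are illustrative choices.
from vllm import LLM, SamplingParams

prompts = ["The capital of France is", "AWS Inferentia is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # hypothetical model choice
    max_num_seqs=8,
    max_model_len=128,
    block_size=128,
    device="neuron",
    tensor_parallel_size=2,  # shard across two NeuronCores
)

for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)
```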

## Contributing

If you have questions, comments, or suggestions, please feel free to open an issue in this repo.

Also, please refer to the CONTRIBUTING document for further details on contributing to this repository.