This repository contains the code and resources for building a machine learning workflow for Scones Unlimited using Amazon SageMaker. The project demonstrates how to construct a robust, end-to-end machine learning pipeline on AWS, from data preprocessing to model deployment.
Image classifiers are crucial in computer vision, identifying the content of images across various industries, including autonomous vehicles, augmented reality, eCommerce, and diagnostic medicine.
As a Machine Learning Engineer at Scones Unlimited, a scone-delivery-focused logistics company, you are tasked with building and deploying an image classification model. This model will optimize operations by identifying the type of vehicle a delivery driver is using, such as a bicycle or a motorcycle, and routing deliveries accordingly. Assigning bicycle deliveries to nearby locations and motorcycle deliveries to farther locations can significantly enhance operational efficiency.
The goal is to create a scalable and safe model that other teams can use on-demand. The model must scale to meet demand, and safeguards should be in place to monitor and control for performance degradation or drift. This project involves using AWS SageMaker to build the model, deploying it, integrating AWS Lambda functions to create supporting services, and using AWS Step Functions to compose these components into an event-driven application. By the end of this project, you will have created a portfolio-ready demo that showcases your ability to build and compose scalable, ML-enabled AWS applications.
- Data Staging: Prepare and stage the data for model training.
- Model Training and Deployment: Train the image classification model using SageMaker and deploy it.
- Lambdas and Step Function Workflow: Develop AWS Lambda functions and integrate them with AWS Step Functions to create a cohesive workflow.
- Testing and Evaluation: Test the entire pipeline and evaluate the model's performance.
- Optional Challenge: Explore additional features or improvements as a challenge.
- Cleanup Cloud Resources: Properly clean up all AWS resources to avoid unnecessary charges.
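As a minimal sketch of one of the supporting Lambda services mentioned above, here is an inference-filtering function that rejects low-confidence predictions before they reach downstream systems. The function name, event shape, and threshold value are illustrative assumptions, not taken from this repository:

```python
import json

THRESHOLD = 0.93  # illustrative confidence threshold, not from the repo


def passes_threshold(inferences, threshold=THRESHOLD):
    """Return True if any class confidence meets or exceeds the threshold."""
    return any(score >= threshold for score in inferences)


def lambda_handler(event, context):
    # The event shape ("body" containing an "inferences" list) is an
    # assumption about the Step Functions payload, not this repo's contract.
    inferences = json.loads(event["body"])["inferences"]
    if not passes_threshold(inferences):
        # Failing loudly lets Step Functions surface low-confidence
        # predictions as execution errors for monitoring.
        raise Exception("THRESHOLD_CONFIDENCE_NOT_MET")
    return {"statusCode": 200, "body": event["body"]}
```

Raising an exception (rather than returning an error code) is one way to make drift visible: a spike in failed executions signals degrading model confidence.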
To get started with this project, clone the repository to your local machine:
```bash
git clone https://github.com/aniketjain12/Build-a-ML-Workflow-For-Scones-Unlimited-On-Amazon-SageMaker-.git
cd Build-a-ML-Workflow-For-Scones-Unlimited-On-Amazon-SageMaker-
```
Before running the project, ensure you have the following:
- AWS CLI
- AWS Account with access to SageMaker
- Python 3.7+
- Jupyter Notebook or JupyterLab
- Run the notebooks: Navigate to the `notebooks` directory and run the Jupyter notebooks to explore the data, train the model, and evaluate the results.
- Deploy the model: Use the provided scripts in the `sagemaker_pipelines` directory to deploy the trained model to Amazon SageMaker.
- Execute the pipeline: Run the complete machine learning pipeline using the provided SageMaker pipeline scripts.
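The event-driven composition of Lambda functions via Step Functions can be sketched as an Amazon States Language definition built in Python. The state names and Lambda ARNs below are placeholders, not the actual resources created by this project:

```python
import json

# Placeholder ARNs -- substitute the ARNs of your deployed Lambda functions.
SERIALIZE_ARN = "arn:aws:lambda:REGION:ACCOUNT:function:serializeImageData"
CLASSIFY_ARN = "arn:aws:lambda:REGION:ACCOUNT:function:classifyImage"
FILTER_ARN = "arn:aws:lambda:REGION:ACCOUNT:function:filterInferences"


def build_state_machine_definition():
    """Chain three Lambdas into a linear Step Functions workflow."""
    return {
        "Comment": "Image classification workflow (illustrative sketch)",
        "StartAt": "SerializeImageData",
        "States": {
            "SerializeImageData": {
                "Type": "Task",
                "Resource": SERIALIZE_ARN,
                "Next": "ClassifyImage",
            },
            "ClassifyImage": {
                "Type": "Task",
                "Resource": CLASSIFY_ARN,
                "Next": "FilterLowConfidence",
            },
            "FilterLowConfidence": {
                "Type": "Task",
                "Resource": FILTER_ARN,
                "End": True,
            },
        },
    }


definition_json = json.dumps(build_state_machine_definition(), indent=2)
```

The resulting JSON string could then be passed as the `definition` argument to `boto3`'s Step Functions `create_state_machine` call, along with a name and an execution role ARN.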
The pipeline includes the following stages:
- Data Preprocessing: Data cleaning and feature engineering using AWS Glue and Amazon S3.
- Model Training: Model training using Amazon SageMaker.
- Model Evaluation: Model evaluation using SageMaker Processing.
- Model Deployment: Deploying the model to an endpoint using SageMaker.
- Monitoring and Logging: Continuous monitoring of the deployed model.
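Once deployed, the endpoint's confidence vector can drive the routing decision described earlier (bicycles to nearby locations, motorcycles to farther ones). The class ordering and helper name below are assumptions for illustration; the real label mapping depends on how the training data was prepared:

```python
# Assumed class order for the two-class model (bicycle vs. motorcycle);
# the actual mapping depends on the training data's label encoding.
CLASSES = ["bicycle", "motorcycle"]


def route_delivery(inferences):
    """Map the endpoint's confidence vector to a routing decision:
    bicycles handle nearby deliveries, motorcycles the farther ones."""
    predicted = CLASSES[max(range(len(inferences)), key=lambda i: inferences[i])]
    return "nearby" if predicted == "bicycle" else "far"
```

For example, a prediction vector of `[0.9, 0.1]` would route the delivery to a nearby location.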
- Model Accuracy: 96%
- Endpoint Cost: $0.119 per hour
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.