This repository contains the materials and code for the JARVIS project, a personal assistant inspired by Marvel’s J.A.R.V.I.S. Over the course of the semester, we will integrate voice recognition (speech-to-text), text-to-speech (TTS), natural language processing (NLP), and various task automation features.
The project is split into two phases:
- API-Powered Assistant (Weeks 1–4): Utilize existing APIs (OpenAI, Whisper, etc.) to quickly develop JARVIS’ core functionality: speech recognition, TTS, and dynamic response generation.
- Offline Assistant (Weeks 5–10): Replace cloud APIs with locally hosted models (using tools like Ollama and Hugging Face) to make JARVIS fully offline, handling everything from speech recognition and TTS to NLP on your own machine.
Along the way, you will gain experience with:
- Voice Recognition and TTS
- Natural Language Processing (NLP)
- Offline Model Hosting (e.g., Ollama, Hugging Face)
- Working with Python libraries, asynchronous code, and APIs
- Automated Task Handling / Command System (see the sketch after this list)
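In its simplest form, the command system is just a keyword-to-function mapping that can later fall back to an LLM for anything unmatched. The sketch below is illustrative only; every name in it is hypothetical, not part of the starter code:

```python
# A toy command dispatcher: map keywords to handler functions.
# All names here are hypothetical, not the project's actual interface.
from datetime import datetime

def tell_time() -> str:
    return datetime.now().strftime("It is %H:%M.")

def greet() -> str:
    return "At your service."

COMMANDS = {"time": tell_time, "hello": greet}

def handle(command: str) -> str:
    for keyword, handler in COMMANDS.items():
        if keyword in command.lower():
            return handler()
    return "I don't know that one yet."  # later: route unmatched input to an LLM

print(handle("JARVIS, what time is it?"))
```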
Slides for supplementary learning can be found here (UMich login required).
- Solid Python Skills: You must be very comfortable with Python (object-oriented concepts, asynchronous programming, etc.).
- Operating System: macOS 11 or later, Windows 10 or later, or any modern Linux distribution.
- API Familiarity: You should have a strong understanding of how APIs work and how to integrate them.
- (Optional) ML Experience: Helpful but not required. We will cover essential ML topics as needed.
Week | Date | Topic | Objective |
---|---|---|---|
1 | 1/26 | Introduction, Setup, Voice Input + TTS | |
2 | 2/2 | Basic Command Handling System with LangChain | |
3 | 2/9 | OpenAI API for Dynamic Response Generation | |
4 | 2/16 | Ollama for Local Hosting | |
5 | 2/23 | Hugging Face Crash Course | Project Checkpoint |
- | - | Spring Break | |
- | - | Spring Break | |
6 | 3/16 | Offline NLP Pipeline | |
7 | 3/23 | Integrating Offline Speech Recognition, TTS, NLP | |
8 | 3/30 | Development Time | |
9 | 4/6 | Development Time | |
10 | 4/13 | Final Expo Prep | Final Deliverable Due |
- | 4/19 | Final Project Exposition 🎉 | Presentation Due |
Note: Weeks 1–4 focus on creating JARVIS with cloud APIs. Weeks 5–10 focus on transitioning to an offline solution.
If setting up your local environment proves challenging, use a cloud notebook like Google Colab or Kaggle.
Clone the repository:

```bash
git clone https://github.com/MichiganDataScienceTeam/W25-JARVIS.git
cd W25-JARVIS
```
We recommend using a virtual environment (requires Python 3.9 or later):
```bash
python3 -m venv env
source env/bin/activate   # Mac/Linux
env\Scripts\activate      # Windows
pip install -r requirements.txt
```
In the initial phase, you will need an OpenAI API key (for GPT), plus any other keys for TTS or speech recognition if not handled locally.
Create a file named .env in the project’s root directory:
```
# .env
OPENAI_API_KEY=your_api_key
OTHER_API_KEYS=...
```
Then, load it in your scripts:
```python
import os

from dotenv import load_dotenv

load_dotenv()  # reads key/value pairs from .env into the environment
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
```
IMPORTANT: Never commit API keys or `.env` files to Git; add `.env` to your `.gitignore`.
When using the `openai` library directly, setting the API key should look like this:

```python
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")
```
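From there, a single request makes a useful smoke test. This is a minimal sketch, assuming `openai>=1.0` and a key loaded from `.env`; the model name is a placeholder, not a course requirement:

```python
# A minimal sketch, assuming openai>=1.0 and OPENAI_API_KEY loaded from .env.
import os

import openai
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use any chat model your key can access
    messages=[{"role": "user", "content": "Introduce yourself as JARVIS in one sentence."}],
)
print(response.choices[0].message.content)
```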
We will make use of LangChain to handle LLM-driven workflows. Once your `.env` is set and loaded, LangChain will automatically pick up environment variables (e.g., `OPENAI_API_KEY`). If not, read the key manually and pass it where the integration expects it:

```python
import os

X_API_KEY = os.getenv("API_KEY_NAME")
# then pass X_API_KEY explicitly where necessary
```
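As an illustration, a minimal LangChain call could look like the sketch below (assuming the `langchain-openai` package is installed; the model name is again a placeholder):

```python
# A minimal sketch, assuming langchain-openai is installed and OPENAI_API_KEY
# is already in the environment (e.g., via load_dotenv()).
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # reads OPENAI_API_KEY automatically
response = llm.invoke("Give me a one-line status report, JARVIS.")
print(response.content)
```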
After learning to integrate cloud APIs, we will shift towards offline hosting. Tools we’ll be using include the following (a minimal example appears after the list):
- Ollama for local large language models.
- Hugging Face Transformers for offline NLP.
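For a first taste of offline inference, a Hugging Face pipeline is the quickest route. This is a sketch only, assuming the `transformers` package and `distilgpt2` as a small stand-in model; later weeks use more capable local models, and Ollama offers a similar local workflow:

```python
# A minimal offline-generation sketch, assuming transformers is installed.
# distilgpt2 is a tiny stand-in model; its weights are fetched once, then cached.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
result = generator("JARVIS, report status:", max_new_tokens=25)
print(result[0]["generated_text"])
```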
By the end of the project, you should have:
- A speech-to-text pipeline that captures voice commands (a capture-and-speak sketch follows this list).
- A text-to-speech engine that vocalizes JARVIS’ responses.
- A command handling system capable of both basic (hard-coded) commands and dynamic commands powered by large language models (cloud or local).
- An offline setup that relies on local models, culminating in a personal assistant that can handle open-ended queries, schedule reminders, and more — entirely on-device.
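To make the first two deliverables concrete, here is a minimal capture-and-speak loop. It is a sketch under stated assumptions (the SpeechRecognition and pyttsx3 packages, PyAudio for microphone access, and an internet connection for the free Google STT endpoint), not the project's final pipeline:

```python
# A minimal voice loop: listen for one utterance, then echo it back with TTS.
# Assumes SpeechRecognition, pyttsx3, and PyAudio are installed and a mic is available.
import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
engine = pyttsx3.init()

with sr.Microphone() as source:
    print("Listening...")
    audio = recognizer.listen(source)

try:
    command = recognizer.recognize_google(audio)  # free Google STT; Whisper also works
    print(f"Heard: {command}")
    engine.say(f"You said: {command}")
    engine.runAndWait()
except sr.UnknownValueError:
    print("Sorry, I didn't catch that.")
```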
- Python Basics: Official Python Documentation
- Speech Recognition: SpeechRecognition Library, OpenAI Whisper
- TTS: pyttsx3, gTTS (for cloud-based TTS)
- Hugging Face: Transformers Documentation
- Ollama: Ollama
- OpenAI Whisper: OpenAI Whisper Repo
- Aarushi Shah – [email protected]
- Muhammad (Abubakar) Siddiq – [email protected]
- Alexander Devine
- Kajal Patel
- Luke Davey
- Naveen Premkumar
- Pear Seraypheap