Co-authored-by: Vivek Upadhyay <[email protected]> #20

Open
wants to merge 1 commit into
base: master
62 changes: 42 additions & 20 deletions README.md
@@ -12,6 +12,10 @@ A curated list of practical deep learning and machine learning project ideas

---

## I am contributing something to it

This tool is best for pair programming.

## Contents

- [Hackathon Ideas](#hackathon-ideas) - Project ideas unlocked by the use of Large Language Models, especially text to text. Note that a lot of the text to text ideas can also be built a lot better with LLMs now!
@@ -33,59 +37,59 @@ A curated list of practical deep learning and machine learning project ideas
## Hackathon Ideas

- **Developer Ideas**

- Text to cmd for terminal: Take user intent in terminal e.g.
```bash
$ask "how to list all files with details"
> Execute "ls -l"? [y/N] y
$ls -l
```
- Build and edit YAMLs using natural language e.g. Kubernetes and other forms of config files
- [Kor](https://eyurtsev.github.io/kor/) for ideas on how this is done for JSON
- Can be use-case specific. Build pipelines? Kube?

- Mobile Android/iOS SDK for Stable Diffusion inference
- Apple has released a [CoreML Stable Diffusion Inference](https://github.com/apple/ml-stable-diffusion)

- **Voice powered Experiences**

- Audio conversation with ChatGPT, can combine with fast Text-to-Speech e.g. [Eleven Labs](https://elevenlabs.io) to have a two-way conversation
- Telegram/WhatsApp bot to get audio and save as text with metadata into mem.ai or Roam Research or Obsidian

- Edit an image by giving instructions of what you want to do: [SeeChatGPT](https://github.com/Nischaydnk/SeeChatGPT) and [playgroundai.com](https://playgroundai.com) as examples
- The underlying mechanism which you can use is called [InstructPix2Pix](https://huggingface.co/spaces/timbrooks/instruct-pix2pix)

- Semantic search over any media

- Can build using CLIP or [BLIP-2 embeddings](https://huggingface.co/docs/transformers/main/model_doc/blip-2) for images and [CLAP](https://github.com/LAION-AI/CLAP/tree/clap#quick-start) for all audio including music and speech

- Text to Music Generation
- See [MusicLM](https://google-research.github.io/seanet/musiclm/examples/) for reference

- **Knowledge Base QA** aka Answer Engines

- Take any plaintext dataset e.g. State of the Union address and build on top of that
![image](https://user-images.githubusercontent.com/3250749/223094577-8126570b-f7a4-48ad-9f77-ff86a8b21161.png)
- Can use this over Video Subtitles to search and QA over videos as well, by mapping back to source

- **Guided Summarisation/Rewriting**

- Take specific questions which the user might have about a large text dataset e.g. a novel or book and include that in your summary of the piece
- Pay attention to specific entities and retell the events which happen in a story with attention to that character

- **ControlNet + Stable Diffusion for Aesthetic Control**

- Build tooling using [diffusers](https://github.com/huggingface/diffusers/) which takes in a set of photos, finetunes a model (LoRA) on a person, detects face and moves it to a new aesthetic e.g. futuristic neon punk, grunge rock, Studio Ghibli. Can also add InstructPix2Pix to give user more control.
- **Text to Code/SQL**

- Use code understanding to convert a user query to SQL or another executable programming language, including Domain Specific Languages
- Here is an example of the same: [qabot](https://github.com/hardbyte/qabot)
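
For the Text to Code/SQL idea, the part worth prototyping first is the guard between the generated query and the database. The sketch below is only an illustration under assumptions: `generate_sql` is a hypothetical stand-in for an LLM call (it is not how qabot works), and the `users` table is made up for the demo.

```python
import sqlite3


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for an LLM call that turns a question into SQL."""
    canned = {
        "how many users signed up in 2023?":
            "SELECT COUNT(*) FROM users WHERE signup_year = 2023",
    }
    # Raises KeyError for anything outside the canned demo questions.
    return canned[question.strip().lower()]


def is_read_only(sql: str) -> bool:
    """Accept a single SELECT statement; reject everything else."""
    statement = sql.strip().rstrip(";")
    return statement.lower().startswith("select") and ";" not in statement


def answer(question: str, conn: sqlite3.Connection) -> list:
    """Translate the question, check the query, then run it."""
    sql = generate_sql(question)
    if not is_read_only(sql):
        raise ValueError(f"refusing to run non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory database (made-up schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, signup_year INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("asha", 2022), ("ben", 2023), ("chen", 2023)],
    )
    return conn


if __name__ == "__main__":
    print(answer("How many users signed up in 2023?", demo_connection()))
```

A prefix check like this is a coarse filter, not a security boundary; a real project would probably also open the database in read-only mode.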

## Text

- **Autonomous Tagging of StackOverflow Questions**

- Make a multi-label classification system that automatically assigns tags for questions posted on a forum such as StackOverflow or Quora.
- Dataset: [StackLite](https://www.kaggle.com/stackoverflow/stacklite) or [10% sample](https://www.kaggle.com/stackoverflow/stacksample)

- **Keyword/Concept identification**

- Identify keywords from millions of questions
- Dataset: [StackOverflow question samples by Facebook](https://www.kaggle.com/c/facebook-recruiting-iii-keyword-extraction/data)
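
As a starting point for keyword identification, a TF-IDF baseline is easy to build before trying anything learned. This is a minimal sketch with only the standard library; the tokenizer regex and the alphabetical tie-break are assumptions made for the example, not part of any dataset's official baseline.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Keep '+' and '#' so tags like "c++" and "c#" survive tokenization.
    return re.findall(r"[a-z0-9+#]+", text.lower())


def top_keywords(docs: list[str], k: int = 3) -> list[list[str]]:
    """Return the k highest TF-IDF terms for each document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    results = []
    for tokens in tokenized:
        if not tokens:
            results.append([])
            continue
        tf = Counter(tokens)
        scores = {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:k])
    return results


if __name__ == "__main__":
    questions = [
        "how to merge two dicts in python",
        "how to merge two branches in git",
        "how to reverse a list in python",
    ]
    for question, keywords in zip(questions, top_keywords(questions, k=2)):
        print(question, "->", keywords)
```

Terms that appear in every question ("how", "to", "in") get a score of zero, which is why no stop-word list is needed at this scale.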

@@ -96,19 +100,23 @@ A curated list of practical deep learning and machine learning project ideas
### Natural Language Understanding

- **Sentence to Sentence semantic similarity**

- Can you identify question pairs that have the same intent or meaning?
- Dataset: [Quora question pairs](https://www.kaggle.com/c/quora-question-pairs/data) with similar questions marked

- **Fight online abuse**

- Can you confidently and accurately tell whether a particular comment is abusive?
- Dataset: [Toxic comments on Kaggle](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge)

- **Open Domain question answering**

- Can you build a bot which answers questions according to the student's age or her curriculum?
- [DrQA](https://github.com/facebookresearch/DrQA) by Facebook's FAIR is built in a similar way for Wikipedia.
- Dataset: [NCERT books](https://ncert.nic.in/textbook.php) for K-12/school students in India, [NarrativeQA by Google DeepMind](https://github.com/deepmind/narrativeqa) and [SQuAD by Stanford](https://rajpurkar.github.io/SQuAD-explorer/)

- **Automatic text summarization**

- Can you create a summary with the major points of the original document?
- Abstractive (write your own summary) and Extractive (select pieces of text from original) are two popular approaches
- Dataset: [CNN and DailyMail News Pieces](http://cs.nyu.edu/~kcho/DMQA/) by Google DeepMind
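
To make the extractive approach concrete, here is a minimal frequency-based sketch: score each sentence by the average corpus frequency of its words and keep the top ones in their original order. The scoring rule is a simple assumption chosen for illustration; it does no stop-word handling, so it tends to favour sentences with common words.

```python
import re
from collections import Counter


def extractive_summary(text: str, max_sentences: int = 2) -> str:
    """Select the highest-scoring sentences and return them in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average word frequency, so long sentences are not favoured by length.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score (earlier sentence wins ties),
    # then restore document order for the chosen ones.
    ranked = sorted(range(len(sentences)), key=lambda i: (-score(sentences[i]), i))
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    article = "Cats sleep a lot. Cats also hunt mice. The weather was mild."
    print(extractive_summary(article, max_sentences=2))
```

An abstractive system would instead generate new sentences, which is where sequence-to-sequence models come in.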
@@ -117,7 +125,7 @@ A curated list of practical deep learning and machine learning project ideas
- Generate plausible new text which looks like some other text
- Obama Speeches? For instance, you can create a bot which writes some [new speeches in Obama's style](https://medium.com/@samim/obama-rnn-machine-generated-political-speeches-c8abd18a2ea0)
- Trump Bot? Or a Twitter bot which mimics [@realDonaldTrump](http://www.twitter.com/@realdonaldtrump)
- Narendra Modi bot saying "_doston_"? Start by scraping his _Hindi_ speeches from his [personal website](http://www.narendramodi.in)
- Example Dataset: [English Transcript of Modi speeches](https://github.com/mgupta1410/pm_modi_speeches_repo)

Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/) for some hints.
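
Before training an LSTM as in the linked post, a word-level Markov chain makes a quick baseline for "text which looks like some other text". Note that this is a deliberately simpler technique than the recurrent models above, and the sample corpus below is made up for the demo.

```python
import random
from collections import defaultdict


def build_chain(text: str) -> dict[str, list[str]]:
    """Map each word to the list of words that followed it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return dict(chain)


def generate(chain: dict[str, list[str]], start: str, length: int = 20, seed=None) -> str:
    """Walk the chain from `start`, sampling one follower at each step."""
    rng = random.Random(seed)
    word, output = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: nothing ever followed this word
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    corpus = "my friends we will build and we will grow and we will lead"
    print(generate(build_chain(corpus), start="we", length=8, seed=0))
```

Repeated followers stay in the list on purpose, so frequent continuations are sampled more often.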
@@ -129,14 +137,17 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
## Forecasting

- **Univariate Time Series Forecasting**

- How much will it rain this year?
- Dataset: [45 years of rainfall data](http://research.jisao.washington.edu/data_sets/widmann/)

- **Multi-variate Time Series Forecasting**

- How polluted will your town's air be? Pollution Level Forecasting
- Dataset: [Air Quality dataset](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data)

- **Demand/load forecasting**

- Find a short term forecast on electricity consumption of a single home
- Dataset: [Electricity consumption of a household](https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption)
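
For any of the forecasting ideas above, it helps to have a naive baseline to beat. The sketch below shows a seasonal-naive forecast (repeat the last full season) and a mean absolute error to compare against; the numbers are invented and `season_length` is something you would pick from the data, e.g. 24 for hourly readings with a daily pattern.

```python
def seasonal_naive_forecast(history: list[float], season_length: int, horizon: int) -> list[float]:
    """Forecast by repeating the most recent full season of observations."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual: list[float], predicted: list[float]) -> float:
    """Average absolute difference between matching actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Two made-up "seasons" of three readings each.
    history = [10, 20, 30, 12, 22, 32]
    forecast = seasonal_naive_forecast(history, season_length=3, horizon=4)
    print(forecast)
    print(mean_absolute_error([13, 21, 33, 12], forecast))
```

If a learned model cannot beat this baseline on held-out data, the extra complexity is probably not paying for itself.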

@@ -148,11 +159,13 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
## Recommendation systems

- **Movie Recommender**

- Can you predict the rating a user will give on a movie?
- Do this using the movies that user has rated in the past, as well as the ratings similar users have given similar movies.
- Dataset: [Netflix Prize](http://www.netflixprize.com/) and [MovieLens Datasets](https://grouplens.org/datasets/movielens/)

- **Search + Recommendation System**

- Predict which Xbox game a visitor will be most interested in based on their search query
- Dataset: [BestBuy](https://www.kaggle.com/c/acm-sf-chapter-hackathon-small/data)
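
The Movie Recommender idea above (use a user's past ratings plus the ratings of similar users) can be sketched as user-based collaborative filtering. This is a toy version under assumptions: ratings are positive numbers in a nested dict, similarity is cosine over co-rated movies, and the names and ratings are invented.

```python
import math


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine similarity between two users over the movies both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings: dict, user: str, movie: str):
    """Similarity-weighted average of other users' ratings for `movie`."""
    weighted_sum = total_weight = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        weight = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += weight * other_ratings[movie]
        total_weight += weight
    # None signals "no similar user has rated this movie".
    return weighted_sum / total_weight if total_weight else None


if __name__ == "__main__":
    ratings = {
        "ana": {"Up": 5, "Heat": 1},
        "bo": {"Up": 5, "Heat": 1, "Coco": 4},
        "cy": {"Up": 1, "Heat": 5, "Coco": 2},
    }
    print(predict_rating(ratings, "ana", "Coco"))
```

Here "ana" rates like "bo", so the prediction for "Coco" lands closer to bo's 4 than to cy's 2. Real systems usually subtract each user's mean rating first, which this sketch skips.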

@@ -163,12 +176,13 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
## Vision

- **Image classification**

- Object recognition, or image classification, is the task that drove Deep Learning's present-day resurgence
- Datasets:
- [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html)
- [ImageNet](http://www.image-net.org/)
- [MS COCO](http://mscoco.org/) is the modern replacement for the ImageNet challenge
- [MNIST Handwritten Digit Classification Challenge](http://yann.lecun.com/exdb/mnist/) is the classic entry point
- [Character recognition (digits)](http://ai.stanford.edu/~btaskar/ocr/) is the good old Optical Character Recognition problem
- Bird Species Identification from an Image using the [Caltech-UCSD Birds dataset](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)
- Diagnosing and Segmenting Brain Tumors and Phenotypes using MRI Scans
@@ -179,39 +193,48 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
- Dataset: [State Farm Distracted Driver Detection](https://www.kaggle.com/c/state-farm-distracted-driver-detection/data) on Kaggle

- **Bone X-Ray competition**

- Can you automatically identify whether a hand is broken from an X-ray radiograph, with better than human performance?
- Stanford's Bone XRay Deep Learning Competition with [MURA Dataset](https://stanfordmlgroup.github.io/competitions/mura/)

- **Image Captioning**

- Can you caption/explain the photo the way a human would?
- Dataset: [MS COCO](http://mscoco.org/dataset/#captions-challenge2015)

- **Image Segmentation/Object Detection**

- Can you extract an object of interest from an image?
- Dataset: [MS COCO](http://mscoco.org/dataset/#detections-challenge2017), [Carvana Image Masking Challenge](https://www.kaggle.com/c/carvana-image-masking-challenge/data) on Kaggle

- **Large-Scale Video Understanding**

- Can you produce the best video tag predictions?
- Dataset: [YouTube 8M](https://research.google.com/youtube8m/index.html)

- **Video Summarization**

- Can you select the semantically relevant/important parts from the video?
- Example: [Fast-Forward Video Based on Semantic Extraction](https://arxiv.org/abs/1708.04160)
- Dataset: Unaware of any standard dataset or agreed-upon metrics. I think [YouTube 8M](https://research.google.com/youtube8m/index.html) might be a good starting point.

- **Style Transfer**

- Can you recompose images in the style of other images?
- Dataset: [fzliu on GitHub](https://github.com/fzliu/style-transfer/tree/master/images) shared target and source images with results

- **Chest XRay**

- Can you detect if someone is sick from their chest XRay? Or guess their radiology report?
- Dataset: [MIMIC-CXR at Physionet](https://physionet.org/content/mimic-cxr/2.0.0/)

- **Clinical Diagnostics: Image Identification, classification & segmentation**

- Can you help build open source software for lung cancer detection to help radiologists?
- Link: [Concept to clinic](https://concepttoclinic.drivendata.org/) challenge on DrivenData

- **Satellite Imagery Processing for Socioeconomic Analysis**

- Can you estimate the standard of living or energy consumption of a place from night time satellite imagery?
- Reference for Project details: [Stanford Poverty Estimation Project](http://sustain.stanford.edu/predicting-poverty/)

@@ -222,6 +245,7 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
## Music

- **Music/Audio Recommendation Systems**

- Can you tell if two songs are similar using their sound or lyrics?
- Dataset: [Million Songs Dataset](https://labrosa.ee.columbia.edu/millionsong/) and its 1% sample.
- Example: [Anusha et al](https://cs224d.stanford.edu/reports/BalakrishnanDixit.pdf)
@@ -231,8 +255,6 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
- Datasets: [FMA](https://github.com/mdeff/fma) or [GTZAN on Keras](https://github.com/Hguimaraes/gtzan.keras)
- Get started with [Librosa](https://librosa.github.io/librosa/index.html) for feature extraction



---

### FAQ