NirantK · harshcoder7 · Feb 9, 2025
diff --git a/README.md b/README.md
@@ -12,6 +12,10 @@ A curated list of practical deep learning and machine learning project ideas
 
 ---
 
+## I am contributing something to it
+
+This tool is best for pair programming.
+
 ## Contents
 
 - [Hackathon Ideas](#hackathon-ideas) - Project ideas unlocked by use of Large Language Models, specially text to text -- note that a lot of the text to text ideas can also be buit a lot better with LLMs now!
@@ -33,59 +37,59 @@ A curated list of practical deep learning and machine learning project ideas
 ## Hackathon Ideas
 
 - **Developer Ideas**
-  - Text to cmd for terminal: Take user intent in terminal e.g. 
+
+  - Text to cmd for terminal: Take user intent in terminal e.g.
     ```bash
     $ask "how to list all files with details"
-    > Execute "ls -l"? [y/N] y 
+    > Execute "ls -l"? [y/N] y
     $ls -l
     ```
   - Build and edit YAMLs using natural language e.g. Kubernetes and other form of config files
     - [Kor](eyurtsev.github.io/kor/) for ideas on how this is done for JSON
     - Can be use-case specific. Build pipelines? Kube?
-
   - Mobile Android/iOS SDK for Stable Diffusion inference
-    -  Apple has released a [CoreML Stable Diffusion Inference](https://github.com/apple/ml-stable-diffusion)
+    - Apple has released a [CoreML Stable Diffusion Inference](https://github.com/apple/ml-stable-diffusion)
 
 - **Voice powered Experiences**
+
   - Audio Conversation with chatGPT, can combine with fast Text-to-Speech e.g. [Eleven Labs](https://elevenlabs.io) to have a two-way conversation
   - Telegram/WhatsApp bot to get audio and save as text with metadata into mem.ai or Roam Research or Obsidian
 
 - Edit image by giving instructions of what you want to do: [SeeChatGPT](https://github.com/Nischaydnk/SeeChatGPT) and [playgroundai.com](playgroundai.com) as examples
   - The underlying mechanism which you can use is called [InstructPix2Pix](huggingface.co/spaces/timbrooks/instruct-pix2pix)
-
 - Semantic search over any media
+
   - Can build using CLIP or [BLIP-2 embeddings](huggingface.co/docs/transformers/main/model_doc/blip-2) for images and [CLAP](https://github.com/LAION-AI/CLAP/tree/clap#quick-start) for all audio including music and speech
 
 - Text to Music Generation
   - See [MusicLM](https://google-research.github.io/seanet/musiclm/examples/) for reference
-
 - **Knowledge Base QA** aka Answer Engines
 
-  - Take any plaintext dataset e.g. State of the Union address and build on top of that 
-  ![image](https://user-images.githubusercontent.com/3250749/223094577-8126570b-f7a4-48ad-9f77-ff86a8b21161.png)
+  - Take any plaintext dataset e.g. State of the Union address and build on top of that
+    ![image](https://user-images.githubusercontent.com/3250749/223094577-8126570b-f7a4-48ad-9f77-ff86a8b21161.png)
   - Can use this over Video Subtitles to search and QA over videos as well, by mapping back to source
 
 - **Guided Summarisation/Rewriting**
-  
+
   - Take specific questions which the user might have about a large text dataset e.g. a novel or book and include that in your summary of the piece
   - Pay attention to specific entities and retell the events which happen in a story with attention to that character
-  
+
 - **ControlNet + Stable Diffusion for Aethetic Control**
-  - Build tooling using [diffusers](https://github.com/huggingface/diffusers/) which takes in a set of photos, finetunes a model (LoRA) on a person, detects face and moves it to a new aesthetic e.g. futuristic neon punk, grunge rock, Studio Ghibli. Can also add InstructPix2Pix to give user more control. 
-
+  - Build tooling using [diffusers](https://github.com/huggingface/diffusers/) which takes in a set of photos, finetunes a model (LoRA) on a person, detects face and moves it to a new aesthetic e.g. futuristic neon punk, grunge rock, Studio Ghibli. Can also add InstructPix2Pix to give user more control.
 - **Text to Code/SQL**
 
   - Use code understanding to convert use query to SQL or another executable programming language, including Domain Specific Languages
   - Here is an example of the same: [qabot](github.com/hardbyte/qabot)
-  
+
 ## Text
 
 - **Autonomous Tagging of StackOverflow Questions**
-    - Make a multi-label classification system that automatically assigns tags for questions posted on a forum such as StackOverflow or Quora.
-    - Dataset: [StackLite](https://www.kaggle.com/stackoverflow/stacklite) or [10% sample](https://www.kaggle.com/stackoverflow/stacksample)
+
+  - Make a multi-label classification system that automatically assigns tags for questions posted on a forum such as StackOverflow or Quora.
+  - Dataset: [StackLite](https://www.kaggle.com/stackoverflow/stacklite) or [10% sample](https://www.kaggle.com/stackoverflow/stacksample)
 
 - **Keyword/Concept identification**
-  
+
   - Identify keywords from millions of questions
   - Dataset: [StackOverflow question samples by Facebook](https://www.kaggle.com/c/facebook-recruiting-iii-keyword-extraction/data)
 
@@ -96,19 +100,23 @@ A curated list of practical deep learning and machine learning project ideas
 ### Natural Language Understanding
 
 - **Sentence to Sentence semantic similarity**
+
   - Can you identify question pairs that have the same intent or meaning?
   - Dataset: [Quora question pairs](https://www.kaggle.com/c/quora-question-pairs/data) with similar questions marked
 
 - **Fight online abuse**
+
   - Can you confidently and accurately tell whether a particular comment is abusive?
   - Dataset: [Toxic comments on Kaggle](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge)
 
 - **Open Domain question answering**
+
   - Can you build a bot which answers questions according to the student's age or her curriculum?
   - [Facebook's FAIR](https://github.com/facebookresearch/DrQA) is built in a similar way for Wikipedia.
   - Dataset: [NCERT books](https://ncert.nic.in/textbook.php) for K-12/school students in India, [NarrativeQA by Google DeepMind](https://github.com/deepmind/narrativeqa) and [SQuAD by Stanford](https://rajpurkar.github.io/SQuAD-explorer/)
 
 - **Automatic text summarization**
+
   - Can you create a summary with the major points of the original document?
   - Abstractive (write your own summary) and Extractive (select pieces of text from original) are two popular approaches
   - Dataset: [CNN and DailyMail News Pieces](http://cs.nyu.edu/~kcho/DMQA/) by Google DeepMind
@@ -117,7 +125,7 @@ A curated list of practical deep learning and machine learning project ideas
   - Generate plausible new text which looks like some other text
   - Obama Speeches? For instance, you can create a bot which writes some [new speeches in Obama's style](https://medium.com/@samim/obama-rnn-machine-generated-political-speeches-c8abd18a2ea0)
   - Trump Bot? Or a Twitter bot which mimics [@realDonaldTrump](http://www.twitter.com/@realdonaldtrump)
-  - Narendra Modi bot saying "*doston*"? Start by scrapping off his *Hindi* speeches from his [personal website](http://www.narendramodi.in)
+  - Narendra Modi bot saying "_doston_"? Start by scraping his _Hindi_ speeches from his [personal website](http://www.narendramodi.in)
   - Example Dataset: [English Transcript of Modi speeches](https://github.com/mgupta1410/pm_modi_speeches_repo)
 
 Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/) for some hints.
@@ -129,14 +137,17 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
 ## Forecasting
 
 - **Univariate Time Series Forecasting**
+
   - How much will it rain this year?
   - Dataset: [45 years of rainfall data](http://research.jisao.washington.edu/data_sets/widmann/)
 
 - **Multi-variate Time Series Forecasting**
+
   - How polluted will your town's air be? Pollution Level Forecasting
   - Dataset: [Air Quality dataset](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data)
 
 - **Demand/load forecasting**
+
   - Find a short term forecast on electricity consumption of a single home
   - Dataset: [Electricity consumption of a household](https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption)
 
@@ -148,11 +159,13 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
 ## Recommendation systems
 
 - **Movie Recommender**
+
   - Can you predict the rating a user will give on a movie?
   - Do this using the movies that user has rated in the past, as well as the ratings similar users have given similar movies.
   - Dataset: [Netflix Prize](http://www.netflixprize.com/) and [MovieLens Datasets](https://grouplens.org/datasets/movielens/)
 
 - **Search + Recommendation System**
+
   - Predict which Xbox game a visitor will be most interested in based on their search query
   - Dataset: [BestBuy](https://www.kaggle.com/c/acm-sf-chapter-hackathon-small/data)
 
@@ -163,12 +176,13 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
 ## Vision
 
 - **Image classification**
+
   - Object recognition or image classification task is how Deep Learning shot up to it's present-day resurgence
   - Datasets:
     - [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html)
     - [ImageNet](http://www.image-net.org/)
     - [MS COCO](http://mscoco.org/) is the modern replacement to the ImageNet challenge
-    - [MNIST Handwritten Digit Classification Challenge](http://yann.lecun.com/exdb/mnist/)  is the classic entry point
+    - [MNIST Handwritten Digit Classification Challenge](http://yann.lecun.com/exdb/mnist/) is the classic entry point
     - [Character recognition (digits)](http://ai.stanford.edu/~btaskar/ocr/) is the good old Optical Character Recognition problem
     - Bird Species Identification from an Image using the [Caltech-UCSD Birds dataset](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html) dataset
   - Diagnosing and Segmenting Brain Tumors and Phenotypes using MRI Scans
@@ -179,39 +193,48 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
     - Dataset: [State Farm Distracted Driver Detection](https://www.kaggle.com/c/state-farm-distracted-driver-detection/data) on Kaggle
 
 - **Bone X-Ray competition**
+
   - Can you identify if a hand is broken from a X-ray radiographs automatically with better than human performance?
   - Stanford's Bone XRay Deep Learning Competition with [MURA Dataset](https://stanfordmlgroup.github.io/competitions/mura/)
 
 - **Image Captioning**
+
   - Can you caption/explain the photo a way human would?
   - Dataset: [MS COCO](http://mscoco.org/dataset/#captions-challenge2015)
 
 - **Image Segmentation/Object Detection**
+
   - Can you extract an object of interest from an image?
   - Dataset: [MS COCO](http://mscoco.org/dataset/#detections-challenge2017), [Carvana Image Masking Challenge](https://www.kaggle.com/c/carvana-image-masking-challenge/data) on Kaggle
 
 - **Large-Scale Video Understanding**
+
   - Can you produce the best video tag predictions?
   - Dataset: [YouTube 8M](https://research.google.com/youtube8m/index.html)
 
 - **Video Summarization**
+
   - Can you select the semantically relevant/important parts from the video?
   - Example: [Fast-Forward Video Based on Semantic Extraction](https://arxiv.org/abs/1708.04160)
   - Dataset: Unaware of any standard dataset or agreed upon metrics? I think [YouTube 8M](https://research.google.com/youtube8m/index.html) might be good starting point.
 
 - **Style Transfer**
+
   - Can you recompose images in the style of other images?
   - Dataset: [fzliu on GitHub](https://github.com/fzliu/style-transfer/tree/master/images) shared target and source images with results
 
 - **Chest XRay**
+
   - Can you detect if someone is sick from their chest XRay? Or guess their radiology report?
   - Dataset: [MIMIC-CXR at Physionet](https://physionet.org/content/mimic-cxr/2.0.0/)
 
 - **Clinical Diagnostics: Image Identification, classification & segmentation**
+
   - Can you help build an open source software for lung cancer detection to help radiologists?
   - Link: [Concept to clinic](https://concepttoclinic.drivendata.org/) challenge on DrivenData
 
 - **Satellite Imagery Processing for Socioeconomic Analysis**
+
   - Can you estimate the standard of living or energy consumption of a place from night time satellite imagery?
   - Reference for Project details: [Stanford Poverty Estimation Project](http://sustain.stanford.edu/predicting-poverty/)
 
@@ -222,6 +245,7 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
 ## Music
 
 - **Music/Audio Recommendation Systems**
+
   - Can you tell if two songs are similar using their sound or lyrics?
   - Dataset: [Million Songs Dataset](https://labrosa.ee.columbia.edu/millionsong/) and it's 1% sample.
   - Example: [Anusha et al](https://cs224d.stanford.edu/reports/BalakrishnanDixit.pdf)
@@ -231,8 +255,6 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
   - Datasets: [FMA](https://github.com/mdeff/fma) or [GTZAN on Keras](https://github.com/Hguimaraes/gtzan.keras)
   - Get started with [Librosa](https://librosa.github.io/librosa/index.html) for feature extraction
 
-
-
 ---
 
 ### FAQ