From 1054030022af6d8fad6421af2e408034ae32d96e Mon Sep 17 00:00:00 2001
From: harsh
Date: Sun, 9 Feb 2025 16:39:22 +0530
Subject: [PATCH] Co-authored-by: Vivek Upadhyay

---
 README.md | 62 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 42 insertions(+), 20 deletions(-)

diff --git a/README.md b/README.md
index ce8bc37..9d2bbf7 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,10 @@ A curated list of practical deep learning and machine learning project ideas

 ---

+## i am contributing something to it
+
+This tool is best for pair programming.
+
 ## Contents

 - [Hackathon Ideas](#hackathon-ideas) - Project ideas unlocked by use of Large Language Models, specially text to text -- note that a lot of the text to text ideas can also be buit a lot better with LLMs now!
@@ -33,59 +37,59 @@ A curated list of practical deep learning and machine learning project ideas
 ## Hackathon Ideas

 - **Developer Ideas**
-  - Text to cmd for terminal: Take user intent in terminal e.g.
+
+  - Text to cmd for terminal: Take user intent in terminal e.g.
     ```bash
     $ask "how to list all files with details"
-    > Execute "ls -l"? [y/N] y
+    > Execute "ls -l"? [y/N] y
     $ls -l
     ```
   - Build and edit YAMLs using natural language e.g. Kubernetes and other form of config files
     - [Kor](eyurtsev.github.io/kor/) for ideas on how this is done for JSON
     - Can be use-case specific. Build pipelines? Kube?
-

   - Mobile Android/iOS SDK for Stable Diffusion inference
-    - Apple has released a [CoreML Stable Diffusion Inference](https://github.com/apple/ml-stable-diffusion)
+    - Apple has released a [CoreML Stable Diffusion Inference](https://github.com/apple/ml-stable-diffusion)

 - **Voice powered Experiences**
+
   - Audio Conversation with chatGPT, can combine with fast Text-to-Speech e.g. [Eleven Labs](https://elevenlabs.io) to have a two-way conversation
   - Telegram/WhatsApp bot to get audio and save as text with metadata into mem.ai or Roam Research or Obsidian
   - Edit image by giving instructions of what you want to do: [SeeChatGPT](https://github.com/Nischaydnk/SeeChatGPT) and [playgroundai.com](playgroundai.com) as examples
     - The underlying mechanism which you can use is called [InstructPix2Pix](huggingface.co/spaces/timbrooks/instruct-pix2pix)
-

   - Semantic search over any media
+
     - Can build using CLIP or [BLIP-2 embeddings](huggingface.co/docs/transformers/main/model_doc/blip-2) for images and [CLAP](https://github.com/LAION-AI/CLAP/tree/clap#quick-start) for all audio including music and speech
   - Text to Music Generation
     - See [MusicLM](https://google-research.github.io/seanet/musiclm/examples/) for reference
-

 - **Knowledge Base QA** aka Answer Engines
-  - Take any plaintext dataset e.g. State of the Union address and build on top of that
-  ![image](https://user-images.githubusercontent.com/3250749/223094577-8126570b-f7a4-48ad-9f77-ff86a8b21161.png)
+  - Take any plaintext dataset e.g. State of the Union address and build on top of that
+    ![image](https://user-images.githubusercontent.com/3250749/223094577-8126570b-f7a4-48ad-9f77-ff86a8b21161.png)
   - Can use this over Video Subtitles to search and QA over videos as well, by mapping back to source

 - **Guided Summarisation/Rewriting**
-
+
   - Take specific questions which the user might have about a large text dataset e.g. a novel or book and include that in your summary of the piece
   - Pay attention to specific entities and retell the events which happen in a story with attention to that character
-
+
 - **ControlNet + Stable Diffusion for Aethetic Control**
-  - Build tooling using [diffusers](https://github.com/huggingface/diffusers/) which takes in a set of photos, finetunes a model (LoRA) on a person, detects face and moves it to a new aesthetic e.g. futuristic neon punk, grunge rock, Studio Ghibli. Can also add InstructPix2Pix to give user more control.
-
+  - Build tooling using [diffusers](https://github.com/huggingface/diffusers/) which takes in a set of photos, finetunes a model (LoRA) on a person, detects face and moves it to a new aesthetic e.g. futuristic neon punk, grunge rock, Studio Ghibli. Can also add InstructPix2Pix to give user more control.

 - **Text to Code/SQL**
   - Use code understanding to convert use query to SQL or another executable programming language, including Domain Specific Languages
   - Here is an example of the same: [qabot](github.com/hardbyte/qabot)
-
+
 ## Text

 - **Autonomous Tagging of StackOverflow Questions**
-  - Make a multi-label classification system that automatically assigns tags for questions posted on a forum such as StackOverflow or Quora.
-  - Dataset: [StackLite](https://www.kaggle.com/stackoverflow/stacklite) or [10% sample](https://www.kaggle.com/stackoverflow/stacksample)
+
+  - Make a multi-label classification system that automatically assigns tags for questions posted on a forum such as StackOverflow or Quora.
+  - Dataset: [StackLite](https://www.kaggle.com/stackoverflow/stacklite) or [10% sample](https://www.kaggle.com/stackoverflow/stacksample)

 - **Keyword/Concept identification**
-
+
   - Identify keywords from millions of questions
   - Dataset: [StackOverflow question samples by Facebook](https://www.kaggle.com/c/facebook-recruiting-iii-keyword-extraction/data)

@@ -96,19 +100,23 @@ A curated list of practical deep learning and machine learning project ideas
 ### Natural Language Understanding

 - **Sentence to Sentence semantic similarity**
+
   - Can you identify question pairs that have the same intent or meaning?
   - Dataset: [Quora question pairs](https://www.kaggle.com/c/quora-question-pairs/data) with similar questions marked

 - **Fight online abuse**
+
   - Can you confidently and accurately tell whether a particular comment is abusive?
   - Dataset: [Toxic comments on Kaggle](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge)

 - **Open Domain question answering**
+
   - Can you build a bot which answers questions according to the student's age or her curriculum?
   - [Facebook's FAIR](https://github.com/facebookresearch/DrQA) is built in a similar way for Wikipedia.
   - Dataset: [NCERT books](https://ncert.nic.in/textbook.php) for K-12/school students in India, [NarrativeQA by Google DeepMind](https://github.com/deepmind/narrativeqa) and [SQuAD by Stanford](https://rajpurkar.github.io/SQuAD-explorer/)

 - **Automatic text summarization**
+
   - Can you create a summary with the major points of the original document?
   - Abstractive (write your own summary) and Extractive (select pieces of text from original) are two popular approaches
   - Dataset: [CNN and DailyMail News Pieces](http://cs.nyu.edu/~kcho/DMQA/) by Google DeepMind
@@ -117,7 +125,7 @@ A curated list of practical deep learning and machine learning project ideas
   - Generate plausible new text which looks like some other text
     - Obama Speeches? For instance, you can create a bot which writes some [new speeches in Obama's style](https://medium.com/@samim/obama-rnn-machine-generated-political-speeches-c8abd18a2ea0)
     - Trump Bot? Or a Twitter bot which mimics [@realDonaldTrump](http://www.twitter.com/@realdonaldtrump)
-    - Narendra Modi bot saying "*doston*"? Start by scrapping off his *Hindi* speeches from his [personal website](http://www.narendramodi.in)
+    - Narendra Modi bot saying "_doston_"? Start by scrapping off his _Hindi_ speeches from his [personal website](http://www.narendramodi.in)
       - Example Dataset: [English Transcript of Modi speeches](https://github.com/mgupta1410/pm_modi_speeches_repo)

 Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/) for some hints.
@@ -129,14 +137,17 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
 ## Forecasting

 - **Univariate Time Series Forecasting**
+
   - How much will it rain this year?
   - Dataset: [45 years of rainfall data](http://research.jisao.washington.edu/data_sets/widmann/)

 - **Multi-variate Time Series Forecasting**
+
   - How polluted will your town's air be? Pollution Level Forecasting
   - Dataset: [Air Quality dataset](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data)

 - **Demand/load forecasting**
+
   - Find a short term forecast on electricity consumption of a single home
   - Dataset: [Electricity consumption of a household](https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption)

@@ -148,11 +159,13 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
 ## Recommendation systems

 - **Movie Recommender**
+
   - Can you predict the rating a user will give on a movie?
   - Do this using the movies that user has rated in the past, as well as the ratings similar users have given similar movies.
   - Dataset: [Netflix Prize](http://www.netflixprize.com/) and [MovieLens Datasets](https://grouplens.org/datasets/movielens/)

 - **Search + Recommendation System**
+
   - Predict which Xbox game a visitor will be most interested in based on their search query
   - Dataset: [BestBuy](https://www.kaggle.com/c/acm-sf-chapter-hackathon-small/data)

@@ -163,12 +176,13 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
 ## Vision

 - **Image classification**
+
   - Object recognition or image classification task is how Deep Learning shot up to it's present-day resurgence
   - Datasets:
     - [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html)
     - [ImageNet](http://www.image-net.org/)
     - [MS COCO](http://mscoco.org/) is the modern replacement to the ImageNet challenge
-    - [MNIST Handwritten Digit Classification Challenge](http://yann.lecun.com/exdb/mnist/) is the classic entry point
+    - [MNIST Handwritten Digit Classification Challenge](http://yann.lecun.com/exdb/mnist/) is the classic entry point
     - [Character recognition (digits)](http://ai.stanford.edu/~btaskar/ocr/) is the good old Optical Character Recognition problem
     - Bird Species Identification from an Image using the [Caltech-UCSD Birds dataset](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html) dataset
     - Diagnosing and Segmenting Brain Tumors and Phenotypes using MRI Scans
@@ -179,39 +193,48 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
   - Dataset: [State Farm Distracted Driver Detection](https://www.kaggle.com/c/state-farm-distracted-driver-detection/data) on Kaggle

 - **Bone X-Ray competition**
+
   - Can you identify if a hand is broken from a X-ray radiographs automatically with better than human performance?
   - Stanford's Bone XRay Deep Learning Competition with [MURA Dataset](https://stanfordmlgroup.github.io/competitions/mura/)

 - **Image Captioning**
+
   - Can you caption/explain the photo a way human would?
   - Dataset: [MS COCO](http://mscoco.org/dataset/#captions-challenge2015)

 - **Image Segmentation/Object Detection**
+
   - Can you extract an object of interest from an image?
   - Dataset: [MS COCO](http://mscoco.org/dataset/#detections-challenge2017), [Carvana Image Masking Challenge](https://www.kaggle.com/c/carvana-image-masking-challenge/data) on Kaggle

 - **Large-Scale Video Understanding**
+
   - Can you produce the best video tag predictions?
   - Dataset: [YouTube 8M](https://research.google.com/youtube8m/index.html)

 - **Video Summarization**
+
   - Can you select the semantically relevant/important parts from the video?
   - Example: [Fast-Forward Video Based on Semantic Extraction](https://arxiv.org/abs/1708.04160)
   - Dataset: Unaware of any standard dataset or agreed upon metrics? I think [YouTube 8M](https://research.google.com/youtube8m/index.html) might be good starting point.

 - **Style Transfer**
+
   - Can you recompose images in the style of other images?
   - Dataset: [fzliu on GitHub](https://github.com/fzliu/style-transfer/tree/master/images) shared target and source images with results

 - **Chest XRay**
+
   - Can you detect if someone is sick from their chest XRay? Or guess their radiology report?
   - Dataset: [MIMIC-CXR at Physionet](https://physionet.org/content/mimic-cxr/2.0.0/)

 - **Clinical Diagnostics: Image Identification, classification & segmentation**
+
   - Can you help build an open source software for lung cancer detection to help radiologists?
   - Link: [Concept to clinic](https://concepttoclinic.drivendata.org/) challenge on DrivenData

 - **Satellite Imagery Processing for Socioeconomic Analysis**
+
   - Can you estimate the standard of living or energy consumption of a place from night time satellite imagery?
   - Reference for Project details: [Stanford Poverty Estimation Project](http://sustain.stanford.edu/predicting-poverty/)

@@ -222,6 +245,7 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
 ## Music

 - **Music/Audio Recommendation Systems**
+
   - Can you tell if two songs are similar using their sound or lyrics?
   - Dataset: [Million Songs Dataset](https://labrosa.ee.columbia.edu/millionsong/) and it's 1% sample.
   - Example: [Anusha et al](https://cs224d.stanford.edu/reports/BalakrishnanDixit.pdf)
@@ -231,8 +255,6 @@ Check [mlm/blog](http://machinelearningmastery.com/text-generation-lstm-recurren
   - Datasets: [FMA](https://github.com/mdeff/fma) or [GTZAN on Keras](https://github.com/Hguimaraes/gtzan.keras)
   - Get started with [Librosa](https://librosa.github.io/librosa/index.html) for feature extraction

-
-
 ---

 ### FAQ