Hello Y'all, I've made something: Corpus2GPT
#19702
abhaskumarsinha
started this conversation in
Show and tell
-
Nice project! Thanks for sharing.
-
Introducing Corpus2GPT
While Keras-NLP, HuggingFace, and FAIR have good codebases for several LLMs and the tools around them, those codebases are large and highly complicated, in part because they aim to give engineers easy ways to construct, train, re-train, fine-tune, and deploy their models.
One major problem with that is that such a codebase is too big to read in full or to change at a fundamental, structural level, which hinders fundamental research, experimentation, and testing.
With Corpus2GPT, research tasks are easier to take on: the codebase is made of small modules that are easy to read and alter, and the docs contain formulas, citations, and explanations that make life easier. Tools for research-ready environments and utilities for dealing with scaling problems are also available.
Currently, it is at a very early stage. It supports basic training on a corpus in any language, English or otherwise, from a few lines of code plus the corpus text. Training is distribution-friendly (CPUs, GPUs, and TPUs) and works on all three backends: PyTorch, TensorFlow, and JAX. We plan to add more state-of-the-art features soon, so that the latest SoTA research functionality and utilities are available in a single place for easy research and open-source contribution.
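To give a rough idea of what "training from any corpus" involves before a model ever sees the text, here is a minimal sketch of turning raw text into next-token training pairs. This is not Corpus2GPT's API: the names `build_vocab` and `make_examples`, and the choice of character-level tokens, are assumptions made purely for illustration. Since Python strings are Unicode, the same code works for non-English text.

```python
def build_vocab(corpus):
    """Map every distinct character in the corpus to an integer id.

    Hypothetical helper for illustration; not part of Corpus2GPT.
    """
    chars = sorted(set(corpus))
    return {ch: i for i, ch in enumerate(chars)}


def make_examples(corpus, vocab, context_len):
    """Build (input, target) pairs for next-token prediction.

    Each target is the input window shifted one position to the right,
    so the model learns to predict the following character at every step.
    """
    ids = [vocab[ch] for ch in corpus]
    examples = []
    for start in range(len(ids) - context_len):
        x = ids[start:start + context_len]
        y = ids[start + 1:start + context_len + 1]
        examples.append((x, y))
    return examples


if __name__ == "__main__":
    text = "abcab"
    vocab = build_vocab(text)
    for x, y in make_examples(text, vocab, context_len=3):
        print(x, "->", y)
```

A real setup would normally use a subword tokenizer and batch these pairs into tensors for whichever backend is in use; the sliding-window idea stays the same.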
It doesn't matter whether you are a programmer, engineer, researcher, linguist, literary enthusiast, student, or industry professional: you can get a lot out of Corpus2GPT.
Links
Repo: https://github.com/abhaskumarsinha/Corpus2GPT/tree/main
Website: https://abhaskumarsinha.github.io/Corpus2GPT/
AlternativeTo: https://alternativeto.net/software/corpus2gpt/about/
Docs: https://abhaskumarsinha.github.io/Corpus2GPT/docs/doc/index.html
Example Code:
#MadeWithKeras 💖 #OpenSource 💚 #NLPResearch 📚 #AIInnovation 🤖 #CodeSimplicity 🛠️ #LanguageModeling 🗣️ #ResearchMadeEasy 📊 #TechForGood 🌟 #SimplifyCoding ✨ #InnovativeTech 🚀 #AccessibleAI 🔍 #OpenSourceCommunity 🌐 #MLDevelopment 🧠 #FutureTech 🔮 #EmpoweringEngineers 💪 #EnhancedLearning 📖 #UserFriendlyTools 🖥️ #StreamlinedResearch 📝 #EffortlessDeployment 💻 #AIProgress 🌱 #TechSolutions 💡 #SmartCoding 📝 #LanguageUnderstanding 🧠