Introduction to DBRetina
DBRetina is a high-performance bioinformatics tool that uses an efficient linear algorithm to calculate pairwise distances among large collections of gene sets. This makes it easy to build a comprehensive pairwise molecular similarity network within and across several molecular databases. To support efficient search and visualization of this very large network, DBRetina can convert its final output into a format compatible with Neo4j graph databases.
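To make "pairwise distance among gene sets" concrete, here is a minimal sketch that scores every pair of gene sets sharing at least one gene. It is an illustration only and not DBRetina's algorithm: the Jaccard measure, the inverted-index approach, the function name `pairwise_jaccard`, and the example gene sets are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_jaccard(gene_sets):
    """Return {(name_a, name_b): jaccard} for every pair sharing at least one gene."""
    # Inverted index: gene -> names of the sets that contain it.
    index = defaultdict(list)
    for name, genes in gene_sets.items():
        for gene in set(genes):
            index[gene].append(name)

    # Count shared genes only for pairs that co-occur under some gene,
    # so pairs with no overlap are never visited.
    shared = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1

    # Jaccard similarity = |A ∩ B| / |A ∪ B|; a distance would be 1 minus this.
    result = {}
    for (a, b), inter in shared.items():
        union = len(set(gene_sets[a])) + len(set(gene_sets[b])) - inter
        result[(a, b)] = inter / union
    return result


# Hypothetical gene sets, for illustration only.
example = {
    "pathway_A": {"TP53", "BRCA1", "ATM"},
    "pathway_B": {"TP53", "ATM", "CHEK2"},
    "pathway_C": {"EGFR", "KRAS"},
}
print(pairwise_jaccard(example))
```

Each scored pair corresponds to one edge in a similarity network, which is the kind of structure that is later loaded into a graph database.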
Challenge:
While DBRetina bridges genomic analytics and graph databases, querying Neo4j requires Cypher query language expertise, limiting accessibility for non-technical researchers.
Goal and Aims
The goal is to develop an LLM-driven chatbot that translates natural language questions into Cypher queries, enabling intuitive interaction with DBRetina-generated Neo4j graphs.
This chatbot aims to:
Increase accessibility: enable non-technical users to query complex genomic networks.
Improve efficiency: reduce query-writing time by ~70% (based on LLM benchmarks).
Scalability: adapt to evolving graph schemas and support multi-database integration.
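The translation step described above could be sketched roughly as follows. This is a minimal outline under stated assumptions, not the project's design: the graph schema (`GeneSet`, `SIMILAR_TO`, `jaccard`) is hypothetical and may not match what DBRetina exports, `generate` stands in for whatever LLM client is eventually chosen, and `fake_llm` is a stub used only so the example runs on its own.

```python
import re

# Hypothetical schema; the labels and properties DBRetina actually exports may differ.
SCHEMA = "(:GeneSet {name, database})-[:SIMILAR_TO {jaccard}]->(:GeneSet)"

PROMPT_TEMPLATE = (
    "You translate questions into Cypher for a Neo4j graph.\n"
    "Schema: {schema}\n"
    "Return a single read-only Cypher query and nothing else.\n"
    "Question: {question}\n"
)

# Crude keyword filter for clauses that can modify the graph.
WRITE_KEYWORDS = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|LOAD\s+CSV)\b", re.IGNORECASE
)


def build_prompt(question, schema=SCHEMA):
    """Combine the graph schema and the user's question into one prompt."""
    return PROMPT_TEMPLATE.format(schema=schema, question=question.strip())


def is_read_only(cypher):
    """Return True if no write-style keyword appears in the query text."""
    return WRITE_KEYWORDS.search(cypher) is None


def question_to_cypher(question, generate):
    """Translate a question; `generate` is any callable mapping prompt text to model output."""
    cypher = generate(build_prompt(question)).strip()
    if not is_read_only(cypher):
        raise ValueError("refusing to run a query that may modify the graph")
    return cypher


def fake_llm(prompt):
    # Stub standing in for a real model call, for illustration only.
    return (
        "MATCH (a:GeneSet)-[r:SIMILAR_TO]->(b:GeneSet) "
        "WHERE r.jaccard > 0.5 RETURN a.name, b.name LIMIT 10"
    )


print(question_to_cypher("Which gene sets are more than 50% similar?", fake_llm))
```

Passing the schema in the prompt is one way to let the chatbot adapt when the graph schema changes. The keyword filter is deliberately simple and can reject harmless queries or miss procedure calls, so a real deployment would more likely rely on a read-only database user as well.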
Difficulty Level: Medium/Hard
Size and Length of Project
Medium: 175 hours
12-16 weeks
Skills
Essential skills: LLM fine-tuning, experience with graph databases, HTML, CSS, JS
Nice to have skills: C++
Hi @MoHelmy, I was going through the projects you listed, and while I believe I have the skills to contribute to any of them, this particular project excites me the most. Having previously worked on projects involving fine-tuning LLMs and exploring graph-based tools like Cytoscape, I feel this aligns well with my interests and experience.
Additionally, I am currently brushing up on my skills through active contributions to open-source projects, particularly in the machine learning domain. I am very keen to contribute to this project and am confident in my ability to learn and deliver effectively.
Hi @MoHelmy, I'd like to contribute to this issue. I have experience building full-stack websites and have previously developed a RAG pipeline for a local LLM as part of my robotics project. Looking forward to your response.
Public Repository
DBRetina Documentation
Neo4j Cypher Manual
LLM Fine-Tuning for KBQA
Potential Mentors
Mohamed Helmy
Tamer Mansour