This project leverages large language models (LLMs) to evaluate biomedical question-answering datasets. The primary dataset used is MEDIQA. The goal is to provide a robust evaluation framework for biomedical QA systems.
/llm-as-a-evaluator
├── notebooks
│   ├── data_preprocessing.py
│   └── results.ipynb
├── src
│   └── app.py
├── README.md
└── requirements.txt
- Python 3.8+
- pip
- Clone the repository:
  `git clone https://github.com/yourusername/llm-as-a-evaluator.git`
  `cd llm-as-a-evaluator`
- Install the required packages:
  `pip install -r requirements.txt`
- Download the MEDIQA dataset and place it in the `data/mediqa` directory (a quick check is sketched below).
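Before running the preprocessing script, it can help to confirm the dataset landed where the project expects it. A minimal sketch, assuming only that the files sit under `data/mediqa` (the exact file names depend on which MEDIQA task you downloaded):

```python
# Minimal check that the MEDIQA dataset is in the expected location.
# No specific file names are assumed; this only lists whatever is present.
from pathlib import Path

data_dir = Path("data/mediqa")
if not data_dir.is_dir():
    raise FileNotFoundError(f"Expected the MEDIQA dataset under {data_dir}/")

print("Found dataset files:", sorted(p.name for p in data_dir.iterdir()))
```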
To use the LLMs, you need to obtain an API key from Groq. Groq offers free API keys. Set the API key as an environment variable or in a `.env` file.
- Visit the Groq Console.
- Sign in or create an account.
- Navigate to the API Keys section.
- Generate a new API key and copy it.
- Set the API key as an environment variable:
  `export GROQ_API_KEY=your_groq_api_key_here`
- Alternatively, create a `.env` file in the project root and add the API key: `GROQ_API_KEY=your_groq_api_key_here` (a loading sketch follows this list).
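Once the key is set, the application can read it at runtime. A minimal sketch, assuming the `python-dotenv` package is available (check `requirements.txt`); plain `os.environ` is enough if you only use the exported variable:

```python
# Minimal sketch: load GROQ_API_KEY from the environment or a .env file.
# python-dotenv is an assumption here, not necessarily what src/app.py uses.
import os

from dotenv import load_dotenv

load_dotenv()  # picks up GROQ_API_KEY from a .env file in the project root, if present

api_key = os.environ.get("GROQ_API_KEY")
if not api_key:
    raise RuntimeError("GROQ_API_KEY is not set; export it or add it to .env")
```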
- Data Preprocessing: Scripts to preprocess the MEDIQA dataset.
- Model Evaluation: Tools to evaluate QA models using LLMs.
- Analysis Notebooks: Jupyter notebooks for detailed analysis.
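To make the model-evaluation step concrete, here is a minimal sketch of scoring a single question-answer pair with an LLM through the Groq API. The model name, prompt wording, and scoring scale are illustrative assumptions, not the project's exact configuration:

```python
# Minimal sketch: use a Groq-hosted LLM as an evaluator for one QA pair.
# The model name and prompt are assumptions; adapt them to your setup.
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

question = "What are common symptoms of iron-deficiency anemia?"
answer = "Fatigue, pallor, and shortness of breath are typical symptoms."

response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # assumed model; use any model enabled for your Groq account
    messages=[
        {"role": "system", "content": "You are a strict evaluator of biomedical answers."},
        {
            "role": "user",
            "content": (
                f"Question: {question}\nAnswer: {answer}\n"
                "Rate the answer's correctness from 1 to 5 and justify briefly."
            ),
        },
    ],
)

print(response.choices[0].message.content)
```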
This project is licensed under the MIT License; see the LICENSE file for details.
For more details on the MEDIQA dataset, visit the official repository.