Address review feedback

cleanlab · Jun 27, 2024 · 7dc4671
1 parent 14fa388

Showing 2 changed files with 11 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -20,6 +20,7 @@ To quickly learn how to run cleanlab on your own data, first check out the [quic
 | 10   | [huggingface_keras_imdb](huggingface_keras_imdb/huggingface_keras_imdb.ipynb)                                             |  CleanLearning for text classification with Keras Model + pretrained BERT backbone and Tensorflow Dataset.         |
 | 11   | [fasttext_amazon_reviews](fasttext_amazon_reviews/fasttext_amazon_reviews.ipynb)                         | Finding label errors in Amazon Reviews text dataset using a cleanlab-compatible [FastText model](https://github.com/cleanlab/cleanlab/blob/master/cleanlab/models/fasttext.py).                                                                                                    |
 | 12   | [multiannotator_cifar10](multiannotator_cifar10/multiannotator_cifar10.ipynb)                                             | Iteratively improve consensus labels and trained classifier from data labeled by multiple annotators.                                                            |
+| 24 | [llm_evals_w_crowdlab](llm_evals_w_crowdlab/llm_evals_w_crowdlab.ipynb) | LLM Evals with humans, AI judges, and GPT token probabilities. Evaluate an LLM from multiple human/AI reviewers of varying competency by using CROWDLAB and GPT token probabilities.   |
 | 13  | [active_learning_multiannotator](active_learning_multiannotator/active_learning.ipynb)                                             | Improve a classifier model by iteratively collecting additional labels from data annotators. This active learning pipeline considers data labeled in batches by multiple (imperfect) annotators.                                                             |
 | 14  | [active_learning_single_annotator](active_learning_single_annotator/active_learning_single_annotator.ipynb)                                             | Improve a classifier model by iteratively labeling batches of currently-unlabeled data.  This demonstrates a standard active learning pipeline with *at most one label* collected for each example (unlike our multi-annotator active learning notebook which allows re-labeling).                                                            |
 | 15  | [active_learning_transformers](active_learning_transformers/active_learning.ipynb)                                             | Improve a Transformer model for classifying politeness of text by iteratively labeling and re-labeling batches of data using multiple annotators.  If you haven't done active learning with re-labeling, try the [active_learning_multiannotator](active_learning_multiannotator/active_learning.ipynb) notebook first.                                          |
@@ -31,7 +32,6 @@ To quickly learn how to run cleanlab on your own data, first check out the [quic
 | 21  | [non_iid_detection](non_iid_detection/non_iid_detection.ipynb)  | Use Datalab to detect non-IID sampling (e.g. drift) in datasets based on numeric features or embeddings. |
 | 22  | [object_detection](object_detection/README.md)  | Train Detectron2 object detection model for use with cleanlab. |
 | 23  | [semantic segmentation](segmentation/training_ResNeXt50_for_Semantic_Segmentation_on_SYNTHIA.ipynb)  | Train ResNeXt semantic segmentation model for use with cleanlab. |
-| 24 | [llm_evals_w_crowdlab](llm_evals_w_crowdlab/llm_evals_w_crowdlab.ipynb) | Uses GPT4o and CROWDLAB to evaluate language models on a dataset labeled by multiple annotators. |
 
 
 ## Instructions

diff --git a/llm_evals_w_crowdlab/llm_evals_w_crowdlab.ipynb b/llm_evals_w_crowdlab/llm_evals_w_crowdlab.ipynb
@@ -1,13 +1,21 @@
 {
   "cells": [
+    {
+      "cell_type": "markdown",
+      "id": "847d458d",
+      "metadata": {},
+      "source": [
+        "# LLM Evals with humans, AI judges, and GPT token probabilities."
+      ]
+    },
     {
       "cell_type": "markdown",
       "metadata": {
         "colab_type": "text",
         "id": "view-in-github"
       },
       "source": [
-        "<a href=\"https://colab.research.google.com/gist/nelsonauner/e81daa4c306ed111e2ed224b7cc715f2/cleanlab-crowdlab.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cleanlab/examples/blob/master/llm_evals_w_crowdlab/llm_evals_w_crowdlab.ipynb)     "
       ]
     },
     {
@@ -19,7 +27,7 @@
       "source": [
         "# Step 1: Data Cleaning and Exploration\n",
         "\n",
-        "Let's get into it!"
+        "We'll install requirements, load the data, and explore it."
       ]
     },
     {