Commit

Update MODEL_CARD.md (#5)
Summary:
Updated the paper link and fixed the image link.

Pull Request resolved: #5

Reviewed By: jspisak

Differential Revision: D51983508

Pulled By: spencerwmeta

fbshipit-source-id: b08c82335b9e49c0721b2d32d48fc326f0b0fdc1
jspisak authored and facebook-github-bot committed Dec 8, 2023
1 parent 2ab0f59 commit 9fe9abf
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions Llama-Guard/MODEL_CARD.md
@@ -8,7 +8,9 @@ It acts as an LLM: it generates text in its output that indicates whether a
 given prompt or response is safe/unsafe, and if unsafe based on a policy, it
 also lists the violating subcategories. Here is an example:
 
-![](Llama Guard_example.png)
+<p align="center">
+<img src="https://github.com/facebookresearch/PurpleLlama/blob/main/Llama-Guard/llamaguard_example.png" width="800"/>
+</p>
 
 In order to produce classifier scores, we look at the probability for the first
 token, and turn that into an “unsafe” class probability. Model users can then
@@ -96,7 +98,7 @@ include [ToxicChat](https://huggingface.co/datasets/lmsys/toxic-chat) and
 
 Note: comparisons are not exactly apples-to-apples due to mismatches in each
 taxonomy. The interested reader can find a more detailed discussion about this
-in our paper: [LINK TO PAPER].
+in our [paper](https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/).
 
 | | Our Test Set (Prompt) | OpenAI Mod | ToxicChat | Our Test Set (Response) |
 | --------------- | --------------------- | ---------- | --------- | ----------------------- |
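
Context shown in the first hunk describes how classifier scores are produced: the probability of the first output token is read off and treated as an "unsafe" class probability. Purely as an illustration (not part of this commit or the model card), a minimal sketch of that idea using Hugging Face `transformers` might look like the following; the checkpoint id, the prompt formatting, and the assumption that the label "unsafe" starts with a single vocabulary token are unverified assumptions.

```python
# Illustrative sketch only -- not from the commit. The checkpoint id and the
# single-token treatment of the "unsafe" label are assumptions for clarity.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/LlamaGuard-7b"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
model.eval()

def unsafe_probability(formatted_prompt: str) -> float:
    """Score a prompt by the probability that the first generated token is 'unsafe'.

    Reads the next-token distribution at the end of the (already formatted)
    prompt and returns the mass on the first sub-token of "unsafe"; a real
    tokenizer may split the label differently.
    """
    inputs = tokenizer(formatted_prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits          # shape: (1, seq_len, vocab_size)
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)
    unsafe_ids = tokenizer.encode("unsafe", add_special_tokens=False)
    return next_token_probs[unsafe_ids[0]].item()
```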
