From efa1b23caaedaff4df06ba97bb69ef7cefc8b1db Mon Sep 17 00:00:00 2001
From: Sweta
Date: Sun, 5 May 2024 17:15:17 +0000
Subject: [PATCH 1/3] Update CLI Usage + News

---
 README.md            | 10 +++++++++-
 comet/models/base.py |  2 +-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index a645e83..e6bf886 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,7 @@
 1) [AfriCOMET](https://arxiv.org/pdf/2311.09828.pdf) released, a new model to embrace under-resourced African Languages.
 2) We released our new eXplainable COMET models ([XCOMET-XL](https://huggingface.co/Unbabel/XCOMET-XL) and [-XXL](https://huggingface.co/Unbabel/XCOMET-XXL)) which along with quality scores detects which errors in the translation are minor, major or critical according to MQM typology
 3) We release [CometKiwi -XL (3.5B)](https://huggingface.co/Unbabel/wmt23-cometkiwi-da-xl) and [-XXL (10.7B)](https://huggingface.co/Unbabel/wmt23-cometkiwi-da-xxl) QE models. These models were the best performing QE models on the WMT23 QE shared task.
+4) We now support [DocCOMET](https://statmt.org/wmt22/pdf/2022.wmt-1.6.pdf), a document-level extension of COMET which can utilize contextual information. Using context improves accuracy on discourse phenomena tasks as well as referenceless evaluation of [chat translation quality](https://arxiv.org/pdf/2403.08314).

 Please check all available models [here](https://github.com/Unbabel/COMET/blob/master/MODELS.md)

@@ -77,10 +78,17 @@ WMT test sets via [SacreBLEU](https://github.com/mjpost/sacrebleu):
 comet-score -d wmt22:en-de -t PATH/TO/TRANSLATIONS
 ```

+Scoring with context:
+```bash
+echo -e "Pies made from apples like these. [SEP] Oh, they do look delicious.\nOh, they do look delicious." >> src.txt
+echo -e "Des tartes faites avec des pommes comme celles-ci. [SEP] Elles ont l’air delicieux.\nElles ont l’air delicieux" >> hyp1.txt
+echo -e "Des tartes faites avec des pommes comme celles-ci. [SEP] Ils ont l’air delicieux.\nIls ont l’air delicieux." >> hyp2.txt
+```
+
 If you are only interested in a system-level score use the following command:

 ```bash
-comet-score -s src.txt -t hyp1.txt -r ref.txt --quiet --only_system
+comet-score -s src.txt -t hyp1.txt hyp2.txt --model Unbabel/wmt20-comet-qe-da --enable-context
 ```

 ### Reference-free evaluation:
diff --git a/comet/models/base.py b/comet/models/base.py
index 003a5c5..0fada9d 100644
--- a/comet/models/base.py
+++ b/comet/models/base.py
@@ -160,7 +160,7 @@ def set_mc_dropout(self, value: int):

     def enable_context(self):
         """Function that extends COMET to use preceding context as described in https://statmt.org/wmt22/pdf/2022.wmt-1.6.pdf."""
-        logger.warning("Context can only be enabled for RegressionMetric with Average Pooling.")
+        logger.warning("Context should only be enabled for RegressionMetric with Average Pooling.")

     @abc.abstractmethod
     def read_training_data(self) -> List[dict]:

From a6fa91a36b05974aec2a326019ae3e5bee28c58c Mon Sep 17 00:00:00 2001
From: Sweta
Date: Mon, 6 May 2024 06:56:25 +0000
Subject: [PATCH 2/3] Updated Readme.md

---
 README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index e6bf886..6030680 100644
--- a/README.md
+++ b/README.md
@@ -85,10 +85,14 @@ echo -e "Des tartes faites avec des pommes comme celles-ci. [SEP] Elles ont l’
 echo -e "Des tartes faites avec des pommes comme celles-ci. [SEP] Ils ont l’air delicieux.\nIls ont l’air delicieux." >> hyp2.txt
 ```

+```bash
+comet-score -s src.txt -t hyp1.txt hyp2.txt --model Unbabel/wmt20-comet-qe-da --enable-context
+```
+
 If you are only interested in a system-level score use the following command:

 ```bash
-comet-score -s src.txt -t hyp1.txt hyp2.txt --model Unbabel/wmt20-comet-qe-da --enable-context
+comet-score -s src.txt -t hyp1.txt -r ref.txt --quiet --only_system
 ```

 ### Reference-free evaluation:

From 680f1b5eac18edbc32ec841e2ace4cbbeb95d1f5 Mon Sep 17 00:00:00 2001
From: Sweta
Date: Mon, 6 May 2024 15:55:32 +0000
Subject: [PATCH 3/3] Updated example to use </s> instead of [SEP]

---
 README.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 6030680..e8b0740 100644
--- a/README.md
+++ b/README.md
@@ -80,11 +80,13 @@ comet-score -d wmt22:en-de -t PATH/TO/TRANSLATIONS

 Scoring with context:
 ```bash
-echo -e "Pies made from apples like these. [SEP] Oh, they do look delicious.\nOh, they do look delicious." >> src.txt
-echo -e "Des tartes faites avec des pommes comme celles-ci. [SEP] Elles ont l’air delicieux.\nElles ont l’air delicieux" >> hyp1.txt
-echo -e "Des tartes faites avec des pommes comme celles-ci. [SEP] Ils ont l’air delicieux.\nIls ont l’air delicieux." >> hyp2.txt
+echo -e "Pies made from apples like these. </s> Oh, they do look delicious.\nOh, they do look delicious." >> src.txt
+echo -e "Des tartes faites avec des pommes comme celles-ci. </s> Elles ont l’air delicieux.\nElles ont l’air delicieux" >> hyp1.txt
+echo -e "Des tartes faites avec des pommes comme celles-ci. </s> Ils ont l’air delicieux.\nIls ont l’air delicieux." >> hyp2.txt
 ```

+where `</s>` is the separator token of the specific tokenizer (here: `xlm-roberta-large`) that the underlying model uses.
+
 ```bash
 comet-score -s src.txt -t hyp1.txt hyp2.txt --model Unbabel/wmt20-comet-qe-da --enable-context
 ```
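
Note on preparing context-augmented input: the README example writes the context lines by hand. For longer documents, the same layout could be generated programmatically. The following is a minimal sketch, not part of COMET: the helper name `add_context` and the `window` parameter are hypothetical, and the separator is assumed to be `</s>` as stated in the README text for `xlm-roberta-large`.

```python
from typing import List

# Assumption: "</s>" is the separator token, as the README text states
# for the xlm-roberta-large tokenizer. Other tokenizers may differ.
SEP = "</s>"


def add_context(sentences: List[str], window: int = 2, sep: str = SEP) -> List[str]:
    """Prefix each sentence with up to `window` preceding sentences.

    Hypothetical helper: preceding sentences and the current sentence
    are joined with the separator token, surrounded by single spaces.
    """
    lines = []
    for i, sentence in enumerate(sentences):
        # Slice out at most `window` sentences that come before position i.
        context = sentences[max(0, i - window):i]
        lines.append(f" {sep} ".join(context + [sentence]))
    return lines


doc = ["Pies made from apples like these.", "Oh, they do look delicious."]
lines = add_context(doc, window=1)
for line in lines:
    print(line)
```

The resulting lines could then be written to `src.txt` (and the hypotheses prepared the same way) before calling `comet-score` with `--enable-context` as shown in the patch.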