WMT Collecting Translations

This tool collects translations from various MT providers and LLMs that may be used at WMT General MT.

Usage

The tool uses 3-shot prompting when translating with LLMs.
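
To illustrate what 3-shot prompting means, here is a minimal sketch that builds a prompt from three example translation pairs followed by the new source segment. The function name `build_few_shot_prompt`, the prompt layout, and the example pairs are assumptions made for illustration; the prompt template the tool actually uses may differ.

```python
def build_few_shot_prompt(examples, source_text, source_lang, target_lang):
    """Build a 3-shot translation prompt from (source, translation) pairs."""
    if len(examples) != 3:
        raise ValueError("3-shot prompting expects exactly three examples")
    parts = []
    for example_source, example_target in examples:
        parts.append(
            f"{source_lang}: {example_source}\n{target_lang}: {example_target}"
        )
    # The last block leaves the target side empty for the model to complete.
    parts.append(f"{source_lang}: {source_text}\n{target_lang}:")
    return "\n\n".join(parts)


# Hypothetical example pairs, for illustration only.
EXAMPLES = [
    ("Good morning.", "Guten Morgen."),
    ("Thank you very much.", "Vielen Dank."),
    ("Where is the station?", "Wo ist der Bahnhof?"),
]

prompt = build_few_shot_prompt(EXAMPLES, "How are you?", "English", "German")
print(prompt)
```

The three completed pairs show the model the expected format, and the open-ended final line asks it to continue with the translation.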

Setting up secrets

Set one or more of the following secrets, depending on which systems you want to run. All of them are needed to use every system:

export MTAPI_SUBSCRIPTION_KEY=          # Microsoft Azure API key
export GOOGLE_APPLICATION_CREDENTIALS=  # Path to the Google credentials JSON file
export DEEPL_PRO_AUTH_KEY=              # DeepL credentials
export YANDEX_APPLICATION_CREDENTIALS=  # Yandex API key
export TOGETHER_API_KEY=                # Together API key for Llama 3
export COHERE_API_KEY=                  # Cohere key
export OPENAI_AZURE_ENDPOINT=           # OpenAI Azure endpoint URL
export OPENAI_AZURE_KEY=                # OpenAI Azure key
export MISTRAL_KEY=                     # Mistral API key
export GEMINI_API_KEY=                  # Gemini API key for Google AI Studio
export ANTHROPIC_API_KEY=               # Anthropic key for Claude
export PHI_API_KEY=                     # API key for Phi model
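
Before starting a long run, it can help to check which secrets are actually set. The sketch below is a standalone helper and not part of the tool; the variable names come from the list above, while the function name `missing_secrets` is hypothetical.

```python
import os

# Environment variable names taken from the list of secrets above.
SECRET_NAMES = [
    "MTAPI_SUBSCRIPTION_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "DEEPL_PRO_AUTH_KEY",
    "YANDEX_APPLICATION_CREDENTIALS",
    "TOGETHER_API_KEY",
    "COHERE_API_KEY",
    "OPENAI_AZURE_ENDPOINT",
    "OPENAI_AZURE_KEY",
    "MISTRAL_KEY",
    "GEMINI_API_KEY",
    "ANTHROPIC_API_KEY",
    "PHI_API_KEY",
]


def missing_secrets(environ=None):
    """Return the names of secrets that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in SECRET_NAMES if not env.get(name)]


# Report every secret that is not available in the current environment.
for name in missing_secrets():
    print(f"not set: {name}")
```

An empty `export NAME=` counts as missing here, since an empty key cannot authenticate against any provider.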

Download WMT testsets

Download the latest blind test set and rename the extracted directory to wmt_testset:

wget https://www2.statmt.org/wmt24/WMT24_GeneralMT.zip
unzip WMT24_GeneralMT.zip
mv WMT24_GeneralMT wmt_testset

Extract the XML files into plain text, once with the test suites and once without them:

for file in wmt_testset/*.xml; do wmt-unwrap -o "$file.full" < "$file"; done

for file in wmt_testset/*.xml; do wmt-unwrap -o "$file.no-testsuites" --no-testsuites < "$file"; done
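
The same extraction can be scripted from Python. This is a sketch under the assumptions that wmt-unwrap is on the PATH and reads the XML from standard input, as in the shell loops above; the helper names `unwrap_command` and `unwrap_all` are hypothetical.

```python
import subprocess
from pathlib import Path


def unwrap_command(xml_path, keep_testsuites=True):
    """Return the wmt-unwrap argument list for one XML file."""
    suffix = ".full" if keep_testsuites else ".no-testsuites"
    command = ["wmt-unwrap", "-o", f"{xml_path}{suffix}"]
    if not keep_testsuites:
        command.append("--no-testsuites")
    return command


def unwrap_all(directory="wmt_testset"):
    """Run wmt-unwrap twice per XML file: with and without test suites."""
    for xml_path in sorted(Path(directory).glob("*.xml")):
        for keep_testsuites in (True, False):
            with open(xml_path, "rb") as xml_file:
                # The XML is passed on stdin, as in the shell loops.
                subprocess.run(
                    unwrap_command(xml_path, keep_testsuites),
                    stdin=xml_file,
                    check=True,
                )
```

Calling `unwrap_all()` from the directory that contains wmt_testset writes a `.full` and a `.no-testsuites` file next to each XML file, and `check=True` stops at the first file that fails to convert.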

Running translations

Replace SYSTEM with the name of the system to translate with:

python main.py --system='SYSTEM'
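
To collect translations from several systems in one go, the command can be wrapped in a small driver. This is a sketch, not part of the tool: the helper names are hypothetical, and the system names passed in must be ones that main.py actually accepts.

```python
import subprocess
import sys


def translation_command(system):
    """Return the command line that runs main.py for one system."""
    return [sys.executable, "main.py", f"--system={system}"]


def run_systems(systems):
    """Run main.py once per system, stopping at the first failure."""
    for system in systems:
        subprocess.run(translation_command(system), check=True)
```

`run_systems` takes any iterable of system names and should be called from the repository root so that main.py is found.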

