Update README.md
Guitaricet authored Jan 4, 2024
1 parent c5094fc commit b7b7c58
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -35,9 +35,9 @@ Pre-process data (might take some time)
```bash
python pretokenize.py \
--save_dir preprocessed_data \
- --tokenizer <HF tokenizer name or path> \
- --dataset <HF dataset id> \
- --dataset_config <DatasetConfig> \
+ --tokenizer t5-base \
+ --dataset c4 \
+ --dataset_config en \
--text_field text \
--sequence_length 512
```
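
For reference, the flags in the updated command can be read as a plain argument set. The following is a minimal sketch, assuming the script parses its flags with `argparse`; the parser, its defaults, and `EXAMPLE_ARGV` are hypothetical illustrations and are not taken from `pretokenize.py` itself. Only the flag names and values come from the README command above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring only the flags shown in the README command."""
    parser = argparse.ArgumentParser(description="Sketch of the pretokenize.py flags")
    parser.add_argument("--save_dir", required=True)       # output directory
    parser.add_argument("--tokenizer", required=True)      # HF tokenizer name or path
    parser.add_argument("--dataset", required=True)        # HF dataset id
    parser.add_argument("--dataset_config", default=None)  # assumed optional
    parser.add_argument("--text_field", default="text")    # assumed default
    parser.add_argument("--sequence_length", type=int, default=512)  # assumed default
    return parser


# Concrete values introduced by this commit in place of the placeholders.
EXAMPLE_ARGV = [
    "--save_dir", "preprocessed_data",
    "--tokenizer", "t5-base",
    "--dataset", "c4",
    "--dataset_config", "en",
    "--text_field", "text",
    "--sequence_length", "512",
]

args = build_parser().parse_args(EXAMPLE_ARGV)
print(vars(args))
```

The sketch only shows how the placeholder values map to the concrete ones; it does not load a tokenizer or dataset.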