Issue 1: I pulled the Docker image on a Linux server (no GPU machine, just for testing). Running source ~/argos-train-init succeeds, but then running argos-train fails with: command not found: argos-train.
If I run bin/argos-train instead, it throws: ModuleNotFoundError: No module named 'argostrain'
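For issue 1, the ModuleNotFoundError is consistent with Python not finding the argostrain package on its import path: when a script is invoked directly, sys.path starts with the script's own directory (bin/), not the repository root where the argostrain package lives. A minimal, self-contained sketch of that mechanism (the fake repo layout below is hypothetical, mirroring argos-train's bin/ plus argostrain/ structure):

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as repo:
    # Fake repo layout: repo/argostrain/__init__.py and repo/bin/tool
    os.makedirs(os.path.join(repo, "argostrain"))
    open(os.path.join(repo, "argostrain", "__init__.py"), "w").close()
    os.makedirs(os.path.join(repo, "bin"))
    script = os.path.join(repo, "bin", "tool")
    with open(script, "w") as f:
        f.write("import argostrain\nprint('ok')\n")

    # Invoked directly, sys.path[0] is repo/bin, which does not
    # contain the argostrain package, so the import fails.
    fail = subprocess.run([sys.executable, script],
                          capture_output=True, text=True)
    # With PYTHONPATH pointing at the repo root, the import succeeds.
    env = dict(os.environ, PYTHONPATH=repo)
    ok = subprocess.run([sys.executable, script],
                        capture_output=True, text=True, env=env)

print("ModuleNotFoundError" in fail.stderr)  # True
print(ok.stdout.strip())                     # ok
```

If that is the cause, running the script from the repo root with PYTHONPATH set (or installing the package, e.g. pip install -e .) is a plausible workaround; I have not verified this against the Docker image itself.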
Issue 2: After copying the bin/argos-train script to the parent directory, ./argos-train runs and accepts arguments, but then the following errors occur:
From code (ISO 639): ja
To code (ISO 639): en
From name: Japanese
To name: English
Version: 2.2
cnkdj-jp-addr-ja_en
['\n', '\n', 'Data compiled by [Opus](https://opus.nlpl.eu/).\n', '\n', 'Dictionary data from Wiktionary using [Wiktextract](https://github.com/tatuylonen/wiktextract).\n', '\n', 'Includes pretrained models from [Stanza](https://github.com/stanfordnlp/stanza/).\n', '\n', 'Credits:\n', '\n']
Done splitting data
sentencepiece_trainer.cc(78) LOG(INFO) Starts training with :
trainer_spec {
input: run/split_data/all.txt
input_format:
model_prefix: run/sentencepiece
model_type: UNIGRAM
vocab_size: 50000
self_test_sample_size: 0
character_coverage: 0.9995
input_sentence_size: 1000000
shuffle_input_sentence: 1
seed_sentencepiece_size: 1000000
shrinking_factor: 0.75
max_sentence_length: 4192
num_threads: 16
num_sub_iterations: 2
max_sentencepiece_length: 16
split_by_unicode_script: 1
split_by_number: 1
split_by_whitespace: 1
split_digits: 0
pretokenization_delimiter:
treat_whitespace_as_suffix: 0
allow_whitespace_only_pieces: 0
required_chars:
byte_fallback: 0
vocabulary_output_piece_score: 1
train_extremely_large_corpus: 0
seed_sentencepieces_file:
hard_vocab_limit: 1
use_all_vocab: 0
unk_id: 0
bos_id: 1
eos_id: 2
pad_id: -1
unk_piece: <unk>
bos_piece: <s>
eos_piece: </s>
pad_piece: <pad>
unk_surface: ⁇
enable_differential_privacy: 0
differential_privacy_noise_level: 0
differential_privacy_clipping_threshold: 0
}
normalizer_spec {
name: nmt_nfkc
add_dummy_prefix: 1
remove_extra_whitespaces: 1
escape_whitespaces: 1
normalization_rule_tsv:
}
denormalizer_spec {}
trainer_interface.cc(353) LOG(INFO) SentenceIterator is not specified. Using MultiFileSentenceIterator.
trainer_interface.cc(185) LOG(INFO) Loading corpus: run/split_data/all.txt
trainer_interface.cc(409) LOG(INFO) Loaded all 873 sentences
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: <unk>
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: <s>
trainer_interface.cc(425) LOG(INFO) Adding meta_piece: </s>
trainer_interface.cc(430) LOG(INFO) Normalizing sentences...
trainer_interface.cc(539) LOG(INFO) all chars count=28413
trainer_interface.cc(550) LOG(INFO) Done: 99.9507% characters are covered.
trainer_interface.cc(560) LOG(INFO) Alphabet size=404
trainer_interface.cc(561) LOG(INFO) Final character coverage=0.999507
trainer_interface.cc(592) LOG(INFO) Done! preprocessed 873 sentences.
unigram_model_trainer.cc(265) LOG(INFO) Making suffix array...
unigram_model_trainer.cc(269) LOG(INFO) Extracting frequent sub strings... node_num=14236
unigram_model_trainer.cc(312) LOG(INFO) Initialized 2485 seed sentencepieces
trainer_interface.cc(598) LOG(INFO) Tokenizing input sentences with whitespace: 873
trainer_interface.cc(609) LOG(INFO) Done! 1143
unigram_model_trainer.cc(602) LOG(INFO) Using 1143 sentences for EM training
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=0 size=1324 obj=19.6514 num_tokens=4413 num_tokens/piece=3.33308
unigram_model_trainer.cc(618) LOG(INFO) EM sub_iter=1 size=1239 obj=15.5659 num_tokens=4415 num_tokens/piece=3.56336
trainer_interface.cc(687) LOG(INFO) Saving model: run/sentencepiece.model
spm_train_main.cc(282) [_status.ok()] Internal: src/trainer_interface.cc(662) [(trainer_spec_.vocab_size()) == (model_proto->pieces_size())] Vocabulary size too high (50000). Please set it to a value <= 1309.
Program terminated with an unrecoverable error.
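The SentencePiece failure above looks like the root cause of everything that follows: run/sentencepiece.model is never written, so both onmt_build_vocab and onmt_train later fail on missing files. With only 873 usable sentences, the corpus cannot support a 50,000-piece vocabulary. A sketch of retraining the tokenizer with a feasible size (an assumption on my part, not a verified fix; the paths come from the log and the 1309 cap comes from the error message itself):

```python
import sentencepiece as spm

# vocab_size must not exceed the number of pieces the corpus can
# actually yield; the error message reports 1309 for this corpus.
# Gathering more parallel data would raise that ceiling.
spm.SentencePieceTrainer.train(
    input="run/split_data/all.txt",
    model_prefix="run/sentencepiece",
    model_type="unigram",
    vocab_size=1309,
    character_coverage=0.9995,
)
```

Note that argos-train hardcodes its own vocab size, so lowering it this way only demonstrates the constraint; the real remedy is most likely a much larger corpus.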
/home/argosopentech/OpenNMT-py/onmt/modules/sparse_activations.py:48: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
def forward(ctx, input, dim=0):
/home/argosopentech/OpenNMT-py/onmt/modules/sparse_activations.py:68: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
def backward(ctx, grad_output):
/home/argosopentech/OpenNMT-py/onmt/modules/sparse_losses.py:13: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
def forward(ctx, input, target):
/home/argosopentech/OpenNMT-py/onmt/modules/sparse_losses.py:37: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
def backward(ctx, grad_output):
/home/argosopentech/OpenNMT-py/onmt/models/sru.py:397: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
def forward(self, u, x, bias, init=None, mask_h=None):
/home/argosopentech/OpenNMT-py/onmt/models/sru.py:443: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
def backward(self, grad_h, grad_last):
Corpus corpus_1's weight should be given. We default it to 1 for you.
Traceback (most recent call last):
File "/home/argosopentech/env/bin/onmt_build_vocab", line 33, in <module>
sys.exit(load_entry_point('OpenNMT-py', 'console_scripts', 'onmt_build_vocab')())
File "/home/argosopentech/OpenNMT-py/onmt/bin/build_vocab.py", line 71, in main
build_vocab_main(opts)
File "/home/argosopentech/OpenNMT-py/onmt/bin/build_vocab.py", line 32, in build_vocab_main
transforms = make_transforms(opts, transforms_cls, fields)
File "/home/argosopentech/OpenNMT-py/onmt/transforms/transform.py", line 235, in make_transforms
transform_obj.warm_up(vocabs)
File "/home/argosopentech/OpenNMT-py/onmt/transforms/tokenize.py", line 147, in warm_up
load_src_model.Load(self.src_subword_model)
File "/home/argosopentech/env/lib/python3.10/site-packages/sentencepiece/__init__.py", line 961, in Load
return self.LoadFromFile(model_file)
File "/home/argosopentech/env/lib/python3.10/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
OSError: Not found: "run/sentencepiece.model": No such file or directory Error #2
/home/argosopentech/OpenNMT-py/onmt/modules/sparse_activations.py:48: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
def forward(ctx, input, dim=0):
/home/argosopentech/OpenNMT-py/onmt/modules/sparse_activations.py:68: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
def backward(ctx, grad_output):
/home/argosopentech/OpenNMT-py/onmt/modules/sparse_losses.py:13: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
def forward(ctx, input, target):
/home/argosopentech/OpenNMT-py/onmt/modules/sparse_losses.py:37: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
def backward(ctx, grad_output):
/home/argosopentech/OpenNMT-py/onmt/models/sru.py:397: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
def forward(self, u, x, bias, init=None, mask_h=None):
/home/argosopentech/OpenNMT-py/onmt/models/sru.py:443: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
def backward(self, grad_h, grad_last):
[2024-08-20 06:20:54,713 WARNING] Corpus corpus_1's weight should be given. We default it to 1 for you.
[2024-08-20 06:20:54,714 INFO] Parsed 2 corpora from -data.
Traceback (most recent call last):
File "/home/argosopentech/env/bin/onmt_train", line 33, in <module>
sys.exit(load_entry_point('OpenNMT-py', 'console_scripts', 'onmt_train')())
File "/home/argosopentech/OpenNMT-py/onmt/bin/train.py", line 172, in main
train(opt)
File "/home/argosopentech/OpenNMT-py/onmt/bin/train.py", line 106, in train
checkpoint, fields, transforms_cls = _init_train(opt)
File "/home/argosopentech/OpenNMT-py/onmt/bin/train.py", line 58, in _init_train
ArgumentParser.validate_prepare_opts(opt)
File "/home/argosopentech/OpenNMT-py/onmt/utils/parse.py", line 197, in validate_prepare_opts
cls._validate_fields_opts(opt, build_vocab_only=build_vocab_only)
File "/home/argosopentech/OpenNMT-py/onmt/utils/parse.py", line 151, in _validate_fields_opts
cls._validate_file(opt.src_vocab, info='src vocab')
File "/home/argosopentech/OpenNMT-py/onmt/utils/parse.py", line 18, in _validate_file
raise IOError(f"Please check path of your {info} file!")
OSError: Please check path of your src vocab file!
Traceback (most recent call last):
File "/home/argosopentech/argos-train/./argos-train", line 18, in <module>
train.train(from_code, to_code, from_name, to_name, version, package_version, argos_version, data_exists)
File "/home/argosopentech/argos-train/argostrain/train.py", line 163, in train
str(opennmt_checkpoints[-2].f),
IndexError: list index out of range
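The final IndexError appears to be a downstream symptom rather than a separate bug: argostrain/train.py reads opennmt_checkpoints[-2] (the second-to-last saved checkpoint), but since onmt_train exited on the missing vocab file, no checkpoints were ever written and the list is empty. A minimal illustration of why that indexing fails:

```python
# No checkpoints were saved because onmt_train aborted early,
# so any indexing into the empty list raises IndexError.
checkpoints = []
try:
    checkpoints[-2]            # what train.py effectively attempts
except IndexError as err:
    print(err)                 # list index out of range

# With at least two saved checkpoints the same expression works
# (the file names here are hypothetical):
checkpoints = ["step_1000.pt", "step_2000.pt"]
print(checkpoints[-2])         # step_1000.pt
```

So the 2,436-row dataset is not directly at fault here; fixing the SentencePiece vocabulary-size failure earlier in the pipeline should let training produce checkpoints and avoid this error.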
What causes the list index out of range error? I prepared training data totaling 2,436 rows.
Please help me resolve these errors.
Thanks.