Running generation batch misses file #21

Open
PelzKo opened this issue Jun 22, 2023 · 4 comments

PelzKo commented Jun 22, 2023

run_generation_batch.py in finetune/textgen/gpt2 imports from train_control. That is not an installable package but another file from https://github.com/XiangLi1999/PrefixTuning/tree/cleaned/gpt2, which you need to add to the folder yourself.
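
A minimal sketch of fetching that file, assuming train_control.py sits in the cleaned/gpt2 folder linked above (the raw URL is inferred from that link and is an assumption, not taken from this repo's docs):

```python
# Fetch train_control.py from the PrefixTuning repo into finetune/textgen/gpt2
# so that run_generation_batch.py can import it. The raw URL is an assumption
# based on the branch/folder linked above; adjust if the file has moved.
import urllib.request

RAW_URL = (
    "https://raw.githubusercontent.com/XiangLi1999/PrefixTuning/"
    "cleaned/gpt2/train_control.py"
)
urllib.request.urlretrieve(RAW_URL, "train_control.py")
```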

PelzKo (Author) commented Jun 22, 2023

Also, in the same file, I have not been able to figure out what the line "from utils import calculate_rouge, chunks, parse_numeric_n_bool_cl_kwargs, use_task_specific_params" refers to, so I had to comment it out.
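
Those names appear to match the helpers in Hugging Face's legacy seq2seq example utilities, which may be where the script expects them from. If simply commenting out the import breaks generation, minimal stand-ins along these lines could be dropped into a local utils.py (a sketch of typical definitions, not this repo's originals; calculate_rouge is omitted since it is only needed for ROUGE scoring):

```python
# Minimal stand-ins for the missing `utils` helpers; sketches based on how
# such helpers are commonly defined, NOT the original implementations.
from typing import Dict, List, Union


def chunks(lst: List, n: int) -> List[List]:
    """Split lst into successive n-sized batches."""
    return [lst[i : i + n] for i in range(0, len(lst), n)]


def parse_numeric_n_bool_cl_kwargs(unparsed: List[str]) -> Dict[str, Union[int, float, bool, str]]:
    """Turn leftover '--key value' CLI pairs into a dict of ints/floats/bools."""
    result: Dict[str, Union[int, float, bool, str]] = {}
    assert len(unparsed) % 2 == 0, f"got an odd number of extra args: {unparsed}"
    for key, value in zip(unparsed[::2], unparsed[1::2]):
        assert key.startswith("--"), f"expected --key, got {key}"
        if value.lower() in ("true", "false"):
            parsed = value.lower() == "true"
        else:
            try:
                parsed = int(value)
            except ValueError:
                try:
                    parsed = float(value)
                except ValueError:
                    parsed = value  # keep as a plain string if not numeric
        result[key[2:]] = parsed
    return result


def use_task_specific_params(model, task: str) -> None:
    """Copy model.config.task_specific_params[task] onto the config, if present."""
    task_params = getattr(model.config, "task_specific_params", None) or {}
    if task in task_params:
        model.config.update(task_params[task])
```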

J38 (Contributor) commented Jun 23, 2023

Could you give me some more details about what you're trying to do? I am planning on pushing an updated version of this code with clear examples for training and generating responses for the MeQSum task, which is a good demo task for prompt --> response and can quickly be adapted to other tasks.

PelzKo (Author) commented Jun 23, 2023

I am trying to extract a diagnosis of an image from the title, referencing paragraph, and image caption of a scientific paper (so in a broader sense this is also a summarization problem). For that I have been using your fine-tuning and evaluation scripts:
Training:
torchrun --nproc_per_node=1 --nnodes=1 --node_rank=0 finetune_for_summarization.py --output_dir out --model_name_or_path stanford-crfm/BioMedLM --tokenizer_name stanford-crfm/pubmed_gpt_tokenizer --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --save_strategy no --do_eval --train_data_file ~/data/t2i/data/llm/without_nan/train.source --eval_data_file ~/data/t2i/data/llm/without_nan/val.source --save_total_limit 2 --overwrite_output_dir --gradient_accumulation_steps 1 --learning_rate 1.6e-4 --warmup_ratio 0.5 --weight_decay 0.0 --seed 11 --evaluation_strategy steps --eval_steps 200 --bf16 --num_train_epochs 10 --logging_steps 100 --logging_first_step
Evaluation:
CUDA_VISIBLE_DEVICES=0 python -u run_generation_batch.py --fp16 --max_source_length -1 --length 400 --model_name_or_path=out --num_return_sequences 5 --stop_token [SEP] --tokenizer_name=stanford-crfm/pubmed_gpt_tokenizer --task_mode=meqsum --control_mode=no --tuning_mode finetune --gen_dir generated_results --batch_size 9 --temperature 1.0 --no_repeat_ngram_size 6 --length_penalty -0.5 --wandb_entity=None --wandb_project=None --wandb_run_name=None
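
If run_generation_batch.py's imports remain broken, a plain-transformers sanity check of the fine-tuned checkpoint could look roughly like this (a sketch: the checkpoint dir, tokenizer, batch size of 9, and [SEP] stop token mirror the commands above; the prompt file path and sampling setup are assumptions):

```python
# Minimal batched-generation sketch with plain transformers, as a stand-in for
# run_generation_batch.py; settings mirror the commands above where possible.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stanford-crfm/pubmed_gpt_tokenizer")
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"  # left-pad so generation continues from the prompt end

model = AutoModelForCausalLM.from_pretrained("out", torch_dtype=torch.float16).cuda()
model.eval()

prompts = open("val.source").read().splitlines()[:9]  # placeholder path, one batch of 9
batch = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **batch,
        max_new_tokens=400,
        do_sample=True,                 # needed for num_return_sequences > 1 without beams
        num_return_sequences=5,
        temperature=1.0,
        no_repeat_ngram_size=6,
        pad_token_id=tokenizer.eos_token_id,
    )

for text in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(text.split("[SEP]")[0])  # truncate at the [SEP] stop token, as in the CLI flags
```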

PelzKo (Author) commented Jun 23, 2023

For it to run, I had to copy the script I referred to earlier into the folder and remove the import line I mentioned.
