
unexpected keyword argument tokenizer [FIXED] #1285

Open
avemio-digital opened this issue Nov 13, 2024 · 4 comments
Labels
fixed - pending confirmation Fixed, waiting for confirmation from poster

Comments

@avemio-digital

I used the ORPO Colab example for the Mistral model and I am getting this error. I am using the configs below:

from trl import ORPOConfig, ORPOTrainer
from unsloth import is_bfloat16_supported

orpo_trainer = ORPOTrainer(
    model = model,
    train_dataset = dataset,
    tokenizer = tokenizer,
    args = ORPOConfig(
        max_length = max_seq_length,
        max_prompt_length = max_seq_length // 2,
        max_completion_length = max_seq_length // 2,
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        beta = 0.1,
        logging_steps = 1,
        optim = "adamw_8bit",
        lr_scheduler_type = "linear",
        max_steps = 1500, # Change to num_train_epochs = 1 for full training runs
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        output_dir = "outputs",
        report_to = "none", # Use this for WandB etc
    ),
)

@dame-cell

dame-cell commented Nov 13, 2024

Unsloth always tracks the newest version of trl, and trl recently renamed the tokenizer parameter to processing_class, as mentioned in #2348.
You can simply do this:

- trainer = DPOTrainer(model, args=training_args, train_dataset=dataset, tokenizer=tokenizer)
+ trainer = DPOTrainer(model, args=training_args, train_dataset=dataset, processing_class=tokenizer)
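If you need a script that runs on both old and new trl releases, one option is to pick the keyword at runtime by inspecting the trainer's signature. This is a minimal sketch of that idea; the stub class below stands in for a trl trainer and is purely illustrative, not part of trl's API:

```python
import inspect

def tokenizer_kwarg_name(trainer_cls):
    """Return 'processing_class' if the trainer's __init__ accepts it,
    otherwise fall back to the old 'tokenizer' keyword."""
    params = inspect.signature(trainer_cls.__init__).parameters
    return "processing_class" if "processing_class" in params else "tokenizer"

# Hypothetical stub mimicking a post-rename trl trainer, for illustration only.
class NewStyleTrainer:
    def __init__(self, model=None, args=None, train_dataset=None, processing_class=None):
        self.processing_class = processing_class

# Build the kwargs dict with whichever keyword the installed version expects.
kwargs = {tokenizer_kwarg_name(NewStyleTrainer): "my-tokenizer"}
trainer = NewStyleTrainer(**kwargs)  # trainer.processing_class == "my-tokenizer"
```

The same `tokenizer_kwarg_name` call works unchanged against the real `ORPOTrainer` or `DPOTrainer` classes, so the script does not need to pin a trl version.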

@Erland366
Contributor

I think Unsloth will add backward compatibility for this.

@danielhanchen danielhanchen changed the title ORPOTrainer.__init__() got an unexpected keyword argument 'tokenizer' [FIXED] unexpected keyword argument tokenizer Nov 14, 2024
@danielhanchen danielhanchen added the fixed - pending confirmation Fixed, waiting for confirmation from poster label Nov 14, 2024
@danielhanchen
Contributor

Just fixed, @avemio-digital @dame-cell! Please update Unsloth on local machines via pip install --upgrade --no-cache-dir --no-deps unsloth. For Colab and Kaggle, just refresh!

@danielhanchen danielhanchen pinned this issue Nov 14, 2024
@danielhanchen danielhanchen changed the title [FIXED] unexpected keyword argument tokenizer unexpected keyword argument tokenizer [FIXED] Nov 14, 2024
@qgallouedec

This has been fixed in trl v0.12.1; make sure to run

pip install --upgrade trl

Please also note that tokenizer is deprecated, so you should prefer processing_class, as mentioned by @dame-cell in #1285 (comment).
