Bug Summary
The transformers library we use automatically detects whether clearml is installed and, if so, registers a ClearMLCallback object in its event handling system, which sends logs to clearml.
This happens even if you're training models locally and not using clearml in any way.
There are two potential issues:
1. I am not sure whether the team is aware that this kind of logging is happening, or what its implications are.
2. People who want to run models locally can't do so if their local environment has clearml installed but not configured correctly.
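Details
The Hugging Face transformers callback docs outline how the training loop supports registering callbacks to execute at different points in training. The important part is that it automatically detects whether common frameworks are installed and then registers callbacks for them.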
In the case of clearml, I suspect it detects whether clearml is installed by looking for a clearml Python package on the Python path. We do have this library installed by poetry.
If my theory is right, then this would impact anyone using poetry.
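The theory above can be checked directly in the affected environment. This is a minimal sketch, assuming the detection amounts to looking up the package on the Python path; the helper name `is_package_installed` is hypothetical and is not part of transformers or clearml. It only uses the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def is_package_installed(name: str) -> bool:
    """Return True if a top-level package called `name` can be found on the Python path."""
    # find_spec returns None when no importable top-level module or package has this name.
    # It locates the package without importing it, so clearml's own setup code does not run.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Run with `poetry run python <this file>` to check the same environment as training.
    print("clearml installed:", is_package_installed("clearml"))
```

If this prints `True` in an environment where clearml has never been configured, that would be consistent with the theory.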
Example error
I noticed this when trying to run the model training script locally. I haven't set up clearml yet, but I have installed poetry.
$ export SIL_NLP_DATA_PATH=~/sil/tasks/2025-01-23-local-training/NLP
$ poetry run python -m silnlp.nmt.train Philippines/ABP/2025-01-30-Exp02-isolate-luke-for-testing
...
Traceback (most recent call last):
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/me/sil/dev/silnlp/silnlp/nmt/train.py", line 42, in <module>
    main()
  File "/home/me/sil/dev/silnlp/silnlp/nmt/train.py", line 34, in main
    model.train()
  File "/home/me/sil/dev/silnlp/silnlp/nmt/hugging_face_config.py", line 1026, in train
    train_result = trainer.train(resume_from_checkpoint=last_checkpoint)
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/transformers/trainer.py", line 2123, in train
    return inner_training_loop(
  File "/home/me/sil/dev/silnlp/silnlp/nmt/hugging_face_config.py", line 1904, in _inner_training_loop
    return inner_training_loop(
  File "/home/me/sil/dev/silnlp/silnlp/nmt/hugging_face_config.py", line 1984, in decorator
    return function(batch_size, *args, **kwargs)
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/transformers/trainer.py", line 2382, in _inner_training_loop
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/transformers/trainer_callback.py", line 468, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/transformers/trainer_callback.py", line 518, in call_event
    result = getattr(callback, event)(
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/transformers/integrations/integration_utils.py", line 1869, in on_train_begin
    self.setup(args, state, model, tokenizer, **kwargs)
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/transformers/integrations/integration_utils.py", line 1792, in setup
    self._clearml_task = self._clearml.Task.init(
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/task.py", line 596, in init
    task = cls._create_dev_task(
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/task.py", line 3956, in _create_dev_task
    task = cls(
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/task.py", line 211, in __init__
    super(Task, self).__init__(**kwargs)
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 167, in __init__
    super(Task, self).__init__(id=task_id, session=session, log=log)
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 149, in __init__
    super(IdObjectBase, self).__init__(session, log, **kwargs)
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 41, in __init__
    self._session = session or self._get_default_session()
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 119, in _get_default_session
    InterfaceBase._default_session = Session(
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/backend_api/session/session.py", line 161, in __init__
    self._connect()
  File "/home/me/.miniconda3/envs/silnlp/lib/python3.10/site-packages/clearml/backend_api/session/session.py", line 224, in _connect
    raise MissingConfigError()
clearml.backend_api.session.defs.MissingConfigError: It seems ClearML is not configured on this machine!
To get started with ClearML, setup your own 'clearml-server' or create a free account at https://app.clear.ml
Setup instructions can be found here: https://clear.ml/docs
My hack
I got around this by overriding the "report_to" setting:
def _create_training_arguments(self) -> Seq2SeqTrainingArguments:
    parser = HfArgumentParser(Seq2SeqTrainingArguments)
    args: dict = {}
    ...
    # Temp hack to stop it trying to log to clearml when I'm running locally
    args["report_to"] = []  # <----------------------------------
    return parser.parse_dict(args)[0]
Ideal behavior
In a brief chat, David suggested that ideally this logging should be controlled by the inputs to the program, not by what happens to be installed on the system. That makes the behavior more deterministic and portable.
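One way this could look is sketched below: the value of "report_to" is derived from an explicit command-line input instead of from the installed packages. This is only an illustration under stated assumptions. The `--clearml` flag and the functions `resolve_report_to`, `parse_args`, and `build_training_overrides` are hypothetical and do not exist in silnlp; the only thing taken from transformers is that "report_to" accepts a list of integration names such as "clearml", and that an empty list disables reporting.

```python
import argparse
from typing import Dict, List, Optional, Sequence


def resolve_report_to(use_clearml: bool) -> List[str]:
    """Map an explicit program input to a value for the "report_to" setting."""
    # An empty list means no reporting integrations are registered.
    return ["clearml"] if use_clearml else []


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse a hypothetical training command line."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("experiment", help="Experiment name")
    # Hypothetical flag: logging to clearml is opt-in rather than auto-detected.
    parser.add_argument("--clearml", action="store_true", help="Send logs to clearml")
    return parser.parse_args(argv)


def build_training_overrides(argv: Optional[Sequence[str]] = None) -> Dict[str, List[str]]:
    """Build the part of the training arguments dict that controls reporting."""
    cli = parse_args(argv)
    return {"report_to": resolve_report_to(cli.clearml)}


# Without the flag, reporting is disabled regardless of what is installed.
print(build_training_overrides(["my-experiment"]))
```

With this shape, the same command gives the same logging behavior on every machine, and clearml is only contacted when the caller asks for it.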