Flaky Test: TestDatasetIntegration.test_dataset_download[HuggingFace - Public dataset-huggingface-test_case0] #2460

Open
tenzen-y opened this issue Feb 28, 2025 · 7 comments

Comments

@tenzen-y
Member

What happened?

Flaky Integration Test: TestDatasetIntegration.test_dataset_download[HuggingFace - Public dataset-huggingface-test_case0]

What did you expect to happen?

The test should pass consistently.

Environment

Kubernetes version:

$ kubectl version

Kubeflow Trainer version:

$ kubectl get pods -n kubeflow -l app.kubernetes.io/name=trainer -o jsonpath="{.items[*].spec.containers[*].image}"

Kubeflow Python SDK version:

$ pip show kubeflow

Impacted by this bug?

Give it a 👍 We prioritize the issues with the most 👍

@andreyvelich
Member

cc @seanlaii Could you please take a look when you can?

@andreyvelich
Member

/area testing

@andreyvelich
Member

/good-first-issue


@andreyvelich:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.

In response to this:

/good-first-issue

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@seanlaii
Contributor

seanlaii commented Mar 3, 2025

Yes, will check tonight.

@seanlaii
Contributor

seanlaii commented Mar 3, 2025

/assign

@seanlaii
Contributor

seanlaii commented Mar 4, 2025

I am not able to reproduce it locally.
These are the error messages:

E           requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/datasets/karpathy/tiny_shakespeare/revision/main
E               huggingface_hub.errors.HfHubHTTPError: 403 Forbidden: None.
E               Cannot access content at: https://huggingface.co/api/datasets/karpathy/tiny_shakespeare/revision/main.
E               Make sure your token has the correct permissions.
E               huggingface_hub.errors.HfHubHTTPError: 403 Forbidden: None.
E               Cannot access content at: https://huggingface.co/api/models/hf-internal-testing/tiny-random-bert/revision/main.
E               Make sure your token has the correct permissions.

However, the dataset and model are both public and not gated: https://huggingface.co/api/datasets/karpathy/tiny_shakespeare/revision/main
https://huggingface.co/api/models/hf-internal-testing/tiny-random-bert/revision/main
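One way to check from the CI runner whether these endpoints really answer anonymous requests is a small status probe. This is only a sketch using the Python standard library: the helper names `fetch_status` and `check_public_access` and the `User-Agent` value are made up for illustration, and it does not go through `huggingface_hub`, so it may behave differently from the code path used in the test.

```python
import urllib.error
import urllib.request

# URLs copied from the error messages in this comment.
URLS = [
    "https://huggingface.co/api/datasets/karpathy/tiny_shakespeare/revision/main",
    "https://huggingface.co/api/models/hf-internal-testing/tiny-random-bert/revision/main",
]


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code of an anonymous GET request (no token is sent)."""
    # The User-Agent string is an arbitrary, hypothetical value.
    request = urllib.request.Request(url, headers={"User-Agent": "flaky-test-debug"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the status code is what we want.
        return err.code


def check_public_access(urls, fetch=fetch_status):
    """Map each URL to its status code; `fetch` can be swapped out for offline runs."""
    return {url: fetch(url) for url in urls}
```

Calling `check_public_access(URLS)` locally and in the CI job and comparing the two results would show whether the 403 depends on where the request comes from.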

I will spend more time looking into it. I am not sure whether it is related to a connection issue:

else:
                # Otherwise: most likely a connection issue or Hub downtime => let's warn the user
>               raise LocalEntryNotFoundError(
                    "An error happened while trying to locate the files on the Hub and we cannot find the appropriate"
                    " snapshot folder for the specified revision on the local disk. Please check your internet connection"
                    " and try again."
                ) from api_call_error
E               huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the local disk. Please check your internet connection and try again.
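If the failure does turn out to be transient (that is an assumption, the root cause is not confirmed yet), one possible mitigation is to retry the download with exponential backoff. The sketch below is generic Python with a hypothetical helper name `retry_with_backoff`; it is not an existing function in the test suite or in `huggingface_hub`. In the test, `retriable` could be set to the exception types shown in the traceback above.

```python
import time


def retry_with_backoff(func, retriable=(Exception,), attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retriable exception wait base_delay * 2**attempt and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retriable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

The `sleep` parameter is injectable so the helper can be exercised without real waiting. Retrying would only hide the symptom, so it is worth finding out why the Hub returns 403 before relying on it.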
