
Fix: Network hiccups using Retry logic #306

Closed (wants to merge 1 commit)

Conversation

rd4398 (Contributor) commented Aug 7, 2024:

Fixes: #222
As per the discussion in #228.

@rd4398 rd4398 requested a review from dhellmann August 7, 2024 18:33
@mergify mergify bot added the ci label Aug 7, 2024
@@ -311,7 +311,7 @@ def download_wheel(
     wheel_filename = output_directory / os.path.basename(urlparse(wheel_url).path)
     if not wheel_filename.exists():
         logger.info(f"{req.name}: downloading pre-built wheel {wheel_url}")
-        wheel_filename = _download_wheel_check(output_directory, wheel_url)
+        wheel_filename = _download_wheel_check(output_directory, wheel_url, ctx)
Member:
We have the context as the first argument to functions everywhere else; could you follow that pattern here, too?

rd4398 (author):

okay, done!

@@ -165,7 +164,7 @@ def default_download_source(
     )

     source_filename = _download_source_check(
-        ctx.sdists_downloads, url, destination_filename
+        ctx.sdists_downloads, url, ctx, destination_filename
Member:

Same comment about argument ordering.

rd4398 (author):

okay, done!

@@ -213,7 +218,7 @@ def download_url(
         return outfile
     # Open the URL first in case that fails, so we don't end up with an empty file.
     logger.debug(f"reading from {url}")
-    with requests.get(url, stream=True) as r:
+    with ctx.requests.get(url, stream=True) as r:
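For context, `ctx.requests` here presumably holds a `requests.Session` configured with urllib3's `Retry` so transient failures are retried transparently. A minimal sketch of such a helper (the function name and defaults are hypothetical, not from this PR):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total: int = 5, backoff_factor: float = 0.5) -> requests.Session:
    """Build a Session that retries transient network errors.

    Hypothetical helper: the PR would store something like this on the
    build context so call sites use ctx.requests instead of requests.
    """
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,
        status_forcelist=(429, 500, 502, 503, 504),
        allowed_methods=("GET", "HEAD"),
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
```

Note that urllib3's `Retry` only re-issues the request itself; once a streaming response body is being consumed, a mid-transfer drop is not retried by the adapter, which is what the next comment is about.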
Member:

How does retrying work with streaming? If it starts over, the caller needs to understand that so it doesn't just write duplicate copies of the data to the same file, right?

rd4398 (author):

Yeah, need to look at this

Member:

https://gist.github.com/tobiasraabe/58adee67de619ce621464c1a6511d7d9 looks potentially useful. The Range HTTP header tells the server where to start sending data, based on how much has already been downloaded. https://medium.com/@lope.ai/downloading-files-over-http-with-python-requests-e12e6b795e43 shows another example of tracking that.
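The approach in those links can be sketched roughly as follows. This is not the PR's code: `session` is assumed to be a `requests.Session` (or compatible object), and resuming only works when the server honours `Range` requests by answering 206 Partial Content; otherwise the file is rewritten from scratch rather than appended to, which avoids the duplicate-data problem raised above.

```python
import os


def download_with_resume(session, url: str, outfile: str, chunk_size: int = 65536) -> None:
    """Resume a partial download via the HTTP Range header (sketch).

    If outfile already holds N bytes, ask the server to start at byte N.
    """
    resume_from = os.path.getsize(outfile) if os.path.exists(outfile) else 0
    headers = {"Range": f"bytes={resume_from}-"} if resume_from else {}
    with session.get(url, stream=True, headers=headers) as r:
        r.raise_for_status()
        # A 200 means the server ignored the Range header: restart from scratch,
        # never append, so we cannot write duplicate copies of the data.
        mode = "ab" if r.status_code == 206 else "wb"
        with open(outfile, mode) as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
```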

rd4398 (author):

Agreed.
I have tried to use the Range header in the latest version of this PR:

     # Open the URL first in case that fails, so we don't end up with an empty file.
     logger.debug(f"reading from {url}")
-    with requests.get(url, stream=True) as r:
+    with ctx.requests.get(url, stream=True, headers=headers) as r:
Member:

I think we're going to need a loop here to handle the retry.

Let's set this aside for now and come back when we have more time to work on it together.
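The loop the reviewer is asking for might look like this sketch (names and defaults are hypothetical, not the PR's code). Each attempt resumes from whatever is already on disk via the Range header, so a restarted transfer never appends duplicate bytes; requests' `RequestException` derives from `IOError`, so catching `OSError` covers connection errors and timeouts.

```python
import os
import time


def download_url_with_retries(session, url, outfile, max_attempts=3, backoff=1.0):
    """Retry a streaming download, resuming partial transfers (sketch)."""
    for attempt in range(1, max_attempts + 1):
        resume_from = os.path.getsize(outfile) if os.path.exists(outfile) else 0
        headers = {"Range": f"bytes={resume_from}-"} if resume_from else {}
        try:
            with session.get(url, stream=True, headers=headers) as r:
                r.raise_for_status()
                # Append only when the server actually resumed (206 Partial
                # Content); a plain 200 means start the file over.
                with open(outfile, "ab" if r.status_code == 206 else "wb") as f:
                    for chunk in r.iter_content(chunk_size=65536):
                        f.write(chunk)
            return outfile
        except OSError:
            if attempt == max_attempts:
                raise
            time.sleep(backoff * attempt)
```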

shubhbapna (Collaborator):

Converting to draft based on the last review comment.

@shubhbapna shubhbapna marked this pull request as draft August 14, 2024 17:46
@shubhbapna shubhbapna closed this Oct 11, 2024
@shubhbapna shubhbapna deleted the network-hiccups branch October 11, 2024 14:55
Linked issue that merging may close: Make fromager more resilient against network hiccups
3 participants