[DOP-22144] Change logic of applying limits #327

Merged 1 commit on Jan 13, 2025

@dolfinus dolfinus commented Jan 13, 2025

Change Summary

After thinking about #326 (review), I've decided that the implementations of FileConnection.walk and FileConnection.listdir should be changed.

Previously: iterate over all files in the directory, add each file to the resulting list, then stop if the limit is reached.
Now: iterate over all files in the directory, check whether the limit is reached, and add the file only if it is not.

This is because if a user has the following list of files:

  • file1.txt (10KB)
  • file2.txt (20KB)
  • file3.txt (30KB)

and sets the limit TotalFileSize("15KB"), FileDownloader/FileMover should download only file1.txt and skip both file2.txt and file3.txt, because 10KB + 20KB already exceeds the 15KB limit. The previous implementation led to downloading file2.txt as well (the limit's internal state had recorded only 10KB, which is less than the limit) and skipping only file3.txt.
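
The difference between the two orderings can be illustrated with a minimal, self-contained sketch. It does not use onETL's actual classes: `TotalSizeLimit`, `walk_old` and `walk_new` are hypothetical stand-ins written only to reproduce the example above, and the file sizes are taken from the list in this description.

```python
from typing import List, Sequence, Tuple

KB = 1024


class TotalSizeLimit:
    """Hypothetical stand-in for a total-size limit (not onETL's real class)."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.seen_bytes = 0

    def add(self, size: int) -> None:
        # Record the size of the file currently being inspected.
        self.seen_bytes += size

    @property
    def is_reached(self) -> bool:
        # True once the accumulated size exceeds the allowed total.
        return self.seen_bytes > self.max_bytes


def walk_old(files: Sequence[Tuple[str, int]], limit: TotalSizeLimit) -> List[str]:
    """Old ordering: include the file first, then check the limit."""
    result = []
    for name, size in files:
        result.append(name)
        limit.add(size)
        if limit.is_reached:
            break
    return result


def walk_new(files: Sequence[Tuple[str, int]], limit: TotalSizeLimit) -> List[str]:
    """New ordering: check the limit first, include the file only if not reached."""
    result = []
    for name, size in files:
        limit.add(size)
        if limit.is_reached:
            break
        result.append(name)
    return result


FILES = [("file1.txt", 10 * KB), ("file2.txt", 20 * KB), ("file3.txt", 30 * KB)]

old_result = walk_old(FILES, TotalSizeLimit(15 * KB))
new_result = walk_new(FILES, TotalSizeLimit(15 * KB))

print(old_result)  # ['file1.txt', 'file2.txt']
print(new_result)  # ['file1.txt']
```

In this sketch the old ordering lets file2.txt through because the limit has only seen 10KB when the file is added, while the new ordering rejects it as soon as the running total passes 15KB.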

This is a potentially breaking change for developers of custom file limits: .is_reached should now return True after the limit is reached, not before.
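
As one possible reading of what this means for a custom limit, here is a hedged sketch of a count-based limit under the new "check first, then include" ordering. The classes `MaxFilesBefore` and `MaxFilesAfter`, the `register` method and the `collect` helper are all hypothetical and only illustrate the semantics; the real base class interface is not shown in this description and may differ.

```python
from typing import List, Sequence


class MaxFilesBefore:
    """Hypothetical custom limit written for the old ordering:
    is_reached flips as soon as the allowed number of files has been seen."""

    def __init__(self, max_files: int) -> None:
        self.max_files = max_files
        self.count = 0

    def register(self) -> None:
        self.count += 1

    @property
    def is_reached(self) -> bool:
        return self.count >= self.max_files


class MaxFilesAfter(MaxFilesBefore):
    """Same limit adapted to the new ordering:
    is_reached flips only after the allowed number has been exceeded."""

    @property
    def is_reached(self) -> bool:
        return self.count > self.max_files


def collect(names: Sequence[str], limit: MaxFilesBefore) -> List[str]:
    """New ordering: update the limit, check it, include the file only if not reached."""
    result = []
    for name in names:
        limit.register()
        if limit.is_reached:
            break
        result.append(name)
    return result


NAMES = ["a.txt", "b.txt", "c.txt"]

unadapted = collect(NAMES, MaxFilesBefore(2))
adapted = collect(NAMES, MaxFilesAfter(2))

print(unadapted)  # ['a.txt']
print(adapted)    # ['a.txt', 'b.txt']
```

Under these assumptions, a limit that still reports "reached" on hitting the threshold would drop one file too many, which is why custom implementations may need to be reviewed.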

Related issue number

Checklist

  • Commit message and PR title is comprehensive
  • Keep the change as small as possible
  • Unit and integration tests for the changes exist
  • Tests pass on CI and coverage does not decrease
  • Documentation reflects the changes where applicable
  • docs/changelog/next_release/<pull request or issue id>.<change type>.rst file added describing change
    (see CONTRIBUTING.rst for details.)
  • My PR is ready to review.

@dolfinus dolfinus self-assigned this Jan 13, 2025
@dolfinus dolfinus changed the title from "[DOP-22144] Change logic of applying limits in FileConnection.walk" to "[DOP-22144] Change logic of applying limits" Jan 13, 2025
@dolfinus dolfinus marked this pull request as ready for review January 13, 2025 14:36

codecov bot commented Jan 13, 2025

Codecov Report

Attention: Patch coverage is 72.72727% with 3 lines in your changes missing coverage. Please review.

Project coverage is 91.85%. Comparing base (20eb931) to head (d080c9e).
Report is 1 commits behind head on develop.

Files with missing lines | Patch % | Lines
...netl/connection/file_connection/file_connection.py | 62.50% | 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #327   +/-   ##
========================================
  Coverage    91.84%   91.85%           
========================================
  Files          227      227           
  Lines         9786     9784    -2     
  Branches       999      998    -1     
========================================
- Hits          8988     8987    -1     
  Misses         605      605           
+ Partials       193      192    -1     


@dolfinus dolfinus merged commit 8286d0f into develop Jan 13, 2025
38 of 39 checks passed
@dolfinus dolfinus deleted the feature/DOP-22144 branch January 13, 2025 15:25