Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DISCO-3210] [load test: abort] Use async gcs client for manifest #793

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Herraj
Copy link
Contributor

@Herraj Herraj commented Feb 13, 2025

References

JIRA: DISCO-3210

Description

PR Review Checklist

Put an x in the boxes that apply

  • This PR conforms to the Contribution Guidelines
  • The PR title starts with the JIRA issue reference, format example [DISCO-####], and has the same title (if applicable)
  • [load test: (abort|skip|warn)] keywords are applied to the last commit message (if applicable)
  • Documentation has been updated (if applicable)
  • Functional and performance test coverage has been expanded and maintained (if applicable)

@Herraj Herraj self-assigned this Feb 13, 2025
return GetManifestResultCode.FAIL, None
except ValidationError as val_err:
logger.error(f"Invalid manifest content: {val_err}")
return GetManifestResultCode.FAIL, None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you can see I've trimmed out a lot of nested logic here but more importantly, I've gotten rid of the SKIP branch.

I'm trying to understand if we really need it? We return the same None data as we do with a FAIL but only the code is different. So logically, I'm seeing it as just another FAIL 🤔


case GetManifestResultCode.SKIP:
logger.info("Manifest data was not updated (SKIP).")
return result_code, None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the above comment, we aren't doing much here with SKIP except have a different log.

@@ -75,12 +75,9 @@ async def _fetch_data(self) -> None:
}
self.last_fetch_at = time.time()

case GetManifestResultCode.SKIP:
return None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above two comments!

self.blob_name = blob_name
self.bucket = Bucket(storage=self.gcs_client, name=gcs_bucket_path)

# TODO figure out this
self.blob_generation = 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand how this helps us? Do we still need it with the async logic? This ties into the SKIP logic for which I've added comments below!

Copy link
Contributor

@misaniwere misaniwere Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah we should keep the SKIP because when we call get_blob, we can add a clause bucket.get_blob(blob_name, if_generation_not_match=blob_generation) so we're not fetching the same file if the version hasn't changed. it's a slight performance improvement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the SKIP code exists to catch cases where we didn't fetch a file because we didn't have an updated file to fetch.

@@ -30,27 +29,22 @@ async def fetch(self) -> tuple[GetManifestResultCode, ManifestData | None]:
(SKIP, None): If there's no new generation (blob unchanged).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably update the docstring for this method

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we even need this method since all it does now is just call fetch_manifest_data. We could just take this out and rename the below method to fetch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants