Hardness benchmark #440

Open
wants to merge 5 commits into
base: main

@ritalyu17 ritalyu17 commented Dec 3, 2024

Work in progress: integrated hardness benchmarking task.

To-do:

  • replace the dataset

CLAassistant commented Dec 3, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ ritalyu17
❌ pre-commit-ci[bot]

@ritalyu17 ritalyu17 marked this pull request as ready for review December 16, 2024 08:11
Author

ritalyu17 commented Dec 16, 2024

The hardness benchmark is ready for review and feedback.

Currently, the Bayesian optimization component and the multi-task component are set up as two separate Benchmark objects. The main reason for separating them is that the arguments passed to simulate_scenarios differ, specifically initial_data. Maybe there is a way to make the code look nicer?
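One possible way to share a single code path between the two benchmarks is to make initial_data optional and forward it only when it is given. This is a minimal sketch, not the PR's code: `run_benchmark` is a hypothetical helper, and `simulate` is a hypothetical stand-in for whatever callable wraps simulate_scenarios in the benchmark.

```python
def run_benchmark(simulate, scenarios, lookup, initial_data=None, **settings):
    """Run one benchmark, passing `initial_data` only when it is provided.

    `simulate` is a hypothetical stand-in for the simulation entry point;
    `settings` carries any remaining keyword arguments unchanged.
    """
    kwargs = dict(settings)
    if initial_data is not None:
        # Only the multi-task variant supplies initial data.
        kwargs["initial_data"] = initial_data
    return simulate(scenarios, lookup, **kwargs)
```

With this shape, both components could call the same helper and differ only in whether they supply initial_data.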

Thank you!

dfComposition_temp = dfComposition_temp.sort_values(by="load")
# if there are any duplicate values for load, drop them
dfComposition_temp = dfComposition_temp.drop_duplicates(subset="load")
# if there are less than 5 values, continue to the next composition
Contributor

Too verbose, I think. Self-explanatory comments like this one can be removed. Overall, there are just too many comments like this.

Author

Fixed

Collaborator

Quick comment from my side, as I also have some remarks regarding comments in my review: I agree with @sgbaird that such individual line comments are not necessary. However, I would appreciate a few more "high-level" comments like "Filtering compositions for which fewer than 5 hardness values are available", describing what a full block of code is doing.

Note that I only unresolved this comment to make it easier for you to spot my comment here, feel free to resolve it again right away :)

Collaborator

AVHopp commented Dec 19, 2024

Just FYI: I will give my review here in mid-January :)

Collaborator

@AVHopp AVHopp left a comment

First of all, thanks for the benchmark :) This is a very first and quick review, since I think that minor changes from your end will simplify the review process for me quite significantly. Also, note that there was a PR involving the lookup mechanism (#441). This might (or might not) have an influence on your benchmark here.

Hence, I would appreciate it if you could rebase your example onto main, verify that this benchmark is compatible with the new lookup, and incorporate the first batch of comments. Then I'll be more than happy to give it a full and proper review :)

# create a list of dataframes with n samples from dfLookupTable_source to use as initial data
lstInitialData_temp = [dfLookupTable_source.sample(n) for _ in range(settings.n_mc_iterations)]

return simulate_scenarios(
Collaborator

Something is weird here: You only ever call this with the latest value of n, which is 30. Why do you then create several different campaigns and lists?

Author

Thank you for pointing that out. Upon review, I realized I'd like to work with the same campaign but with different initial data sizes. Since the initial_data argument is only used in simulate_scenarios, do you have any suggestions on how I could do this elegantly? E.g., could initial_data be specified in the Campaign class?
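A rough sketch of one way to reuse the same campaign across several initial data sizes: build the samples per size up front, then run one simulation per size so that no size is silently overwritten by the last one. The names `build_initial_data`, `run_for_all_sizes`, and `simulate` are hypothetical and not part of BayBE; `simulate` stands in for a wrapper around simulate_scenarios. Plain lists of rows are used here to keep the sketch dependency-free, whereas the PR samples from a DataFrame with `.sample(n)`.

```python
import random


def build_initial_data(lookup_rows, sizes, n_mc_iterations, seed=0):
    """Return {size: [sample_1, ..., sample_k]} with k = n_mc_iterations."""
    rng = random.Random(seed)
    return {
        n: [rng.sample(lookup_rows, n) for _ in range(n_mc_iterations)]
        for n in sizes
    }


def run_for_all_sizes(simulate, campaign, lookup_rows, sizes, n_mc_iterations):
    """Call `simulate` once per initial-data size, reusing the same campaign."""
    initial_data = build_initial_data(lookup_rows, sizes, n_mc_iterations)
    results = {}
    for n, samples in initial_data.items():
        # One call per size, so every size is simulated, not only the last one.
        results[n] = simulate(campaign, initial_data=samples)
    return results
```

The results are keyed by initial data size, so they could later be combined into a single table with the size as an extra column.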

Collaborator

AVHopp commented Jan 28, 2025

Hello @ritalyu17, just for your information: my workload has shifted quite a bit, and it might take some time for me to properly review here. Just wanted to inform you about this :)

@ritalyu17
Author

Thanks for the information. No rush.

Collaborator

@AdrianSosic AdrianSosic left a comment

Hi @ritalyu17, I can take care of further integration but would like to ask you for two things before I start with my review:

  • Can you please rebase the branch on top of the latest main? That is, we need to build the PR on the latest version of the benchmarking module, and I'd like to get rid of all the unnecessary merge commits since your PR is pretty much orthogonal to what else happens in the repo.
  • Can you reformat your files to make them compatible with our code conventions? For that, please have a look at any other module of the repo and you'll see what I mean. For example, we should consistently use snake_case for variable names and CamelCase for type definitions.

Please ping me once the changes are incorporated (also in the other PR) and I'll have a look 🙃

@AdrianSosic
Collaborator

Hi @ritalyu17, any updates from your end?

Author

ritalyu17 commented Feb 25, 2025 via email

@ritalyu17 ritalyu17 reopened this Mar 1, 2025
6 participants