Load Alibaba Trace in Batches in Simulator #67
Merged
ruizehung changed the title from "Initial attempt to load alibaba trace in chunks" to "Load Alibaba Trace in Batches in Simulator" on Nov 9, 2023
ruizehung changed the title from "Load Alibaba Trace in Batches in Simulator" to "Load Alibaba Trace in Batches in Simulator (In Progress)" on Nov 9, 2023
…ept a start time offset?
…nction to workload class to make workload evolvable
- Add __assert_task_has_not_been_added_to_event_queue_before in simulator to help debugging
- Add some more log statements in __get_next_jobs
…t be provided when creating simulator
- Attempt to use self._randomize_start_time_max * self._job_graph_batch as the time at which we add the LOAD_NEW_JOBS event
- Give AlibabaLoader its own random instance to ensure reproducibility
…tart time offset when calling get_next_jobs
sukritkalra approved these changes on Nov 12, 2023
Seems like the issues are fixed now, but we'll keep a close eye on any Simulator abnormalities.
The tests are failing, but not because of the changes in this PR; they're failing on main too.
ruizehung changed the title from "Load Alibaba Trace in Batches in Simulator (In Progress)" to "Load Alibaba Trace in Batches in Simulator" on Nov 17, 2023
Summary
This PR updates `AlibabaLoader` and `Simulator` so that `Simulator` can load job graphs in batches and gradually add their tasks to the event queue.
- Add a `JobGraphLoader` class as a generalized job graph loader that can dynamically load job graphs in the simulator.
- Update `AlibabaLoader` to implement `JobGraphLoader`.
- Update `Simulator.dry_run` and `Simulator.simulate` to accommodate the use of `JobGraphLoader`.
- Add `Workload.add_job_graphs`.
- Add a `batch_size_job_loading` flag in `main.py` to allow config files to specify loading the workload in batches.
Test Plan
Unit test
It's failing: https://app.warp.dev/block/AIe92hXlVj2eDh2ZOsRB8L
- `tests/test_tetrisched_scheduler.py::test_tetrisched_task_graph_strl_generation_simple` is already failing in `main`.
- I haven't figured out why `tests/test_simulator.py::test_simulator_handle_event` failed.
Dry run `configs/alibaba_trace.conf`:
Output:
Simulate Alibaba Trace
`python3 main.py --flagfile=configs/alibaba_trace.conf`
`alibaba_trace_replay.log`: https://gist.github.com/ruizehung/0b58f26ad07cd53dc5e1ac51ef886fa6
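For readers skimming the PR, the batched loading described in the Summary can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `JobGraphLoader`, `get_next_jobs`, and `LOAD_NEW_JOBS` are names that appear in the PR, but their signatures here are assumptions, and `JobGraph`, `ListBackedLoader`, `TASK_RELEASE`, `load_interval`, and the standalone `simulate` function are hypothetical stand-ins. The loader keeps its own `random.Random` instance so that release times do not depend on the global random state.

```python
import heapq
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence, Tuple

LOAD_NEW_JOBS = "LOAD_NEW_JOBS"  # event name taken from the PR's commit messages
TASK_RELEASE = "TASK_RELEASE"    # hypothetical event name for this sketch


@dataclass(frozen=True)
class JobGraph:
    """Hypothetical stand-in for a job graph: a name and a release time."""
    name: str
    release_time: int


class JobGraphLoader(ABC):
    """Assumed interface: hands out job graphs one batch at a time."""

    @abstractmethod
    def get_next_jobs(self, start_time_offset: int = 0) -> List[JobGraph]:
        """Return the next batch, or an empty list when the trace is exhausted."""


class ListBackedLoader(JobGraphLoader):
    """Hypothetical loader that serves job names from an in-memory list."""

    def __init__(self, job_names: Sequence[str], batch_size: int,
                 randomize_start_time_max: int = 0, seed: int = 42) -> None:
        self._job_names = list(job_names)
        self._batch_size = batch_size
        self._randomize_start_time_max = randomize_start_time_max
        # Private RNG: unaffected by calls to the global `random` module.
        self._rng = random.Random(seed)
        self._cursor = 0

    def get_next_jobs(self, start_time_offset: int = 0) -> List[JobGraph]:
        batch = self._job_names[self._cursor:self._cursor + self._batch_size]
        self._cursor += len(batch)
        return [
            JobGraph(name, start_time_offset
                     + self._rng.randint(0, self._randomize_start_time_max))
            for name in batch
        ]


def simulate(loader: JobGraphLoader, load_interval: int) -> List[Tuple[int, str]]:
    """Process events in time order, pulling a new batch on each LOAD_NEW_JOBS."""
    # Heap entries are (time, sequence number, event kind, job name).
    events = [(0, 0, LOAD_NEW_JOBS, "")]
    seq = 1
    released: List[Tuple[int, str]] = []
    while events:
        time, _, kind, name = heapq.heappop(events)
        if kind == LOAD_NEW_JOBS:
            jobs = loader.get_next_jobs(start_time_offset=time)
            if not jobs:
                continue  # trace exhausted: schedule no further loads
            for job in jobs:
                heapq.heappush(events, (job.release_time, seq, TASK_RELEASE, job.name))
                seq += 1
            # Schedule the next batch load instead of loading everything up front.
            heapq.heappush(events, (time + load_interval, seq, LOAD_NEW_JOBS, ""))
            seq += 1
        else:
            released.append((time, name))
    return released
```

In this sketch only one batch of job graphs is materialized per `LOAD_NEW_JOBS` event, which is the general idea behind loading the trace in batches. How the actual `Simulator` chooses the time of the next load event is not shown here.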