-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RPC] Fetch TPCH profiled info for application at runtime #92
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial pass.
rpc/service.py
Outdated
@@ -19,6 +19,7 @@ | |||
|
|||
from schedulers import EDFScheduler, FIFOScheduler | |||
from utils import EventTime, setup_logging | |||
from utils_tpch import get_all_stage_info_for_query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of making this a top-level utils_tpch.py
, can you make a folder inside rpc
named tpch
and name it utils.py
inside?
Please also include the *.npy
files that we are using for initial testing from Decima.
rpc/service.py
Outdated
default_resource = Resources(resource_vector={ | ||
Resource(name="Slot_CPU", _id="any"): 30 | ||
} | ||
) | ||
default_runtime = EventTime(20, EventTime.Unit.US) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you using this in the code below?
rpc/service.py
Outdated
for task_dependency in request.dependencies: | ||
framework_task = task_dependency.key | ||
if is_tpch_query: | ||
task_slots = tpch_query_all_stage_info[framework_task.id]["num_tasks"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain how you're matching the correct stage with its actual resource usage and runtime here?
utils_tpch.py
Outdated
import numpy as np | ||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The build is failing, please look at the errors and resolve them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM since tested.
Minor comments
default_resource = Resources( | ||
resource_vector={Resource(name="Slot_CPU", _id="any"): 30} | ||
) | ||
default_runtime = EventTime(20, EventTime.Unit.US) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default_runtime = EventTime(20, EventTime.Unit.US) | |
default_runtime = EventTime(20, EventTime.Unit.S) |
Do you have any runtimes that are lower than a second?
def get_all_stage_info_for_query(query_num): | ||
task_durations = np.load( | ||
os.path.join( | ||
HOME_TPCH_DIR, TPCH_SUBDIR, "task_duration_" + str(query_num) + ".npy" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we get the TPCH_SUBDIR
label from the application name?
We might want to run a mix of TPC-H queries from different scaling sizes.
return False, None | ||
|
||
|
||
def verify_and_relable_tpch_app_graph(query_num, dependencies): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def verify_and_relable_tpch_app_graph(query_num, dependencies): | |
def verify_and_relabel_tpch_app_graph(query_num, dependencies): |
Typo
No description provided.