-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: commit classes trie concurrently #315
refactor: commit classes trie concurrently #315
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #315 +/- ##
==========================================
+ Coverage 70.59% 72.00% +1.40%
==========================================
Files 38 38
Lines 2095 2211 +116
Branches 2095 2211 +116
==========================================
+ Hits 1479 1592 +113
- Misses 546 547 +1
- Partials 70 72 +2 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @nimrod-starkware and @TzahiTaub)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 61 at r1 (raw file):
Arc::new(updated_forest.classes_trie), Arc::new(classes_updates), );
does the task start running the moment it's created?
in python, task t
will not start running until someone await
s it;
perhaps you need to add this task to the JoinSet
below?
And maybe better to create a top-level JoinSet with two tasks: one for the contracts tree and one for the classes tree (contracts tree JoinSet will have many storage tree tasks - i.e. it opens an inner JoinSet).
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware and @TzahiTaub)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 61 at r1 (raw file):
Previously, dorimedini-starkware wrote…
does the task start running the moment it's created?
in python, taskt
will not start running until someoneawait
s it;
perhaps you need to add this task to theJoinSet
below?
And maybe better to create a top-level JoinSet with two tasks: one for the contracts tree and one for the classes tree (contracts tree JoinSet will have many storage tree tasks - i.e. it opens an inner JoinSet).
WDYT?
good catch!
it didn't start running; now it starts immediately when using tokio::spawn.
is there any benefit to having two joinsets?
52b93d4
to
5a8daeb
Compare
Benchmark movements: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @TzahiTaub)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 61 at r1 (raw file):
Previously, nimrod-starkware wrote…
good catch!
it didn't start running; now it starts immediately when usingtokio::spawn.
is there any benefit to having two joinsets?
one inside the other, I think:
await JoinSet(class_trie_commitment_task, contract_trie_commitment_task)
where contract_tree_commitment_task
is:
- opening a
JoinSet
with all storage tries - awaiting them all
- computing the contract trie
however, if tokio::spawn
already starts the task then this current way is fine.
I'm only concerned about work stealing: if tokio::spawn
moves context from current function to ClassesTrie::create
, then you will start working on the classes trie before the other task(s) are spawned.
Maybe better to create both class_trie_commitment_task
and contract_trie_commitment_task
without them actually running, and then open a JoinSet
for the two of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! all files reviewed, all discussions resolved (waiting on @TzahiTaub)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 61 at r1 (raw file):
Previously, dorimedini-starkware wrote…
one inside the other, I think:
await JoinSet(class_trie_commitment_task, contract_trie_commitment_task)
where
contract_tree_commitment_task
is:
- opening a
JoinSet
with all storage tries- awaiting them all
- computing the contract trie
however, if
tokio::spawn
already starts the task then this current way is fine.
I'm only concerned about work stealing: iftokio::spawn
moves context from current function toClassesTrie::create
, then you will start working on the classes trie before the other task(s) are spawned.Maybe better to create both
class_trie_commitment_task
andcontract_trie_commitment_task
without them actually running, and then open aJoinSet
for the two of them?
is that better now?
5a8daeb
to
e91c9a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @nimrod-starkware and @TzahiTaub)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 61 at r1 (raw file):
Previously, nimrod-starkware wrote…
is that better now?
see below, maybe it's clearer
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 102 at r3 (raw file):
Arc::new(contracts_trie_modifications), ); let (classes_trie, contracts_trie) = join!(classes_trie_task, contracts_trie_task);
- IIRC
join!
doesn't work as we would expect here, please confer with @aner-starkware - Now, as I understand it, you (a) compute storage tries, and (b) in parallel, compute the classes and the contracts tries. (a) and (b) are sequential. However, the classes trie computation is orthogonal to the storage/contract trie computations... execution tree should look something like this:
await join(
task { compute(classes_trie) },
task {
await join(
task { compute(storage_trie) } for storage_trie in storage_tries
);
task { compute(contract_trie) }
}
)
it currently looks like this:
await join(
task { compute(storage_trie) } for storage_trie in storage_tries
);
await join(
task { compute(contract_trie) },
task { compute(classes_trie) },
)
Code quote:
let classes_trie_task = ClassesTrie::create::<TH>(
Arc::new(updated_forest.classes_trie),
Arc::new(classes_updates),
);
let contracts_trie_task = ContractsTrie::create::<TH>(
Arc::new(updated_forest.contracts_trie),
Arc::new(contracts_trie_modifications),
);
let (classes_trie, contracts_trie) = join!(classes_trie_task, contracts_trie_task);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @aner-starkware and @nimrod-starkware)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 65 at r2 (raw file):
let mut contracts_trie_modifications = HashMap::new(); let mut filled_storage_tries = HashMap::new(); let mut tasks = JoinSet::new();
Suggestion:
contracts_state_tasks
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 102 at r3 (raw file):
Previously, dorimedini-starkware wrote…
- IIRC
join!
doesn't work as we would expect here, please confer with @aner-starkware- Now, as I understand it, you (a) compute storage tries, and (b) in parallel, compute the classes and the contracts tries. (a) and (b) are sequential. However, the classes trie computation is orthogonal to the storage/contract trie computations... execution tree should look something like this:
await join( task { compute(classes_trie) }, task { await join( task { compute(storage_trie) } for storage_trie in storage_tries ); task { compute(contract_trie) } } )
it currently looks like this:
await join( task { compute(storage_trie) } for storage_trie in storage_tries ); await join( task { compute(contract_trie) }, task { compute(classes_trie) }, )
Does the join
call order matter to anything? I would assume only the tasks creation order (the spawn
call) is important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @aner-starkware and @nimrod-starkware)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 102 at r3 (raw file):
Previously, TzahiTaub (Tzahi) wrote…
Does the
join
call order matter to anything? I would assume only the tasks creation order (thespawn
call) is important.
the tasks.join_next().await
above matters, it blocks (so class trie and contract trie tasks are only spawned after all storage tries are complete)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @aner-starkware and @nimrod-starkware)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 97 at r3 (raw file):
Arc::new(updated_forest.classes_trie), Arc::new(classes_updates), );
This seems worst to me unless I'm missing something. Now we don't start computing the classes tri until after the storage tries computation finishes.
Code quote:
let classes_trie_task = ClassesTrie::create::<TH>(
Arc::new(updated_forest.classes_trie),
Arc::new(classes_updates),
);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @aner-starkware and @TzahiTaub)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 61 at r1 (raw file):
I'm only concerned about work stealing: if
tokio::spawn
moves context from current function toClassesTrie::create
, then you will start working on the classes trie before the other task(s) are spawned.
The spawn
creates another task (which I guess will run on another core if we have more than one). The main thread will spawn the tasks for the storage tries and I think it will run in parallel, no?
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 97 at r3 (raw file):
Previously, TzahiTaub (Tzahi) wrote…
This seems worst to me unless I'm missing something. Now we don't start computing the classes tri until after the storage tries computation finishes.
I agree
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 102 at r3 (raw file):
Previously, dorimedini-starkware wrote…
the
tasks.join_next().await
above matters, it blocks (so class trie and contract trie tasks are only spawned after all storage tries are complete)
you are right; the join!
runs both tasks concurrently but not in parallel.
How about now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @aner-starkware and @TzahiTaub)
crates/committer/src/patricia_merkle_tree/filled_tree/forest.rs
line 61 at r1 (raw file):
Previously, nimrod-starkware wrote…
I'm only concerned about work stealing: if
tokio::spawn
moves context from current function toClassesTrie::create
, then you will start working on the classes trie before the other task(s) are spawned.The
spawn
creates another task (which I guess will run on another core if we have more than one). The main thread will spawn the tasks for the storage tries and I think it will run in parallel, no?
ohh now i understand... the classes trie on it's own may use more than one core..
e91c9a6
to
e103dbe
Compare
Benchmark movements: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @aner-starkware)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @aner-starkware)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! all files reviewed, all discussions resolved (waiting on @aner-starkware)
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)