
fix: solve deep recursion issues on some ZDag functions #1183

Merged: 1 commit merged into main from ap/fix-z-dag-deep-recursions on Mar 6, 2024

Conversation

arthurpaulino (Contributor, author):

Now that we're sending Lurk data through the wire, it's important that we don't hit recursion depth limits on arbitrarily deep data.

This PR closes #1021

@arthurpaulino requested review from a team as code owners on February 28, 2024 22:20
@arthurpaulino enabled auto-merge on March 2, 2024 14:17
Comment on lines 141 to 151
```rust
macro_rules! feed_loop {
    ($x:expr) => {
        if !cache.contains_key(&$x) {
            if ptrs.insert($x) {
                stack.push($x);
            }
        }
    };
}
```
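
(For context on how such a macro is typically driven, below is a minimal, self-contained sketch of the collection phase as an iterative worklist: pop a pointer, feed its children back in, and never recurse natively. The `u32` pointers, the `children` map, and the name `collect_reachable` are hypothetical stand-ins rather than the PR's actual types; `IndexSet` is from the `indexmap` crate, as in the diff.)

```rust
use indexmap::IndexSet;
use std::collections::HashMap;

/// Collect every pointer reachable from `root` that isn't already cached,
/// without native recursion. Toy types; illustrative only.
fn collect_reachable(
    root: u32,
    children: &HashMap<u32, Vec<u32>>,
    cache: &HashMap<u32, u64>,
) -> IndexSet<u32> {
    let mut ptrs: IndexSet<u32> = IndexSet::default();
    let mut stack: Vec<u32> = Vec::new();

    macro_rules! feed_loop {
        ($x:expr) => {
            // schedule a pointer only if it is neither cached nor already collected
            if !cache.contains_key(&$x) {
                if ptrs.insert($x) {
                    stack.push($x);
                }
            }
        };
    }

    feed_loop!(root);
    while let Some(ptr) = stack.pop() {
        for &child in children.get(&ptr).into_iter().flatten() {
            feed_loop!(child);
        }
    }
    ptrs
}
```

Because the guard skips anything already cached or already queued, each pointer is pushed at most once, so the loop terminates even on shared, DAG-shaped data.
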
huitseeker (Contributor):

Is it wise to collect a transitive depth-first traversal from `ptr` and only recurse afterwards? That is, I expect the `ptrs` structure could get really huge.

Another way to remove the recursion limit would indeed be to use a stack, but to change the signature of `recurse` from `recurse(ptr, cache)` to `recurse(stack, cache)`. In each call you'd pop an item off the stack, do what you already do in `recurse(ptr, cache)`, and populate the stack with the same `feed_loop` logic used below. I think this might lead to the same order of processing, but it would not require materializing the whole pointer graph in a stack ahead of time, leading to nicer memory usage.

Is there a reason to do this full initial graph traversal?
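
(To make this concrete, here is a rough, self-contained sketch of the interleaved `recurse(stack, cache)` shape being suggested, with hypothetical `u32` pointers, an explicit `children` map, and a placeholder for the per-node work. As the reply below explains, this shape does not directly fit the hashing use case, where a node's result depends on its children's already-computed results.)

```rust
use std::collections::{HashMap, HashSet};

type Ptr = u32;

/// Interleaved worklist: process a node as soon as it is popped, then
/// feed its children. Illustrative only; not the PR's code.
fn recurse(
    stack: &mut Vec<Ptr>,
    cache: &mut HashMap<Ptr, u64>,
    children: &HashMap<Ptr, Vec<Ptr>>,
) {
    let mut queued: HashSet<Ptr> = stack.iter().copied().collect();
    while let Some(ptr) = stack.pop() {
        if cache.contains_key(&ptr) {
            continue;
        }
        // ... do the per-node work the old recurse(ptr, cache) did ...
        cache.insert(ptr, u64::from(ptr)); // placeholder result
        // then feed the stack with this node's children (the feed_loop logic)
        for &child in children.get(&ptr).into_iter().flatten() {
            if !cache.contains_key(&child) && queued.insert(child) {
                stack.push(child);
            }
        }
    }
}
```

Here each node is handled the moment it is popped, so the worklist only ever holds a frontier rather than the whole reachable set.
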

arthurpaulino (Contributor, author):

The reason is that we need to go from the leaves to the root. This is similar to what happens during hydration: in order to hash a `Ptr`, we need the hashes of its children.

> Then in each call, you'd pop an item off the stack, do what you already do in recurse(ptr, cache)

That's not possible, because of what I said above. The reason to do the initial graph traversal is to populate the cache with the hashes of all the children that are required.
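
(One standard way to get this leaves-to-root, recursion-free processing is an explicit stack of two kinds of tasks: a "visit" task schedules a "hash" task for a node after visit tasks for all of its children, so hashing only ever does cache lookups. The sketch below is illustrative only and not the PR's implementation: it uses toy `usize` nodes, an adjacency-list `dag`, and a dummy hash combiner, and it assumes the graph is acyclic, as ZDags are.)

```rust
use std::collections::{HashMap, HashSet};

type Node = usize;

enum Task {
    Visit(Node),
    Hash(Node),
}

/// Hash every node reachable from `root`, children before parents,
/// without native recursion. `dag[n]` lists the children of node `n`.
fn hash_from_leaves(root: Node, dag: &[Vec<Node>], cache: &mut HashMap<Node, u64>) -> u64 {
    let mut seen: HashSet<Node> = HashSet::new();
    let mut stack = vec![Task::Visit(root)];
    while let Some(task) = stack.pop() {
        match task {
            Task::Visit(n) => {
                if cache.contains_key(&n) || !seen.insert(n) {
                    continue; // already hashed or already scheduled
                }
                // schedule hashing `n` only after all of its children
                stack.push(Task::Hash(n));
                stack.extend(dag[n].iter().map(|&c| Task::Visit(c)));
            }
            Task::Hash(n) => {
                // by now every child of `n` is guaranteed to be in the cache
                let h = dag[n]
                    .iter()
                    .map(|c| cache[c])
                    .fold(n as u64 + 1, |acc, ch| acc.rotate_left(7) ^ ch);
                cache.insert(n, h);
            }
        }
    }
    cache[&root]
}
```

The explicit task stack plays the role of the native call stack in a recursive version, so arbitrarily deep data only grows heap usage instead of hitting the recursion depth limit.
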

huitseeker (Contributor):

I have no problem with the depth-first search. My point is that for each node with two children (which you'll deal with one after the other), before processing them you will materialize in a data structure all the descendants of both children, rather than only the descendants of the one you will process first.

I believe that if you alternate between stacking more elements and processing those elements, you can get better memory utilization, more cache hits, and potentially marginally less code.

arthurpaulino (Contributor, author):

`recurse` doesn't call `recurse`. I don't think the change you want is that simple; it would require rethinking the whole implementation.

arthurpaulino (Contributor, author), Mar 6, 2024:

@huitseeker let me try to give some perspective on this.

We can't solve this recursively; otherwise we run into the original problem of recursion depth. The solution has to feed the cache iteratively, making every call to `recurse` trivial (every child will already be cached).

On the memory issue, I think it's really minor. The memory consumed by populating those `IndexMap`s is probably negligible compared with the bigger problem we're solving, which is proving. Also, the entirety of the DAG will have to end up inhabiting the `Store` (or the `ZStore`, depending on the direction we're going) anyway.

Nevertheless, I opened #1197 so we can deal with that later. For now, I don't think this is worth more cycles. Please let me know what you think.

```
@@ -191,10 +236,33 @@ impl<F: LurkField> ZDag<F> {
                Ok(ptr)
            }
        };
        recurse(z_ptr)
        let mut z_ptrs: IndexSet<&ZPtr<F>> = IndexSet::default();
```
huitseeker (Contributor):

Roughly the same comments as above apply.

@arthurpaulino force-pushed the ap/fix-z-dag-deep-recursions branch from 057aa42 to bc052d9 on March 4, 2024 16:57
arthurpaulino (Contributor, author):

@huitseeker I've updated the PR to use a better name for what I had previously called `ptrs`.

@arthurpaulino force-pushed the ap/fix-z-dag-deep-recursions branch 2 times, most recently from 7e13ee5 to c63f828 on March 5, 2024 16:02
@arthurpaulino force-pushed the ap/fix-z-dag-deep-recursions branch from c63f828 to c72f63d on March 6, 2024 14:07
huitseeker (Contributor) left a review comment:

Agreed

@arthurpaulino added this pull request to the merge queue on Mar 6, 2024
Merged via the queue into main with commit 013410f Mar 6, 2024
11 checks passed
@arthurpaulino deleted the ap/fix-z-dag-deep-recursions branch on March 6, 2024 23:22
arthurpaulino added a commit that referenced this pull request Mar 7, 2024
PR #1183 didn't include the logic for environments when collecting
the children of pointers. We didn't see the error because our tests
weren't checking for environment roundtrip through Z data.

Here we fix that bug and also enhance the tests to avoid future
regressions.
github-merge-queue bot pushed a commit that referenced this pull request on Mar 7, 2024, with the same message.
arthurpaulino added commits that referenced this pull request on Mar 7 and Mar 8, 2024:

It's safer to use a new variant `RawPtr::Env` to signal raw pointers
for environments than relying solely on tag checks. This would
probably have avoided the bug introduced in #1183, whose fix was
implemented in #1200
Linked issue: Potential recursion depth limit in ZDag