Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run on.exit() expressions when loops break early #58

Merged
merged 13 commits into from
Nov 5, 2024
Merged

Conversation

lionel-
Copy link
Member

@lionel- lionel- commented Oct 31, 2024

Fixes #52.

A generator is allowed to register exit handlers via on.exit() or similar (e.g. withr::defer()). Currently these exit expressions only run when the generator returns, or on unexpected exit such as an error happening while the generator is executing. They are not run when an iteration driver decides to cancel the iteration, or is interrupted by an unexpected exit. E.g.:

g <- coro::generator(function() {
  on.exit(cat("cleanup\n"))
  yield(1)
  yield(2)
})

# Cleanup not run
coro::loop(for (i in g()) {
  break
})

# Cleanup not run
coro::loop(for (i in g()) {
  stop("foo")
})

To fix this:

  • We introduce the notion of closing an iterator. A closable iterator has a close argument in its formals. When it does, coro loop drivers (either coro::loop() or for loops inside generators) call the iterator with close = TRUE when iteration has finished or is interrupted. The iterator can then clean up resources.

    A closable iterator should ignore any further call with close = TRUE as coro and other iteration drivers are allowed to do that multiple times to simplify their implementations. They are also allowed to call an exhausted iterator with close = TRUE, even though resource cleanup should have been accomplished already.

  • Generators now produce iterators with a close argument. Closing a generator has two effects: First, all iterators in active for loops are closed, from the most nested to the least nested one. This triggers a cleanup chain across generators/iterators. Second, the generator's own on.exit() expressions are run.

  • coro::loop() now closes its iterator on exit.

The same applies to async generators.

Note that Python also closes generators on garbage collection. I decided not to do that because resource cleanup should be hierarchical. Instead, it's up to the generator caller to ensure proper closing of their generator. The easiest way is to use coro::loop(). Javascript doesn't close on GC either, see https://esdiscuss.org/topic/garbage-collection-in-generators for some discussion.

@lionel- lionel- requested a review from jcheng5 October 31, 2024 11:19
@jcheng5
Copy link
Member

jcheng5 commented Nov 5, 2024

This looks great to me! I don't have the context to review the implementation but I tested some scenarios and everything worked as I thought it might.

The one area I was slightly surprised by is when the for-loop body raises an exception, and an on.exit cleanup expression also raises an exception (which is always a tough situation to do anything sensible about).

To set a precedent, let's take coro totally out of it and try a regular on.exit:

> (function() {
+   on.exit(stop("boom2"))
+   stop("boom")
+ })()
Error in (function() { : boom
Error in (function() { : boom2

I'm a little surprised both errors print; if I put a try() around it, only boom2 is shown. I'm guessing this will be immediately intuitive to you why this happens.

Synchronous coro:

> g <- coro::generator(function() {
+   on.exit(stop("boom2"))
+   yield(1)
+   yield(2)
+ })
> 
> coro::loop(
+   for (i in g()) {
+     stop("boom")
+   }
+ )
Error in eval_bare(loop, env) : boom
Error in evalq(envir = user_env, base::evalq(envir = rlang::wref_key(<weak reference>),  : 
  boom2

Pretty similar.

Async coro:

> g <- coro::async_generator(function() {
+     on.exit(stop("boom2"))
+     yield(1)
+     yield(2)
+ })
> 
> coro::async(function() {
+     for (i in await_each(g())) {
+         stop("boom")
+     }
+ })()
> 
Unhandled promise error: boom2

No boom1, it's been lost along the way.

I just thought this was interesting, I'm not saying anything needs to change about this behavior, necessarily. Like I said, cleanup code throwing exceptions is almost always a bad time.

@lionel-
Copy link
Member Author

lionel- commented Nov 5, 2024

Good catch! I tried to model base R's behaviour in generators but for async functions that's trickier. The thing is that in non async code, errors are handled "on the spot" by R's default handler (the one that prints the message to stderr and jumps to top level). I'm not sure what would be the ideal behaviour in async functions.

Since that's a rather specific edge case, I agree we can just merge this as is for now as it's a vast improvement for generator cleanups and revisit the async case in the future.

@lionel- lionel- merged commit 1235934 into main Nov 5, 2024
10 checks passed
@lionel- lionel- deleted the feature/close branch November 5, 2024 07:50
@lionel-
Copy link
Member Author

lionel- commented Nov 5, 2024

oh probably worth noting that the async case is consistent with:

 f <- function() {
   on.exit(stop("boom2"))
   stop("boom")
}
tryCatch(
  f(),
  error = identity
)
#> <simpleError in f(): boom2>

And in Positron we use global handlers to capture errors and see this same behaviour where the first error silently disappears:

f()
#> Error in `f()`:
#> ! boom2

So I think this is just a consequence of handling errors at catch sites rather than at error sites.

@jcheng5
Copy link
Member

jcheng5 commented Nov 6, 2024

Makes total sense. If anything, seems like maybe I should have put more thought into how the promises package could do condition handling more like R instead of doing a pretty literal port of JavaScript's promises.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

on.exit() only called when iterator is exhausted
2 participants