You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
So just for my own amusement, I tried the stupidest memoization I could think of, to see if it helped. I figured hey, I already did the tracing stuff, which means that a single call is wrapping every rule check, perfect place to put memoization, right? The diffs are at https://gist.github.com/rlpowell/43ff4ab1550004b8a802406b49f7b4d2 , but all it really is is I added:
memo: Box<HashMap<String, &'static str>>,
to ParserState (yes, it's a &str, I was going for quick and dirty) and lines like this to the tracing module wherever a production failed:
This took my 5 character pathological example from ~390 seconds to 0.4 seconds. (Doing the equivalent thing with successful productions just doesn't work, and given how well failed-only worked I didn't bother to figure out why.)
So, yeah. I'd prefer to stay with Pest, partly because sunk costs and partly because it's the most popular Rust PEG parser, so if there's a version of this that y'all would be willing to mainline, please let me know; I have zero interest in maintaining my own fork.
If you do want me to polish this up, I'd love a pointer to a real-time place to discuss it, as that pretty clearly isn't the Discord.
FWIW, it turns out ( https://docs.rs/peg/latest/peg/#caching-and-left-recursion ) that rust-peg is also not a packrat parser by default, but you can mark individual productions for caching. It seems very Rust-y that the top two PEG parsers are both like "make your grammar performant that's not our job". :) rust-peg also has left-recursion support via caching, which my project also needs, so if Pest doesn't work out I'll go there. But I dunno, maybe y'all want to do a similar thing where productions can be marked for caching
So just for my own amusement, I tried the stupidest memoization I could think of, to see if it helped. I figured hey, I already did the tracing stuff, which means that a single call is wrapping every rule check, perfect place to put memoization, right? The diffs are at https://gist.github.com/rlpowell/43ff4ab1550004b8a802406b49f7b4d2 , but all it really is is I added:
to ParserState (yes, it's a &str, I was going for quick and dirty) and lines like this to the tracing module wherever a production failed:
memo_string
there is the rule plus the position.This took my 5 character pathological example from ~390 seconds to 0.4 seconds. (Doing the equivalent thing with successful productions just doesn't work, and given how well failed-only worked I didn't bother to figure out why.)
So, yeah. I'd prefer to stay with Pest, partly because sunk costs and partly because it's the most popular Rust PEG parser, so if there's a version of this that y'all would be willing to mainline, please let me know; I have zero interest in maintaining my own fork.
If you do want me to polish this up, I'd love a pointer to a real-time place to discuss it, as that pretty clearly isn't the Discord.
FWIW, it turns out ( https://docs.rs/peg/latest/peg/#caching-and-left-recursion ) that rust-peg is also not a packrat parser by default, but you can mark individual productions for caching. It seems very Rust-y that the top two PEG parsers are both like "make your grammar performant that's not our job". :) rust-peg also has left-recursion support via caching, which my project also needs, so if Pest doesn't work out I'll go there. But I dunno, maybe y'all want to do a similar thing where productions can be marked for caching
Originally posted by @rlpowell in pest-parser/pest#1081 (comment)
The text was updated successfully, but these errors were encountered: