Endleaf rejiggle, .fsm syntax for endids, refactor endid get/set api, various related bugfixes #479
Merged
These are optional. I'd expected this to be per state at file scope, but I don't like that the syntax would then allow attempting to attach an id to a non-accepting state. Since we have a block for `end:` already, adding the end ids here means they can't be mixed up. This looks like:

```
; ./build/bin/re -r pcre -bpl fsm 'ab?c' 'abc*'
0 -> 1 "a"; # e.g. "a"
1 -> 2 "b"; # e.g. "ab"
1 -> 3 "c"; # e.g. "ac"
2 -> 4 "c"; # e.g. "abc"
4 -> 5 "c"; # e.g. "abcc"
5 -> 5 "c"; # e.g. "abcc"

start: 0;
end: 2 = [0], 3 = [0], 4 = [0, 1], 5 = [0];
```
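To make the shape of that `end:` block concrete, here is a minimal sketch of a formatter that produces the same notation from an in-memory list of end states. Everything named `demo_*` is hypothetical and is not libfsm's printer; the single-line layout and the bare state number for an end state without ids are assumptions based on the example above and on the ids being optional.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one end state and its (optional) end ids. */
struct demo_end {
	unsigned state;
	const unsigned *ids; /* may be NULL when n is 0 */
	size_t n;
};

/* Append a string to buf at *off; returns -1 if it would not fit. */
static int
demo_append(char *buf, size_t len, size_t *off, const char *s)
{
	size_t n = strlen(s);

	if (*off + n + 1 > len) {
		return -1;
	}
	memcpy(buf + *off, s, n + 1);
	*off += n;
	return 0;
}

/* Append an unsigned number in decimal. */
static int
demo_append_uint(char *buf, size_t len, size_t *off, unsigned u)
{
	char tmp[32];

	sprintf(tmp, "%u", u);
	return demo_append(buf, len, off, tmp);
}

/* Format e.g. "end: 2 = [0], 4 = [0, 1], 3;". Returns 0, or -1 on truncation. */
static int
demo_format_end(char *buf, size_t len, const struct demo_end *ends, size_t count)
{
	size_t off = 0, i, j;

	if (demo_append(buf, len, &off, "end: ")) return -1;
	for (i = 0; i < count; i++) {
		if (i > 0 && demo_append(buf, len, &off, ", ")) return -1;
		if (demo_append_uint(buf, len, &off, ends[i].state)) return -1;
		if (ends[i].n == 0) continue; /* ids are optional (assumed syntax) */
		if (demo_append(buf, len, &off, " = [")) return -1;
		for (j = 0; j < ends[i].n; j++) {
			if (j > 0 && demo_append(buf, len, &off, ", ")) return -1;
			if (demo_append_uint(buf, len, &off, ends[i].ids[j])) return -1;
		}
		if (demo_append(buf, len, &off, "]")) return -1;
	}
	return demo_append(buf, len, &off, ";");
}
```

Because the ids only ever appear next to a state listed under `end:`, there is no way in this notation to write an id against a non-accepting state, which is the property the commit message is after.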
Two things here. Firstly, I've reworked all this stuff such that we handle filenames, and then any remaining arguments are used to match with fsm_exec. This is what I'd originally intended, but I think it got lost at some point, probably when introducing operators with arity 2. Secondly, matching text with fsm_exec now runs regardless of the operation, not just on OP_IDENTITY. I see no reason to limit that.
The file format is complex enough that I want to cover these explicitly now, rather than relying on coverage through other tests.
This should've been done when updating to fsm_state_t.
This is never actually reached, because fsm_collate() doesn't handle end IDs.
I don't see any reason callers of fsm_endid_get() would want to fetch into a fixed-size buffer of fewer elements than the number of endids an end state has. In all cases the caller can call fsm_endid_count() to find out.
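The count-then-get pattern this implies can be sketched as follows. This is an illustration only: `demo_endid_count()` and `demo_endid_get()` are hypothetical stand-ins with made-up signatures, not the real fsm_endid_count() and fsm_endid_get(), and the "returns 1 when the count is enough" contract is the one described in these commit messages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not libfsm's real types or signatures. */
typedef unsigned demo_end_id;

struct demo_state {
	const demo_end_id *ids;
	size_t n;
};

/* How many end ids does this end state carry? */
static size_t
demo_endid_count(const struct demo_state *s)
{
	return s->n;
}

/* Fill buf with every id. Returns 1 when count is enough, 0 otherwise. */
static int
demo_endid_get(const struct demo_state *s, size_t count, demo_end_id *buf)
{
	if (count < s->n) {
		return 0; /* no partial fetch into a too-small buffer */
	}
	if (s->n > 0) {
		memcpy(buf, s->ids, s->n * sizeof *buf);
	}
	return 1;
}

/*
 * Caller pattern: ask for the count, allocate exactly that many,
 * then get. Returns NULL if there are no ids or allocation fails.
 */
static demo_end_id *
demo_fetch_all(const struct demo_state *s, size_t *n)
{
	demo_end_id *buf;

	*n = demo_endid_count(s);
	if (*n == 0) {
		return NULL;
	}
	buf = malloc(*n * sizeof *buf);
	if (buf == NULL) {
		return NULL;
	}
	if (!demo_endid_get(s, *n, buf)) {
		free(buf);
		return NULL;
	}
	return buf;
}
```

With this shape the caller never needs a "how many were written" out-parameter: on success it is always the count that was passed in.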
This keeps the generated code C89-compliant.
If these were necessary for caller-supplied generated code from the endleaf callbacks, the actual way I'm intending a caller to pass information there is to output with `opt.fragment` set. Then there's no need to go via `void *` for any caller state; that can all be exposed as appropriately typed variables in scope for the generated code (or wrapped in a similar function as the non-fragment code, and exposed as appropriately-typed arguments to that function). The existing `void *opaque` here was nothing to do with this; it was actually for the `.getc()` callback. I've renamed it to keep it from getting mixed up.
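As a rough illustration of the two styles being contrasted, the sketch below shows caller state reaching end-state code through a `void *` versus through a typed argument of a wrapping function. All names are hypothetical, and the body of `demo_match_typed()` only stands in for a generated fragment; it is not output from the code generator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical caller state. */
struct demo_stats {
	unsigned matches;
};

/* Style 1: caller state passed untyped and cast back by convention. */
static int
demo_on_end_opaque(unsigned id, void *opaque)
{
	struct demo_stats *st = opaque; /* unchecked conversion */

	st->matches++;
	return (int) id;
}

/*
 * Style 2: the caller writes the wrapping function, so its state is an
 * ordinary typed argument that the pasted-in fragment refers to by name.
 */
static int
demo_match_typed(const char *s, struct demo_stats *st)
{
	/* begin stand-in for a generated fragment matching exactly "a" */
	if (s[0] == 'a' && s[1] == '\0') {
		st->matches++; /* caller-supplied end-state code, typed variable in scope */
		return 0;
	}
	return -1;
	/* end stand-in */
}
```

In both styles the state is supplied per call rather than held globally, so neither relies on shared mutable state between threads; the difference is only whether the compiler can check the type.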
I'm not removing this for any profound reason, just that fewer kinds of structs help me better see what's happening.
Now this is always equivalent to the count passed in, when the return status is 1. And the return status is always 1 when the count is enough. In all situations we know the count is enough.
silentbicycle
requested changes
Jun 17, 2024
Do you have a better suggestion for how to pass a handle to the caller besides `void *opaque` that isn't significantly more complicated in a multi-threaded context?
silentbicycle
approved these changes
Jun 18, 2024
I did quite a few things here, all focused around the endleaf callbacks,
and by extension the endid mechanism. I fixed a few small bugs, and got distracted trying to tidy up a bit.
Anyway here's what we get:
-G length inconsistent, moreover returns error code. #478 and Unclear how to do multiple transformations in one invocation #477 separately
The .fsm syntax for endids looks like this:
and renders out like: