Add fsm_intersect_charset(), fsm -U #470

Merged
merged 1 commit into from
May 29, 2024

Conversation

katef
Owner

@katef katef commented May 25, 2024

Add fsm_intersect_charset(), a convenience to intersect an fsm against a given character set. I pulled this out of the forthcoming rx(1) tool.

Unlike fsm_intersect(), this doesn't automatically determinise its operands. I'm not convinced fsm_intersect() should do that, either. In most cases we care enough about performance to do this (or rather to avoid doing it unnecessarily) in the caller.
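To make the idea concrete, here is a minimal sketch of what intersecting a machine against a character set amounts to, assuming the charset is read as "any string made only of these characters". It uses a hypothetical toy DFA (`struct toy_dfa` and the `toy_*` functions are invented for illustration and are not libfsm's types or API). On an already-deterministic machine, the intersection reduces to dropping every edge whose label is outside the set, which is also why no determinisation is needed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSTATES 4
#define NO_EDGE (-1)

/* Hypothetical toy DFA: a dense transition table, not libfsm's struct fsm. */
struct toy_dfa {
	int  edge[NSTATES][256]; /* next state, or NO_EDGE */
	bool end[NSTATES];
	int  start;
};

/* Build a toy DFA accepting exactly "ab" and "cd". */
static void
toy_init_ab_cd(struct toy_dfa *d)
{
	int s, c;

	assert(d != NULL);

	for (s = 0; s < NSTATES; s++) {
		d->end[s] = false;
		for (c = 0; c < 256; c++) {
			d->edge[s][c] = NO_EDGE;
		}
	}

	d->start = 0;
	d->edge[0]['a'] = 1;
	d->edge[0]['c'] = 2;
	d->edge[1]['b'] = 3;
	d->edge[2]['d'] = 3;
	d->end[3] = true;
}

/* Drop every edge whose label is not one of the n bytes in charset. */
static void
toy_restrict_charset(struct toy_dfa *d, size_t n, const char *charset)
{
	bool keep[256] = { false };
	size_t i;
	int s, c;

	assert(d != NULL);
	assert(charset != NULL || n == 0);

	for (i = 0; i < n; i++) {
		keep[(unsigned char) charset[i]] = true;
	}

	for (s = 0; s < NSTATES; s++) {
		for (c = 0; c < 256; c++) {
			if (!keep[c]) {
				d->edge[s][c] = NO_EDGE;
			}
		}
	}
}

/* Run the DFA over a NUL-terminated string. */
static bool
toy_matches(const struct toy_dfa *d, const char *s)
{
	int state;

	assert(d != NULL && s != NULL);

	state = d->start;
	for (; *s != '\0'; s++) {
		state = d->edge[state][(unsigned char) *s];
		if (state == NO_EDGE) {
			return false;
		}
	}

	return d->end[state];
}
```

Calling `toy_restrict_charset(&d, 2, "ab")` on the machine built by `toy_init_ab_cd()` leaves "ab" accepted and rejects "cd". The real function may well be implemented differently; this only illustrates the language being computed under the stated assumption.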

I've exposed this as fsm -U <charset>:

[image]

Collaborator

@silentbicycle silentbicycle left a comment

Looks good to me.

Is it worth noting that there isn't any interpretation done to the character set? E.g. a-z is treated as ['a', '-', 'z'] rather than [a-z]? Or adding special handling if the character set starts with [ and ends with ]? That seems like it would be a common use case.
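The literal reading described above can be sketched as follows. This is a standalone illustration, not libfsm code: `literal_charset()` and `count_members()` are hypothetical helpers that treat every byte of the argument as a set member, with no special meaning given to `-`, `[` or `]`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Mark each byte of charset as a member of the set.
 * No range or bracket syntax is interpreted: "a-z" yields
 * the three members 'a', '-' and 'z'.
 */
static void
literal_charset(const char *charset, bool member[256])
{
	size_t i, n;

	assert(charset != NULL);
	assert(member != NULL);

	memset(member, 0, 256 * sizeof member[0]);

	n = strlen(charset);
	for (i = 0; i < n; i++) {
		member[(unsigned char) charset[i]] = true;
	}
}

/* Count how many distinct bytes are in the set. */
static size_t
count_members(const bool member[256])
{
	size_t count = 0;
	int c;

	assert(member != NULL);

	for (c = 0; c < 256; c++) {
		if (member[c]) {
			count++;
		}
	}

	return count;
}
```

With the argument "a-z" the set has three members and 'b' is not one of them, so a user expecting the range a through z would need to spell out all 26 letters.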

@katef
Owner Author

katef commented May 29, 2024

I removed group parsing as an entry point into a regex grammar in 2ede0ed, which was exposed as a similar cli argument for re(1). I think I'd either want to reintroduce that (somehow...), or invent a tr(1)-style syntax (perhaps as its own dialect?) just for this. But I wasn't ready to do that right now.

@katef katef merged commit 073305b into main May 29, 2024
322 checks passed
@katef katef deleted the kate/intersect-charset branch May 29, 2024 00:19
2 participants