Make Bazel work again. #2

Open
wants to merge 116 commits into
base: master

Conversation

helly25
Owner

@helly25 helly25 commented Oct 15, 2024

Make Bazel work again. Add a minimalistic test that verifies the binary can at least report its own version.
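A version smoke test of this kind could look roughly like the sketch below. It is not the test added in this PR: the helper names `parse_version` and `check_binary_reports_version` are hypothetical, and the sketch assumes that running the binary with `--version` prints a line of the form `re2c <version>` with a dotted numeric version.

```python
import re
import subprocess


def parse_version(output: str):
    """Return the dotted version from `re2c <version>` output, or None.

    Assumption: the version is purely numeric with dots (e.g. 4.0.2).
    """
    match = re.fullmatch(r"re2c (\d+(?:\.\d+)*)\s*", output)
    return match.group(1) if match else None


def check_binary_reports_version(binary_path: str) -> str:
    """Run the binary with --version and fail if no version is reported."""
    result = subprocess.run(
        [binary_path, "--version"],
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit code raises CalledProcessError
    )
    version = parse_version(result.stdout)
    if version is None:
        raise AssertionError(f"unexpected --version output: {result.stdout!r}")
    return version
```

Under Bazel, a script like this would typically be wrapped in a test rule that receives the path to the built binary as an argument; that wiring is left out here.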

PolarGoose and others added 30 commits October 24, 2024 20:53
This is to avoid duplication of the contents on gh-pages-gen branch.
Automake: generalize generation rules for docs (make split_man.py infer
language from its second argument, as POSIX make does not seem to have a way
to pass the language name to the rule). Explicitly list all example files (provide
a `find` command to generate the list in a comment).

CMake: Add forgotten install files for re2java and re2js. Use GLOB_RECURSIVE
with CONFIGURE_DEPENDS to add examples to the list of source files.

Also, update description of submatch extraction and change captures example:
use the one without POSIX-style array `yypmatch`, as not all languages support
it (Haskell doesn't, as it has no mutable arrays in the standard library).
Perform substitutions throughout the whole manual.
Split help source file for all language backends.
Also, use the word "blocks" consistently instead of "directives".
These parts are used inconsistently in configuration names, e.g. we have
`re2c:tags:expression` which has to be scoped under `tags:` but it's also a
definition, so it should be scoped under `define:` (and other examples like
this). In fact, most of the string configurations are defines of some sort.

Also, change `conf:` prefix in syntax files to `re2c:`, as in
configurations. From now on, "c" in "re2c" stands not for C/C++ language,
but rather "code" or "compiler", so it's good for all languages.

Also, use prefix `.` (dot) for conditionals in syntax files to make it
easy to distinguish them from variables in code templates.

Also, fix warnings and error messages to use backtick quotes consistently.

Also, fix some comments in the code that still used the word "directive" for
special-purpose blocks.
skvadrik and others added 30 commits December 27, 2024 09:54
…ptures.

We now have many options for captures, so we cannot assume that the
--posix-captures option is the one causing conflicts.

This partially addresses issue #518.
Currently re2c fails with error, as the special zero condition `<>` has no
end-of-input rule `$`.
With captvars (either leftmost or POSIX ones) we need to generate names for the
variables that correspond to capturing parentheses. The difference between
leftmost and POSIX cases is that the POSIX disambiguation algorithm is more complex
and it requires additional implicit tags inserted in the regular expression.
These so-called "fictive" tags don't have valid submatch group indices, so
trying to use them caused integer overflow and a very long loop.

This is a fix for issue #519 "re2c hangs if posix-captvars is used".
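The idea behind this fix can be illustrated with a small sketch: when naming capture variables, tags without a valid group index are skipped instead of being used as an index. Everything here is hypothetical and not taken from the re2c sources: the `Tag` class, the `NO_GROUP` sentinel value, and the `cap<N>` naming scheme are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical sentinel meaning "this tag belongs to no submatch group".
# Using such a value as a real index is what would lead to a huge loop.
NO_GROUP = 2**32 - 1


@dataclass
class Tag:
    group: int  # submatch group index, or NO_GROUP for a fictive tag


def capture_var_names(tags):
    """Generate one variable name per tag that has a valid group index."""
    names = []
    for tag in tags:
        if tag.group == NO_GROUP:
            continue  # fictive tag: it has no capturing group, so no variable
        names.append(f"cap{tag.group}")
    return names
```

With tags for groups 0 and 1 and one fictive tag in between, only `cap0` and `cap1` are generated.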
Previously re2c failed with an error in the case when end-of-input rule `$` was
enabled and the special zero condition `<>` was used. Now it no longer fails and
generates code without end-of-input check for the zero condition (which is
handled in a special way by the code, bypassing some of the transformations that
rely on the presence of end-of-input rule).
- remove obsolete `if` branch
- make assertions more precise
- factor out codegen function for semantic actions
- rename variables
Also, rename "setup" to "exit action" to match the added "entry action"
(and also because it makes more sense).

This addresses bug #521 "Add special entry/exit rules".

Drop some not very useful debug checks.
In goto/label mode, if there is a loop via initial state, YYSKIP must be
bypassed when entering the DFA (visiting initial state for the first time).
Other modes generate YYSKIP on transitions, so they were not affected.

The bug was found by skeleton tests.

This addresses bug #521 "Add special entry/exit rules".
When all rules were inherited from other blocks, re2c failed to find correct
rule location and passed an invalid location, which caused a SIGSEGV in the
error reporting function.

This fixes bug #522 "Segfault when !use is present in a block with conditions".
Previously we only checked that tags are pairwise equal between two mapped
transitions, not between all transitions going to the bitmap state. This was a
regression caused by optimization 98f2a41.

This fixes #523 "Possible regression with bit-vector optimizations between 3.x
and 4.x".
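The stricter check described above can be sketched as follows. This is an illustration only, not the re2c implementation: transitions are modelled as hypothetical `(symbol, tags)` pairs, where `tags` is a frozenset of tag identifiers.

```python
def all_transitions_share_tags(transitions):
    """Return True if every transition into a state carries the same tags.

    `transitions` is a list of (symbol, tags) pairs (a hypothetical model).
    Comparing each transition against the first one covers all of them,
    rather than only one pair of mapped transitions.
    """
    if not transitions:
        return True
    _, first_tags = transitions[0]
    return all(tags == first_tags for _, tags in transitions)
```

Under this model, a state would only be a candidate for the bitmap optimization when the function returns True for all transitions going into it.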
… as used.

This is only needed in loop/switch and rec/func modes, as goto/label mode uses
special YYFILL labels, not the state labels. And most of the states were already
marked as used due to other transitions going into them. So the bug was not
revealed until the recent addition of the entry rule.

This addresses bug #521 "Add special entry/exit rules".
It's unclear where the previous condition came from; the intention seems to be
to exclude final states that have no outgoing transitions on symbols. With the
end-of-input rule `$` even if all characters fall into the same range as in
`[^]`, a separate range is split for the sentinel symbol, so the previous
condition could not capture any match states.
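The range splitting mentioned above can be shown with a short sketch. The function name `split_at_sentinel` is hypothetical, and the example assumes an 8-bit alphabet with sentinel symbol 0; neither is taken from the re2c sources.

```python
def split_at_sentinel(lo, hi, sentinel):
    """Split the inclusive range [lo, hi] so the sentinel is its own range."""
    if not lo <= sentinel <= hi:
        return [(lo, hi)]  # sentinel is outside the range, nothing to split
    parts = []
    if lo < sentinel:
        parts.append((lo, sentinel - 1))
    parts.append((sentinel, sentinel))
    if sentinel < hi:
        parts.append((sentinel + 1, hi))
    return parts
```

For a rule like `[^]` over symbols 0 to 255 with sentinel 0, this yields the ranges `(0, 0)` and `(1, 255)`, so the state has more than one outgoing range even though the rule matches every character.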
…states.

The order of these operations doesn't matter, provided that YYFILL is correct.
…dels.

Previously initial state was marked using special initial action, which was set
only in rec/func mode. Now we explicitly store a pointer to initial state and
set it for all code models.
Remove incorrect assignment of default state to default rule state; it was
overwritten later, so it had no effect at all. Rule state should be kept
separate, as default state may be used for different things (e.g. it may be
used as the accept state that has `yyaccept` dispatch).

No changes in the generated code.
Rename `initial_state` to `start_state` and `initial_label` to
`custom_start_label` (so that it's clear that it is a special case).
Update comments.
This way it is possible to add new named special actions without having
to worry about exhausting all special symbols. Also, if we add mid-rule
actions in the future, this syntax can be used for them as well.
This addresses bug #526 "[doc]: Add other install methods on official webpage".
!pre_rule action is emitted immediately *before* any semantic action in the
current block / condition (it can be useful to have some common initialization
code). !post_rule action is the same, except that it is emitted *after* every
semantic action (it can be useful e.g. to emit unreachability checks).

This addresses bug #521 "Add special entry/exit rules".
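The emission order described above can be sketched as a tiny code-generation helper. The function `emit_rule_action` and the action strings in the usage example are hypothetical; this only illustrates the ordering, not how re2c generates code.

```python
def emit_rule_action(action, pre_rule="", post_rule=""):
    """Wrap a semantic action with optional pre-rule and post-rule code.

    Empty pre/post actions are skipped, so a block that defines neither
    emits just the semantic action itself.
    """
    parts = [part for part in (pre_rule, action, post_rule) if part]
    return "\n".join(parts)
```

For example, `emit_rule_action("return 1;", pre_rule="init();", post_rule="unreachable();")` places `init();` before and `unreachable();` after the action, and the same wrapping would apply to every semantic action in the block or condition.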
* tested with Bazel 8.0.1 and 7.0.0 on linux and macos
6 participants