Unexpected behavior parsing of tags inside substitutions #17

Open
Jacob-Bishop opened this issue Aug 14, 2022 · 0 comments

Minimum reproducible example

import rhasspynlu
rhasspynlu.parse_ini("""[foo]\nmidnight:(12{hour} am)\n""")

Actual result

defaultdict(<class 'list'>,
            {'foo': [Sentence(text='midnight:(12{hour} am)',
                              tag=None,
                              substitution=None,
                              converters=[],
                              items=[Word(text='midnight',
                                          tag=Tag(substitution=None,
                                                  converters=[],
                                                  tag_text='hour'),
                                          substitution='12',
                                          converters=[]),
                                     Word(text='am)',
                                          tag=None,
                                          substitution=None,
                                          converters=[])],
                              type=<SequenceType.GROUP: 'group'>)]})

Note in particular that am) is parsed as a word, trailing parenthesis included, and that both the hour tag and the 12 substitution end up attached to the word midnight.
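To make the symptom easy to check for, here is a minimal sketch of a helper that walks a parsed tree and reports any leaf whose text still contains a parenthesis. It deliberately does not import rhasspynlu: FakeWord and FakeGroup are hypothetical stand-ins that mirror only the text and items fields visible in the output above, and the helper assumes leaf nodes have no items attribute.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class FakeWord:
    """Hypothetical stand-in for a Word node (not the rhasspynlu class)."""
    text: str


@dataclass
class FakeGroup:
    """Hypothetical stand-in for a Sentence/group node with child items."""
    text: str
    items: List[Any] = field(default_factory=list)


def find_stray_parens(node: Any) -> List[str]:
    """Return the text of every leaf node whose text contains '(' or ')'."""
    children = getattr(node, "items", None)
    if children is None:
        # Leaf node: only its own text matters.
        text = getattr(node, "text", "") or ""
        return [text] if any(c in text for c in "()") else []
    found: List[str] = []
    for child in children:
        found.extend(find_stray_parens(child))
    return found


# Mirrors the shape of the "Actual result" tree, reduced to text and items.
sentence = FakeGroup(
    text="midnight:(12{hour} am)",
    items=[FakeWord("midnight"), FakeWord("am)")],
)
print(find_stray_parens(sentence))
```

Because the helper only relies on duck typing, it could in principle be pointed at the real parse result, but that is untested here and depends on the actual node classes exposing the same attributes.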

Expected result
I'm not sure exactly what the tree should look like, but I did not expect the closing parenthesis of the substitution group to be parsed as part of a word. I would broadly expect the result to be functionally similar to the following, except perhaps with midnight as the text value for both "12" and "am" rather than as a separate word:

rhasspynlu.parse_ini("""[foo]\nmidnight: :12{hour} :am\n""")