tagging words for question words supports only tagging one of them #70

Open
r0ller opened this issue Dec 10, 2023 · 1 comment

r0ller commented Dec 10, 2023

When a word in an indicative-mood sentence is tagged so that later questions can refer to it, like:

sparser->add_feature_to_leaf(ENG_N_Sg,"N",std::string("qw_what"));

where the tag was set up like:

insert into FUNCTOR_TAGS values('FILEENGN', '1', 'qw_what', '1', 'qword', 'what');

a problem arises when two words, or even whole phrases, can answer the same question word, such as 'what'. E.g. (an imperfect example):

Peter is a gentleman and a pirate at the same time.

Here 'gentleman' and 'pirate' would both be tagged with the question word 'what', which is not really a problem, as both can be an answer to the question:

What is Peter?

However, there are certainly cases where such tagged words collide or interfere with each other. One solution could be to allow tags to carry a technical suffix, e.g. the node id number, so that tagging would look like:

node_id=...//get node_id (assumed here to be a numeric id)
sparser->add_feature_to_leaf(ENG_N_Sg,"N",std::string("qw_what#")+std::to_string(node_id));

This requires two changes: the logic that checks tags against the allowed ones in FUNCTOR_TAGS must accept such suffixes, and the query logic must be able to look for suffixed tags.
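A minimal sketch of what suffix-aware tag checking could look like. The names `SplitTag`, `split_tag` and `is_allowed_tag` are hypothetical and not part of the project; the `allowed` set merely stands in for a lookup in FUNCTOR_TAGS, and the digits-only suffix rule is an assumption based on using the node id as suffix.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper type: the two halves of a tag such as "qw_what#12".
struct SplitTag {
    std::string base;    // part that is checked against the allowed tags
    std::string suffix;  // technical suffix after '#', empty if there is none
};

// Splits a tag at the first '#'. A tag without '#' has an empty suffix.
SplitTag split_tag(const std::string& tag) {
    const auto pos = tag.find('#');
    if (pos == std::string::npos) return {tag, ""};
    return {tag.substr(0, pos), tag.substr(pos + 1)};
}

// Hypothetical check: the base must be an allowed tag and the suffix,
// if a '#' is present, must be a non-empty run of digits (a node id).
bool is_allowed_tag(const std::string& tag, const std::set<std::string>& allowed) {
    const SplitTag parts = split_tag(tag);
    if (allowed.count(parts.base) == 0) return false;
    const bool has_separator = tag.find('#') != std::string::npos;
    if (has_separator && parts.suffix.empty()) return false;
    for (unsigned char c : parts.suffix) {
        if (!std::isdigit(c)) return false;
    }
    return true;
}
```

With this shape, the query side could match on `split_tag(tag).base` when any 'what' answer is acceptable, and on the full tag when a specific node is meant.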

Taking it one level higher: nodes (i.e. subtrees) could be made taggable as well, so that a question can be answered with a part of a previous sentence.


r0ller commented Mar 18, 2024

Though not yet committed, tagging now works for more than one leaf and also for nodes (subtrees); the tags are pushed down to the leaves along with the leaf and parent node ids. The parent node ids are needed to look up the corresponding node in the syntactic parse tree and to rebuild the subtree on the client side as JSON objects, so that all syntactic details are available. The next question is how a client can interpret the result. A feasible approach could be that once hi_query() delivers the hits, the client takes the best-ranked hit and, using the tag "qword", gets the question word from the result returned by hi(). With that question word, the answer can be looked up in the result returned by hi_query(). However, there are cases where this is too simple and leads to ambiguous answers, e.g.:

-I bought her ice cream to stop her crying.
-What did you buy?

Since both "ice cream" and "crying" can be tagged as answers to the question word "what", answering "What did you buy?" by looking only for the qword tag "what" is not sufficient: it yields two results, "ice cream" and "crying". So the client needs to search not only by the qword-tagged lexemes but also by the lexemes used in the question itself ("did": DO + past tense + sg3rd, "you": pron + sg2nd, "buy": BUY), which are returned in the result of the hi() call that analyses the question.
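A rough client-side sketch of that disambiguation. Everything here is hypothetical: `TaggedAnswer` and `find_answers` are illustrative names, not the shape of the data that hi() or hi_query() actually return, and it is an assumption that each tagged answer can be associated with the lexeme of the verb it belongs to (BUY for "ice cream", STOP for "crying").

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical shape of one tagged answer candidate from the stored sentence.
struct TaggedAnswer {
    std::string qword;        // value of the "qword" tag, e.g. "what"
    std::string text;         // the tagged word or phrase
    std::string head_lexeme;  // lexeme of the verb the phrase belongs to (assumed available)
};

// Returns the texts of all candidates tagged with the given question word.
// If question_lexemes is non-empty, a candidate is kept only when its head
// lexeme also occurs among the lexemes of the question.
std::vector<std::string> find_answers(const std::vector<TaggedAnswer>& candidates,
                                      const std::string& qword,
                                      const std::vector<std::string>& question_lexemes) {
    std::vector<std::string> answers;
    for (const auto& candidate : candidates) {
        if (candidate.qword != qword) continue;
        const bool lexeme_matches =
            question_lexemes.empty() ||
            std::find(question_lexemes.begin(), question_lexemes.end(),
                      candidate.head_lexeme) != question_lexemes.end();
        if (lexeme_matches) answers.push_back(candidate.text);
    }
    return answers;
}
```

Called with an empty lexeme list, this behaves like the qword-only lookup and keeps both "ice cream" and "crying"; passing the question's lexemes (DO, BUY) narrows the result to "ice cream".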

r0ller added a commit that referenced this issue Mar 24, 2024
r0ller added a commit that referenced this issue Aug 15, 2024
implement #39, #70, added structured language and run time mixed language support