Implement nmod:desc for honorific pre-nominal titles: Mr., Dr., etc. #561

Open
1 task done
nschneid opened this issue Dec 23, 2024 · 13 comments
Labels
mischievous nominal See https://arxiv.org/abs/2108.12928

Comments

@nschneid
Contributor

nschneid commented Dec 23, 2024

Manually implementing the few that are not flat (many instances of "St." for "Saint" are compound). Will make a rule for the flat ones.

nschneid added a commit that referenced this issue Dec 23, 2024
… titles; amod for Postmaster/Registrar General; structure of Lt. Governor/General (#561)
@nschneid nschneid added the mischievous nominal See https://arxiv.org/abs/2108.12928 label Dec 23, 2024
@nschneid
Contributor Author

nschneid commented Dec 23, 2024

@nschneid
Contributor Author

nschneid commented Dec 24, 2024

I reviewed flat-initial PROPNs in GUM and EWT and filtered out the honorifics. The lexical lists in the query can be used for a conversion rule.

In the above review, found a few GUM errors:

  • PROPN should be ADJ: Emeritus (Professor), Golden (Gate), Chief (Justice)
  • wrong structure (I would go with compound): Comic Corps, Rationalist Atheists, Senate Democrats, State Department, NASA Administrator, US President, Vice President

N.B. "Dean" and "Earl" are in principle ambiguous between titles and first names, but in the data they are only the latter.
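A minimal sketch of what such a conversion rule could look like on basic dependencies, assuming tokens are plain dicts with `id`, `form`, `head` and `deprel`. This is not the DepEdit script referenced in this thread; the `HONORIFICS` set and the function name `flat_to_nmod_desc` are hypothetical, and the real lexical lists are the ones in the query.

```python
# Hypothetical honorific list for illustration only; the actual lexical lists
# come from the query mentioned above and are not reproduced here.
HONORIFICS = {"Mr.", "Mrs.", "Ms.", "Dr."}


def flat_to_nmod_desc(tokens):
    """Invert headedness of honorific-initial flat names, in place.

    tokens: list of dicts with keys "id", "form", "head", "deprel"
    (basic dependencies only; enhanced dependencies are not handled).
    """
    for title in tokens:
        if title["form"] not in HONORIFICS:
            continue
        # flat dependents of the title, in surface order
        flat_kids = [t for t in tokens
                     if t["head"] == title["id"] and t["deprel"] == "flat"]
        if not flat_kids:
            continue
        name = flat_kids[0]
        # The first name token takes over the title's attachment.
        name["head"], name["deprel"] = title["head"], title["deprel"]
        # Every other dependent of the title is moved under the name,
        # keeping its relation (e.g. later name parts stay flat).
        for t in tokens:
            if t["head"] == title["id"] and t is not name:
                t["head"] = name["id"]
        # The title becomes an nmod:desc dependent of the name.
        title["head"], title["deprel"] = name["id"], "nmod:desc"
    return tokens
```

Ambiguous forms such as "Dean" or "Earl", and the "St." cases that are compound rather than flat, would still need a manual pass under this sketch.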

nschneid added a commit that referenced this issue Dec 30, 2024
…cript flat2nmoddesc.ini (a handful requiring manual correction; see comment in script)
nschneid added a commit that referenced this issue Dec 30, 2024
@nschneid
Contributor Author

Mostly done with EWT, but a few edeps from relative clauses to relativizer antecedents got deleted; need to investigate. (I think 2 new rules after inverting headedness will solve this: with enhanced relation constraints #4~#1;#2.*#4 and #4~#1;#4.*#2, mark #4:edep as OLD and add #4~#2. Here #1 is the honorific, now the head of #2, which begins the name part.)
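One possible reading of the fix above, sketched in Python rather than DepEdit syntax: any enhanced edge that still points at the honorific from outside the name is re-pointed at the first name token. The triple representation and the name `repoint_edeps` are assumptions for illustration, not part of the actual script.

```python
def repoint_edeps(edeps, title_id, name_id):
    """Move incoming enhanced edges from the title to the name token.

    edeps: set of (head_id, dep_id, label) triples (hypothetical encoding).
    The name -> title edge (nmod:desc) is left untouched.
    """
    updated = set()
    for head, dep, label in edeps:
        if dep == title_id and head != name_id:
            # Old edge into the honorific: replace with an edge into the name.
            updated.add((head, name_id, label))
        else:
            updated.add((head, dep, label))
    return updated
```

For example, with the honorific as token 1, the name as token 2 and a relative-clause predicate as token 4, an edge `(4, 1, "nsubj")` would become `(4, 2, "nsubj")`.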

@amir-zeldes
Contributor

Could you share your whole depedit script as a starting point for GUM too? We won't need the edeps part but the basic dependencies would benefit from it!

@AngledLuffa
Contributor

Oh, if such a script already exists, we could run that on ParTUT as well. Could pass it along to @LarsAhrenberg for LinES

is updating ParTUT going to be a community effort? seems like the author is still out there occasionally doing things, but only very occasionally

@nschneid
Contributor Author

nschneid commented Jan 4, 2025

@nschneid
Contributor Author

nschneid commented Jan 4, 2025

> is updating ParTUT going to be a community effort?

I can answer annotation questions about other treebanks but in terms of proactive updates, I have my hands pretty well full maintaining EWT.

@amir-zeldes
Contributor

Yeah, same here - I have my hands full at the moment I'm afraid!

@AngledLuffa
Contributor

phrases such as The playwright and critic George Bernard Shaw do not get updated with nmod:desc?

and what is the correct arrangement for Drs. Foo and Bar? nmod:desc(Foo, Drs.) and maybe an enhanced dependency nmod:desc(Bar, Drs.) if the treebank supports it, which is not the case for ParTUT anyway?

@nschneid
Contributor Author

> phrases such as The playwright and critic George Bernard Shaw do not get updated with nmod:desc?

No, the definite article indicates that there are 2 full nominal phrases (related by appos)

> and what is the correct arrangement for Drs. Foo and Bar? nmod:desc(Foo, Drs.)

Yes

> and maybe an enhanced dependency nmod:desc(Bar, Drs.) if the treebank supports it, which is not the case for ParTUT anyway?

Not sure about the enhanced dependency, especially since "Drs." is explicitly plural, and thus applies to the full coordination
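For concreteness, the basic-dependency analysis of "Drs. Foo and Bar" agreed on above can be written out as rows. Only nmod:desc(Foo, Drs.) comes from this thread; the cc and conj edges follow standard UD coordination, and treating Foo as root is an assumption for an isolated fragment.

```python
# (id, form, head, deprel) for the fragment "Drs. Foo and Bar"
DRS_FOO_AND_BAR = [
    (1, "Drs.", 2, "nmod:desc"),  # title attaches to the first conjunct only
    (2, "Foo", 0, "root"),        # assumed root of the isolated fragment
    (3, "and", 4, "cc"),
    (4, "Bar", 2, "conj"),
]


def dependents(rows, head_id):
    """Return the forms of all tokens whose basic head is head_id."""
    return [form for _, form, head, _ in rows if head == head_id]
```

No enhanced nmod:desc(Bar, Drs.) edge is included, in line with the doubt expressed above about the plural title applying to the whole coordination.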

@amir-zeldes
Contributor

So, quick question - do we use nmod:desc for "St. Petersburg"?

@nschneid
Contributor Author

nschneid commented Feb 4, 2025

I'd say yes, because the contribution of the "St." originally is as a title (of a name that got turned into a place name)

@amir-zeldes
Contributor

👍 Doing the same with St. Louis
