Commit

Change prefix logic to check for following vowel
dchiller committed Apr 20, 2024
1 parent 8d6a1ec commit 3d47cff
Showing 3 changed files with 9 additions and 5 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -157,7 +157,7 @@ flowchart TD
A --> J
```

-*Note (1)*: Prefixes considered are "ab", "ob", "ad", "per", "sub", "in", and "con".
+*Note (1)*: Prefixes considered are "ab", "ob", "ad", "per", "sub", "in", "con", and "co". A prefix is split off as its own syllable only when it is followed by a vowel; otherwise, consonants in the prefix are placed by the same rules as in the rest of the word. For example, "perviam" is syllabified "per-vi-am": splitting "rv" across two syllables follows the general rule of consonant placement (the first consonant joins the preceding syllable and the second joins the following syllable). The word "periurem", however, is syllabified "per-iu-rem". Here, the general rule would attach the "r" to the following syllable, but because "per" is a prefix, the "r" stays in the first syllable.

*Note (2)*: Written "i"s and "y"s may be semivowels and written "u"s may be semi-vowels or consonants.
"I"s are semivowels:
3 changes: 2 additions & 1 deletion tests/word_syllabification_tests.csv
@@ -90,4 +90,5 @@ adincresco,ad-in-cre-sco,
 compressans,com-pres-sans,
 principem,prin-ci-pem,
 redemptor,re-demp-tor,
-imperator,im-pe-ra-tor
+imperator,im-pe-ra-tor
+coegerunt,co-e-ge-runt
9 changes: 6 additions & 3 deletions volpiano_display_utilities/latin_word_syllabification.py
@@ -57,7 +57,7 @@

# Prefix groups are groups of characters that serve as common prefixes. For details,
# see README.
-_PREFIX_GROUPS: set = {"ab", "ob", "ad", "per", "sub", "in", "con"}
+_PREFIX_GROUPS: set = {"ab", "ob", "ad", "per", "sub", "in", "con", "co"}

_VOWELS: set = {"a", "e", "i", "o", "u", "y"}
_VOWELS_AEOU: set = {"a", "e", "o", "u"}
@@ -96,7 +96,8 @@ def split_word_by_syl_bounds(word: str, syl_bounds: List[int]) -> List[str]:

 def _get_prefixes(word: str) -> str:
     """
-    Returns the prefix of a word, if it has one.
+    Returns the prefix of a word, if it has one that is followed by a vowel.
+    For details on prefixes, see README.
     word [str]: word to check for prefix
@@ -107,7 +108,9 @@ def _get_prefixes(word: str) -> str:
         # If the word is itself one of the prefixes (e.g. "in" can
         # be a word or a prefix), don't return a prefix
         if word.startswith(prefix) and (word != prefix):
-            return prefix
+            prefix_length = len(prefix)
+            if word[prefix_length] in _VOWELS:
+                return prefix
     return ""
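
The changed check can be tried in isolation with a minimal sketch like the one below. It is not the repository's code: `get_prefix_sketch` is a hypothetical name, the loop over the prefixes is an assumption (that part of `_get_prefixes` is collapsed in the diff), and iterating longest-first is a choice made here only to keep the result deterministic. The two sets are copied from the hunks above.

```python
from typing import Set

# Copied from the diff above (new version of _PREFIX_GROUPS).
_PREFIX_GROUPS: Set[str] = {"ab", "ob", "ad", "per", "sub", "in", "con", "co"}
_VOWELS: Set[str] = {"a", "e", "i", "o", "u", "y"}


def get_prefix_sketch(word: str) -> str:
    """Return the prefix of `word` if one is followed by a vowel, else ""."""
    # Assumption: the original loops over _PREFIX_GROUPS in some order;
    # sorting longest-first is this sketch's choice, not the repository's.
    for prefix in sorted(_PREFIX_GROUPS, key=len, reverse=True):
        # A word that is itself a prefix (e.g. "in") has no prefix.
        if word.startswith(prefix) and word != prefix:
            # Safe index: the word is strictly longer than the prefix.
            if word[len(prefix)] in _VOWELS:
                return prefix
    return ""


if __name__ == "__main__":
    for w in ("periurem", "perviam", "coegerunt", "compressans", "in"):
        print(w, repr(get_prefix_sketch(w)))
```

Under these assumptions, "periurem" and "coegerunt" yield a prefix because a vowel follows it, while "perviam" and "compressans" do not, so their consonants fall under the general placement rules.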

