diff --git a/data/gdrive/Agglutinative_Sumerian.docx b/data/gdrive/Agglutinative_Sumerian.docx index ba62c3c..f6c0412 100644 Binary files a/data/gdrive/Agglutinative_Sumerian.docx and b/data/gdrive/Agglutinative_Sumerian.docx differ diff --git a/data/gdrive/Composition_Derivation_Old_High_German.docx b/data/gdrive/Composition_Derivation_Old_High_German.docx index 376ca09..b33d051 100644 Binary files a/data/gdrive/Composition_Derivation_Old_High_German.docx and b/data/gdrive/Composition_Derivation_Old_High_German.docx differ diff --git a/data/gdrive/Italia_wordform_generation_Stefania_24062021.docx b/data/gdrive/Italia_wordform_generation_Stefania_24062021.docx index 353b252..67174ed 100644 Binary files a/data/gdrive/Italia_wordform_generation_Stefania_24062021.docx and b/data/gdrive/Italia_wordform_generation_Stefania_24062021.docx differ diff --git a/data/gdrive/Italian.docx b/data/gdrive/Italian.docx index 2641b50..2f49ec5 100644 Binary files a/data/gdrive/Italian.docx and b/data/gdrive/Italian.docx differ diff --git a/data/gdrive/Latin_Word_Formation/Latin_in_the_Word_Formation_Latin_Lexicon.docx b/data/gdrive/Latin_Word_Formation/Latin_in_the_Word_Formation_Latin_Lexicon.docx index 41ea558..6fe38c8 100644 Binary files a/data/gdrive/Latin_Word_Formation/Latin_in_the_Word_Formation_Latin_Lexicon.docx and b/data/gdrive/Latin_Word_Formation/Latin_in_the_Word_Formation_Latin_Lexicon.docx differ diff --git a/data/gdrive/Polysynthetic_Inuktitut.docx b/data/gdrive/Polysynthetic_Inuktitut.docx index e580254..1cb3122 100644 Binary files a/data/gdrive/Polysynthetic_Inuktitut.docx and b/data/gdrive/Polysynthetic_Inuktitut.docx differ diff --git a/data/gdrive/Unimorph.docx b/data/gdrive/Unimorph.docx index fcc5a6b..09a12ae 100644 Binary files a/data/gdrive/Unimorph.docx and b/data/gdrive/Unimorph.docx differ diff --git a/data/gdrive/Vocabulary_Tests_Evaluation.xlsx b/data/gdrive/Vocabulary_Tests_Evaluation.xlsx index e924dfa..d19911d 100644 Binary files 
a/data/gdrive/Vocabulary_Tests_Evaluation.xlsx and b/data/gdrive/Vocabulary_Tests_Evaluation.xlsx differ diff --git a/minutes/01_12_2021.docx b/minutes/01_12_2021.docx index 8e76859..2903c50 100644 Binary files a/minutes/01_12_2021.docx and b/minutes/01_12_2021.docx differ diff --git a/minutes/02_02_2021.docx b/minutes/02_02_2021.docx index 73ed2f6..846dfee 100644 Binary files a/minutes/02_02_2021.docx and b/minutes/02_02_2021.docx differ diff --git a/minutes/02_04_2019.docx b/minutes/02_04_2019.docx index e424658..d332041 100644 Binary files a/minutes/02_04_2019.docx and b/minutes/02_04_2019.docx differ diff --git a/minutes/02_07_2019.docx b/minutes/02_07_2019.docx index 9ac005d..ab2973b 100644 Binary files a/minutes/02_07_2019.docx and b/minutes/02_07_2019.docx differ diff --git a/minutes/03_03_2021.docx b/minutes/03_03_2021.docx index 0544d35..c88efa1 100644 Binary files a/minutes/03_03_2021.docx and b/minutes/03_03_2021.docx differ diff --git a/minutes/03_05_2023.docx b/minutes/03_05_2023.docx new file mode 100644 index 0000000..0586de8 Binary files /dev/null and b/minutes/03_05_2023.docx differ diff --git a/minutes/03_11_2021.docx b/minutes/03_11_2021.docx index ae1343e..05f5744 100644 Binary files a/minutes/03_11_2021.docx and b/minutes/03_11_2021.docx differ diff --git a/minutes/04_02_2020.docx b/minutes/04_02_2020.docx index 8181459..e6fcc15 100644 Binary files a/minutes/04_02_2020.docx and b/minutes/04_02_2020.docx differ diff --git a/minutes/04_05_2022.docx b/minutes/04_05_2022.docx index 9ee1a48..8b2381c 100644 Binary files a/minutes/04_05_2022.docx and b/minutes/04_05_2022.docx differ diff --git a/minutes/04_08_2021.docx b/minutes/04_08_2021.docx index c85ad0b..b7ab8be 100644 Binary files a/minutes/04_08_2021.docx and b/minutes/04_08_2021.docx differ diff --git a/minutes/04_10_2023.docx b/minutes/04_10_2023.docx new file mode 100644 index 0000000..5078eea Binary files /dev/null and b/minutes/04_10_2023.docx differ diff --git a/minutes/04_12_2018.docx 
b/minutes/04_12_2018.docx index 2a19f98..d218763 100644 Binary files a/minutes/04_12_2018.docx and b/minutes/04_12_2018.docx differ diff --git a/minutes/05_02_2019.docx b/minutes/05_02_2019.docx index 02c8dd9..7157348 100644 Binary files a/minutes/05_02_2019.docx and b/minutes/05_02_2019.docx differ diff --git a/minutes/05_03_2019.docx b/minutes/05_03_2019.docx index 8a8dc9c..28195f5 100644 Binary files a/minutes/05_03_2019.docx and b/minutes/05_03_2019.docx differ diff --git a/minutes/05_04_2023.docx b/minutes/05_04_2023.docx new file mode 100644 index 0000000..6413750 Binary files /dev/null and b/minutes/05_04_2023.docx differ diff --git a/minutes/05_10_2022.docx b/minutes/05_10_2022.docx index 2a7c4b4..a21673a 100644 Binary files a/minutes/05_10_2022.docx and b/minutes/05_10_2022.docx differ diff --git a/minutes/06_04_2022.docx b/minutes/06_04_2022.docx index cd9fbaa..999324d 100644 Binary files a/minutes/06_04_2022.docx and b/minutes/06_04_2022.docx differ diff --git a/minutes/06_10_2021.docx b/minutes/06_10_2021.docx index bf4bc37..997d1d9 100644 Binary files a/minutes/06_10_2021.docx and b/minutes/06_10_2021.docx differ diff --git a/minutes/07_01_2019.docx b/minutes/07_01_2019.docx index f42f532..d6c77bd 100644 Binary files a/minutes/07_01_2019.docx and b/minutes/07_01_2019.docx differ diff --git a/minutes/07_01_2020.docx b/minutes/07_01_2020.docx index edd429b..1702725 100644 Binary files a/minutes/07_01_2020.docx and b/minutes/07_01_2020.docx differ diff --git a/minutes/07_07_2021.docx b/minutes/07_07_2021.docx index 26d19b7..d4f4411 100644 Binary files a/minutes/07_07_2021.docx and b/minutes/07_07_2021.docx differ diff --git a/minutes/07_09_2022.docx b/minutes/07_09_2022.docx index 5f13a95..b5f301a 100644 Binary files a/minutes/07_09_2022.docx and b/minutes/07_09_2022.docx differ diff --git a/minutes/08_02_2023.docx b/minutes/08_02_2023.docx index d51473c..af83a59 100644 Binary files a/minutes/08_02_2023.docx and b/minutes/08_02_2023.docx differ diff --git 
a/minutes/08_03_2023.docx b/minutes/08_03_2023.docx index 6539590..3ec5952 100644 Binary files a/minutes/08_03_2023.docx and b/minutes/08_03_2023.docx differ diff --git a/minutes/08_09_2021.docx b/minutes/08_09_2021.docx index 68b222a..0dd90aa 100644 Binary files a/minutes/08_09_2021.docx and b/minutes/08_09_2021.docx differ diff --git a/minutes/09_02_2022.docx b/minutes/09_02_2022.docx index a67599a..cd5588e 100644 Binary files a/minutes/09_02_2022.docx and b/minutes/09_02_2022.docx differ diff --git a/minutes/09_03_2022.docx b/minutes/09_03_2022.docx index 34711b0..1729e78 100644 Binary files a/minutes/09_03_2022.docx and b/minutes/09_03_2022.docx differ diff --git a/minutes/09_04_2019.docx b/minutes/09_04_2019.docx index a3c1f96..fcd5803 100644 Binary files a/minutes/09_04_2019.docx and b/minutes/09_04_2019.docx differ diff --git a/minutes/09_06_2021.docx b/minutes/09_06_2021.docx index 25e4fc9..597c7ae 100644 Binary files a/minutes/09_06_2021.docx and b/minutes/09_06_2021.docx differ diff --git a/minutes/10_09_2019.docx b/minutes/10_09_2019.docx index 06a1d0b..fdadcaa 100644 Binary files a/minutes/10_09_2019.docx and b/minutes/10_09_2019.docx differ diff --git a/minutes/10_12_2019.docx b/minutes/10_12_2019.docx index 3609f48..6ae22f4 100644 Binary files a/minutes/10_12_2019.docx and b/minutes/10_12_2019.docx differ diff --git a/minutes/11_01_2023.docx b/minutes/11_01_2023.docx index 2c27065..b8d8b5e 100644 Binary files a/minutes/11_01_2023.docx and b/minutes/11_01_2023.docx differ diff --git a/minutes/11_06_2019.docx b/minutes/11_06_2019.docx index 122bb10..7c0aac2 100644 Binary files a/minutes/11_06_2019.docx and b/minutes/11_06_2019.docx differ diff --git a/minutes/11_12_2018.docx b/minutes/11_12_2018.docx index 0c33bd4..ed90abd 100644 Binary files a/minutes/11_12_2018.docx and b/minutes/11_12_2018.docx differ diff --git a/minutes/12_01_2022.docx b/minutes/12_01_2022.docx index 00d03fd..82c8f2b 100644 Binary files a/minutes/12_01_2022.docx and 
b/minutes/12_01_2022.docx differ diff --git a/minutes/12_02_2019.docx b/minutes/12_02_2019.docx index 3bbadd3..afd9a60 100644 Binary files a/minutes/12_02_2019.docx and b/minutes/12_02_2019.docx differ diff --git a/minutes/12_03_2019.docx b/minutes/12_03_2019.docx index 777ef10..ed86baa 100644 Binary files a/minutes/12_03_2019.docx and b/minutes/12_03_2019.docx differ diff --git a/minutes/12_05_2021.docx b/minutes/12_05_2021.docx index 767f39e..a6d976f 100644 Binary files a/minutes/12_05_2021.docx and b/minutes/12_05_2021.docx differ diff --git a/minutes/12_05_2021_Chairs_meeting.docx b/minutes/12_05_2021_Chairs_meeting.docx index 43361df..72d444c 100644 Binary files a/minutes/12_05_2021_Chairs_meeting.docx and b/minutes/12_05_2021_Chairs_meeting.docx differ diff --git a/minutes/12_07_2023.docx b/minutes/12_07_2023.docx new file mode 100644 index 0000000..51b6af3 Binary files /dev/null and b/minutes/12_07_2023.docx differ diff --git a/minutes/12_11_2019.docx b/minutes/12_11_2019.docx index 3f7b0c3..cce4240 100644 Binary files a/minutes/12_11_2019.docx and b/minutes/12_11_2019.docx differ diff --git a/minutes/13_07_2022.docx b/minutes/13_07_2022.docx index 59f6b6e..8b4105a 100644 Binary files a/minutes/13_07_2022.docx and b/minutes/13_07_2022.docx differ diff --git a/minutes/14_04_2021.docx b/minutes/14_04_2021.docx index f9ec66c..b209c8f 100644 Binary files a/minutes/14_04_2021.docx and b/minutes/14_04_2021.docx differ diff --git a/minutes/15_01_2019.docx b/minutes/15_01_2019.docx index ef615dd..d4ec20a 100644 Binary files a/minutes/15_01_2019.docx and b/minutes/15_01_2019.docx differ diff --git a/minutes/15_06_2022.docx b/minutes/15_06_2022.docx index 0946009..aceee2b 100644 Binary files a/minutes/15_06_2022.docx and b/minutes/15_06_2022.docx differ diff --git a/minutes/15_10_2019.docx b/minutes/15_10_2019.docx index e45c487..a236c68 100644 Binary files a/minutes/15_10_2019.docx and b/minutes/15_10_2019.docx differ diff --git a/minutes/15_11_2023.docx 
b/minutes/15_11_2023.docx new file mode 100644 index 0000000..2124d5b Binary files /dev/null and b/minutes/15_11_2023.docx differ diff --git a/minutes/16_04_2019.docx b/minutes/16_04_2019.docx index df2b51a..b54f2dd 100644 Binary files a/minutes/16_04_2019.docx and b/minutes/16_04_2019.docx differ diff --git a/minutes/16_11_2022.docx b/minutes/16_11_2022.docx index e8b5af1..2313f7b 100644 Binary files a/minutes/16_11_2022.docx and b/minutes/16_11_2022.docx differ diff --git a/minutes/17_02_2021.docx b/minutes/17_02_2021.docx index e637304..f8a46be 100644 Binary files a/minutes/17_02_2021.docx and b/minutes/17_02_2021.docx differ diff --git a/minutes/17_03_2021.docx b/minutes/17_03_2021.docx index 2be71f6..909a368 100644 Binary files a/minutes/17_03_2021.docx and b/minutes/17_03_2021.docx differ diff --git a/minutes/17_05_2023.docx b/minutes/17_05_2023.docx new file mode 100644 index 0000000..ff412da Binary files /dev/null and b/minutes/17_05_2023.docx differ diff --git a/minutes/17_11_2021.docx b/minutes/17_11_2021.docx index 034b637..9edb527 100644 Binary files a/minutes/17_11_2021.docx and b/minutes/17_11_2021.docx differ diff --git a/minutes/18_02_2020.docx b/minutes/18_02_2020.docx index 5c421d4..9161a46 100644 Binary files a/minutes/18_02_2020.docx and b/minutes/18_02_2020.docx differ diff --git a/minutes/18_05_2022.docx b/minutes/18_05_2022.docx index b912577..71cff86 100644 Binary files a/minutes/18_05_2022.docx and b/minutes/18_05_2022.docx differ diff --git a/minutes/18_10_2023.docx b/minutes/18_10_2023.docx new file mode 100644 index 0000000..01f19e7 Binary files /dev/null and b/minutes/18_10_2023.docx differ diff --git a/minutes/19_03_2019.docx b/minutes/19_03_2019.docx index f41a3f8..a7e784b 100644 Binary files a/minutes/19_03_2019.docx and b/minutes/19_03_2019.docx differ diff --git a/minutes/19_04_2023.docx b/minutes/19_04_2023.docx new file mode 100644 index 0000000..9510bef Binary files /dev/null and b/minutes/19_04_2023.docx differ diff --git 
a/minutes/19_10_2022.docx b/minutes/19_10_2022.docx index 7fda6b6..70e92ce 100644 Binary files a/minutes/19_10_2022.docx and b/minutes/19_10_2022.docx differ diff --git a/minutes/20_01_2021.docx b/minutes/20_01_2021.docx index 344d64c..bc0dcaa 100644 Binary files a/minutes/20_01_2021.docx and b/minutes/20_01_2021.docx differ diff --git a/minutes/20_04_2022.docx b/minutes/20_04_2022.docx index 2e41ec3..01987a6 100644 Binary files a/minutes/20_04_2022.docx and b/minutes/20_04_2022.docx differ diff --git a/minutes/20_09_2023.docx b/minutes/20_09_2023.docx new file mode 100644 index 0000000..43b2931 Binary files /dev/null and b/minutes/20_09_2023.docx differ diff --git a/minutes/20_10_2021.docx b/minutes/20_10_2021.docx index a0a906d..10eea57 100644 Binary files a/minutes/20_10_2021.docx and b/minutes/20_10_2021.docx differ diff --git a/minutes/20_11_2018.docx b/minutes/20_11_2018.docx index 00b4840..990febc 100644 Binary files a/minutes/20_11_2018.docx and b/minutes/20_11_2018.docx differ diff --git a/minutes/21_01_2020.docx b/minutes/21_01_2020.docx index 124dc41..b38628f 100644 Binary files a/minutes/21_01_2020.docx and b/minutes/21_01_2020.docx differ diff --git a/minutes/21_07_2021.docx b/minutes/21_07_2021.docx index 8f83a00..faf5fe2 100644 Binary files a/minutes/21_07_2021.docx and b/minutes/21_07_2021.docx differ diff --git a/minutes/21_12_2022.docx b/minutes/21_12_2022.docx index 6b263d6..efc7dbe 100644 Binary files a/minutes/21_12_2022.docx and b/minutes/21_12_2022.docx differ diff --git a/minutes/22_01_2019.docx b/minutes/22_01_2019.docx index f38a9ab..3eadbca 100644 Binary files a/minutes/22_01_2019.docx and b/minutes/22_01_2019.docx differ diff --git a/minutes/22_02_2023_MWE_only.docx b/minutes/22_02_2023_MWE_only.docx index 325c826..44d68ab 100644 Binary files a/minutes/22_02_2023_MWE_only.docx and b/minutes/22_02_2023_MWE_only.docx differ diff --git a/minutes/22_03_2023.docx b/minutes/22_03_2023.docx index 1c3f60e..4514293 100644 Binary files 
a/minutes/22_03_2023.docx and b/minutes/22_03_2023.docx differ diff --git a/minutes/22_09_2021.docx b/minutes/22_09_2021.docx index 5f7273a..8090287 100644 Binary files a/minutes/22_09_2021.docx and b/minutes/22_09_2021.docx differ diff --git a/minutes/23_02_2022.docx b/minutes/23_02_2022.docx index 14ffbc9..f002903 100644 Binary files a/minutes/23_02_2022.docx and b/minutes/23_02_2022.docx differ diff --git a/minutes/23_03_2022.docx b/minutes/23_03_2022.docx index c887d4c..6161b92 100644 Binary files a/minutes/23_03_2022.docx and b/minutes/23_03_2022.docx differ diff --git a/minutes/23_04_2019.docx b/minutes/23_04_2019.docx index b395a56..ddce6d2 100644 Binary files a/minutes/23_04_2019.docx and b/minutes/23_04_2019.docx differ diff --git a/minutes/23_06_2021.docx b/minutes/23_06_2021.docx index 2d2e109..0c97b6c 100644 Binary files a/minutes/23_06_2021.docx and b/minutes/23_06_2021.docx differ diff --git a/minutes/24_09_2019.docx b/minutes/24_09_2019.docx index 1a9ef55..0e0061b 100644 Binary files a/minutes/24_09_2019.docx and b/minutes/24_09_2019.docx differ diff --git a/minutes/25_06_2019.docx b/minutes/25_06_2019.docx index 093e190..90d7504 100644 Binary files a/minutes/25_06_2019.docx and b/minutes/25_06_2019.docx differ diff --git a/minutes/26_01_2022.docx b/minutes/26_01_2022.docx index c470988..631c640 100644 Binary files a/minutes/26_01_2022.docx and b/minutes/26_01_2022.docx differ diff --git a/minutes/26_02_2019.docx b/minutes/26_02_2019.docx index 067cf4c..c1c83ae 100644 Binary files a/minutes/26_02_2019.docx and b/minutes/26_02_2019.docx differ diff --git a/minutes/26_03_2019.docx b/minutes/26_03_2019.docx index 166d938..dff01bc 100644 Binary files a/minutes/26_03_2019.docx and b/minutes/26_03_2019.docx differ diff --git a/minutes/26_05_2021.docx b/minutes/26_05_2021.docx index 840abb2..9e5dd34 100644 Binary files a/minutes/26_05_2021.docx and b/minutes/26_05_2021.docx differ diff --git a/minutes/26_07_2023.docx b/minutes/26_07_2023.docx new file mode 
100644 index 0000000..474c504 Binary files /dev/null and b/minutes/26_07_2023.docx differ diff --git a/minutes/27_08_2019.docx b/minutes/27_08_2019.docx index 5e39b8b..8821709 100644 Binary files a/minutes/27_08_2019.docx and b/minutes/27_08_2019.docx differ diff --git a/minutes/27_07_2022.docx b/minutes/27_09_2022.docx similarity index 98% rename from minutes/27_07_2022.docx rename to minutes/27_09_2022.docx index 8f40092..a63b2e0 100644 Binary files a/minutes/27_07_2022.docx and b/minutes/27_09_2022.docx differ diff --git a/minutes/27_11_2018.docx b/minutes/27_11_2018.docx index 4ce7a1d..f87f835 100644 Binary files a/minutes/27_11_2018.docx and b/minutes/27_11_2018.docx differ diff --git a/minutes/29_06_2022.docx b/minutes/29_06_2022.docx index 38d29b9..f77b41e 100644 Binary files a/minutes/29_06_2022.docx and b/minutes/29_06_2022.docx differ diff --git a/minutes/2_11_2022.docx b/minutes/2_11_2022.docx index b2db5e5..1435293 100644 Binary files a/minutes/2_11_2022.docx and b/minutes/2_11_2022.docx differ diff --git a/minutes/30_04_2019.docx b/minutes/30_04_2019.docx index 7c212f9..3506b9a 100644 Binary files a/minutes/30_04_2019.docx and b/minutes/30_04_2019.docx differ diff --git a/minutes/31_03_2021.docx b/minutes/31_03_2021.docx index 7a181b3..8864084 100644 Binary files a/minutes/31_03_2021.docx and b/minutes/31_03_2021.docx differ diff --git a/minutes/Update_for_Maltese_Data.docx b/minutes/Update_for_Maltese_Data.docx index 9aaeb70..c0ad178 100644 Binary files a/minutes/Update_for_Maltese_Data.docx and b/minutes/Update_for_Maltese_Data.docx differ diff --git a/minutes/first_evaluation_telco.docx b/minutes/first_evaluation_telco.docx index b8baf77..599d687 100644 Binary files a/minutes/first_evaluation_telco.docx and b/minutes/first_evaluation_telco.docx differ diff --git a/minutes_txt/01_12_2021.docx.txt b/minutes_txt/01_12_2021.docx.txt index b85f567..1c7865b 100644 --- a/minutes_txt/01_12_2021.docx.txt +++ b/minutes_txt/01_12_2021.docx.txt @@ -17,7 +17,7 
@@ Stefania Racioppa (DFKI) 1. Module draft 4.7 -[image2] +[image1] Adaptations included into module draft 4.7: @@ -252,7 +252,7 @@ What was aimed to be represented: □ triples: ontolex:Form morph:consistsOf morph:Morph -[image1] +[image2] • BK: can we modify this diagram to include ontolex:LexicalEntry diff --git a/minutes_txt/03_05_2023.docx.txt b/minutes_txt/03_05_2023.docx.txt new file mode 100644 index 0000000..8636b4d --- /dev/null +++ b/minutes_txt/03_05_2023.docx.txt @@ -0,0 +1,427 @@ +Link: https://meet.google.com/nsj-tbcy-yop [CHECK HERE FOR UPDATED LINK(S)] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) (excused for being 10 min late) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Thierry Declerck (DFKI) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.17) + +1. Semitic Roots + +2. Morph ordering + +3. Open topics (from 2023-04-05) + +POSTPONED + + Next time + +0. Module draft (4.17) + +[image1] + +1. Semitic Roots + + • + 2023-04-19 + + □ + Suggestion (MI): Roots are morphs (lexinfo:RootMorph), canonical + forms specified for a lexical entry consist of them. + + • + Example (with minor comments) + + roots:k-t-b a lexinfo:RootMorph, ontolex:LexicalEntry ; + + ontolex:evokes :k-t-b_meaning; + + rdfs:label "k-t-b" . + + :k-t-b_meaning a ontolex:LexicalConcept. + + # not: LexicalSense + + :kiteb a ontolex:Word ; + + lexinfo:partOfSpeech lexinfo:verb ; + + morph:morphologicalPattern ; + + ontolex:canonicalForm ; + + morph:baseForm . 
+ + a ontolex:Form ; + + morph:consistsOf roots:k-t-b ; + + ontolex:writtenRep "kiteb"@mlt ; + + ontolex:phoneticRep "/kɪtɛp/" . + + • + Approved by participants + + • + Tbc: solved? + + □ + Consensus: yes + + □ + TODO: add as example to draft + + • + Do we need unordered along with numbered properties for consistsOf ? + + □ + Consensus: yes + + • + Can we abandon consistsOf in favour of rdfs:member + + □ + Consensus: no (because it’s more established, it might have special + semantics [which is to be confirmed], and this double modelling has + precedent in OntoLex-Decomp) + + • + No lexical senses for RootMorph + + □ + TODO: put into RootMorph definition that the set of lexical senses + for RootMorph MUST be empty because root morph are not independent + lexemes but used in different lexemes with different senses. + + ☆ + otherwise, the ontolex definition may lead people to use + lexical sense here, because the root morph actually *is* a + single lexical entry. + + • + RootMorph rdfs:subClassOf morph:Morph? + + □ + KG: Using the morph:consistsOf here, implies (by inference) that + the roots:k-t-b is a morph:Morph. Do we want this inference? + Meaning does linguistically make sense? If it is always true I + would propose to declare lexinfo:RootMorph as a subclass of + morph:Morph. I have a similar question/issue regarding + lexinfo:StemMorph + + □ + CC: yes, this is the intuition, but in the current proposal, this + is not done (https://github.com/ontolex/lexinfo/pull/29/commits/ + c50438a57e33152586bfc38cef73ef4a1b5535f0), because the Morph + vocabulary is unpublished and we shouldn’t contaminate Lexinfo with + premature stuff. + + => TODO: fix/update pull request after Morph publication + + • + lexinfo:etymologicalRoot + + □ + 2023-04-19, FK: lexinfo:etymologicalRoot: lexical entry -> lexical + entry; “Morpheme that has a particular status with regards to the + word's etymology.” can be used to link lexical entries (!) 
with a + lexinfo:RootMorph + + • + Tbc: is that enough or do we need another relation? + + □ + Confirmed, yes + +2. Morph ordering + +One of the requirements. + + • + RDF, unordered: morph:consistsOf , + + □ + NB: for circumflexes, we either need to decompose or just to + provide morph:consistsOf + + • + It already is in the diagram: diamond-symbol operator (that’s standing + for rdfs:Seq) + + :x morph:consistsOf :m1, :m2. + + :x a rdfs:Seq; rdf:_1 :m1; rdf:_2 :m2. + + • + KG: + + □ + x is the ontolex:Form right? I thought that the rdfs:Seq implied + from the diamond in the diagram would result in something like + this, no? + + :x a ontolex:Form ; + + morph:consistsOf [ + + a rdfs:Seq ; + + rdf:_1 :y1; + + rdf:_2 :y2; + + ] + + . + + :y1 a morph:Morph . + + :y2 a morph:Morph . + + • + That was also Matteo’s intuitive interpretation + + • + CC: modelling inspired by the usage of morph:Morph for morph(ological + segment)s in interlinear glossed text + + • + This was not the intended modelling, indeed, but it has advantages + + □ + PRO: multiple segmentations + + □ + PRO: defining form as a seq of morph(eme)s may be counterintuitive + to linguists + + □ + CONTRA: Could make the diagram quite bulky. Need to create another + aggregator class + + □ + CONTRA: we cannot define RDFS constraints on the components of a + container + + □ + CONTRA: consistsOf should then always point to Seqs + + • + To be further discussed + + • + Alternatively, we could use rdf:List (not discussed) + + Turtle: + + :x owl:sameAs ( :m1, :m2 ). + +equivalent to: + +:x a rdf:List; + + rdf:first :m1; + + rdf:rest [ + + a rdf:List; + + rdf:first :m2; + + rdf:rest rdf:nil ] . + +rdf:List have a very nice Turtle syntax, but are much harder to query. + +previously, we had a consensus against rdf:List, also following other OntoLex + +modules (esp., decomp) + +TBC: do we stay with rdfs:Seq/rdfs:Bag? + + • + Tbc: this is solved, right? 
+ + □ + No, we might need to reconsider to follow the modelling along the + lines of katerina’s proposal, because that allows for multiple + different segmentations + +3. Open topics (from 2023-04-05) + + • + morph:baseConstraint + + □ + todo@kg: more comprehensible definition needed + + • + MP: Naming “paradigm” may evoke incorrect associations, definition of + paradigm in linguistics as a set of word forms + + Note: we also need to rephrase the “paradigm” property, then + + • + morphological pattern + + • + Inflection type? + + • + Inflection class? + + □ + MP: +1 + + □ + GS: sounds like owl class + + • + Inflection? + + => hard to get to a consensus, vote? + +POSTPONED + + • + MP: generates to lexical forms? + + • + Besim: Clitics, e.g., Macedonian + + □ + Max: idea was to treat them like separable prefixes in German, it’s + ok to have a space in a written representation of a form + + □ + TO-BE-DONE: find example data of lexical entries in 5 languages + + ☆ + Ranka: Samples for Serbian (link below) + + • + Character / sound classes + + □ + Options: either within morph or within the scope of a new module? + + ☆ + If the latter: + + ○ + together with signs/multimodality? + + ○ + together with other transformation rules (e.g., for + transcription)? + + ■ + Could also include finite state terminology + + ■ + Could also include diachrony + + • + Finite state terminology + + □ + Options: either within morph or within the scope of a new module? + + ☆ + If the latter: + + ○ + Together with diachrony? + + ○ + Together with sound classes? + + • + MWEs + + □ + 2023-04-05 Gilles: describe decomposition of MWEs in DBnary w. + Thierry + + • + Non-concatenative morphology + + □ + Hindi/Urdu example, e.g., https://en.wiktionary.org/wiki/ + %E0%A4%96%E0%A5%81%E0%A4%B2%E0%A4%A8%E0%A4%BE#Hindi (Fahad) + + • + TO-BE-DONE@Katerina: + + □ + list challenges from LDL + + □ + Tbc: are they still challenges or not, if so, do they fall under + morph? 
+ + • + GitHub issues + + □ + https://github.com/ontolex/morph/pull/16 + + □ + https://github.com/ontolex/morph/issues + + • + Wiki: requirements (see 2023-04-05) + + • + github/gdrive: data sets (see 2023-04-05) + + • + comments from draft + + □ + https://github.com/ontolex/morph/blob/master/draft.md + +Next time + +Discussing problems with Old Irish in its representation in morph + diff --git a/minutes_txt/03_11_2021.docx.txt b/minutes_txt/03_11_2021.docx.txt index 321177b..9bdbf05 100644 --- a/minutes_txt/03_11_2021.docx.txt +++ b/minutes_txt/03_11_2021.docx.txt @@ -29,7 +29,7 @@ Penny Labropoulou (PL) computer). Type represents the linguistic process (add -(e)r). Rule is a computational implementation (add "r" to end of string). -[image2] +[image3] CC tried to convert data with 4.5.1, evaluation: @@ -67,7 +67,7 @@ acoli-repo/acoli-morph/blob/main/uder: • -[image1] +[image2] CC tried to convert data with 4.5.2, evaluation: @@ -101,7 +101,7 @@ lexico-semantic relation as a 'source'. added object property morph:contains (for example) between morph:WordFormationRelation and morph:Morph -[image3] +[image1] Adaptations of module draft 4.5 to be included for next telco: diff --git a/minutes_txt/04_10_2023.docx.txt b/minutes_txt/04_10_2023.docx.txt new file mode 100644 index 0000000..5f34f33 --- /dev/null +++ b/minutes_txt/04_10_2023.docx.txt @@ -0,0 +1,88 @@ +Link: https://meet.google.com/nsj-tbcy-yop [CHECK HERE FOR UPDATED LINK(S)] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Ranka Stanković (RS) + 
+Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.17) + +1. Requirements + +2. Datasets + +3. AoB + +0. Module draft (4.17) + +[image1] + +1. Replacement → RegexReplacement + +FK: morph:source and morph:target being string literals can be too strictly +defined given that the model will not be able to change after publication. What +if we relax this restriction, move specifics about regular expressions into +addenda. + +… some discussion … + +MI: Suggestion — having morph:Replacement underdefined, a new subclass +morph:RegexReplacement which has the properties morph:Replacement has now. + +2. Plans + +Next time (04.10): + + • + Everyone whose issues are on GH are either present or we know the + status + + • + Going through unmet requirements + + • + Going through problematic datasets + +Time after next (18.10): + + • + MP presents the new standard for morphology, its compatibility with + Morph. + diff --git a/minutes_txt/05_04_2023.docx.txt b/minutes_txt/05_04_2023.docx.txt new file mode 100644 index 0000000..a4b45bd --- /dev/null +++ b/minutes_txt/05_04_2023.docx.txt @@ -0,0 +1,1308 @@ +Link: https://meet.google.com/unn-ofrg-hdb [one-time-link; check here for link +updates if it doesn’t work] + +next : https://meet.google.com/erm-dktq-knp + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Thierry Declerck (DFKI) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. 
Module draft (4.17) + +Publications + + MWE Chapter + + UniDive + + LDK workshop + +Requirement freeze + + Status + + Current open topics + + Datasets to be addressed + +Character / sound classes + +morph:baseConstraint + +LDL challenges + +Semitic languages + +Requirements (from wiki) + +Appendix: Organizational + + Procedure for Feature freeze + +0. Module draft (4.17) + +[image1] + +No changes + +Question: What is the difference between baseConstraint and grammaticalMeaning? + + • + TODO@Katerina: find a new wording in the definition ;) + + • + CC: original idea + + □ + Derivational (and inflection) morphemes can only be applied under + certain conditions, base constraint is the precondition for a + morpheme to apply, grammatical meaning is a feature of the derived + (inflected) form/lexical entry, e.g., + +Publications + +MWE Chapter + + • + Submitted: https://www.overleaf.com/8285444258rpfnbwgwbrdp + + • + Feedback delayed + +UniDive + + • + Christian has been asked to co-lead a task on modelling MWE + dictionaries in Cost Action UniDive, Fahad would like to contribute ppt + googledoc for wg2 meeting + + • + WG2 (corpus-lexicon interface) call tomorrow + + □ + Todo: add link to mailing list registration and registration + procedure + +LDK workshop + + • + Prep call next week: April 12 + + • + TODO@CC: dig out internal planning document (until 16:00) + + • + TODO@FK: contact people (today) + + • + TODO@CC: contact everyone else not reached by today (LD4LT, OntoLex + chairs) + +Last time: + + • + Fahad worked on the website + + • + Call for participation, program needed + + • + In contact with sara caravalho (on terminology) + + • + To figure out how to do registration + + • + Sep 12, full-day + + • + Catering in doubt + + • + There will be parallel events + + □ + Not online, yet + + □ + proling knower (half-day) and the Frame Net (half day) and + Disinformation and Toxic Content analysis (full day) will be in + parallel on the 12th + + □ + proling morning, 
framenet afternoon + +=> so any synsem stuff in our workshop into the morning and any metadata into +the afternoon + +Requirement freeze + +Status + + • + requirement/dataset freeze as of March 22, 2023 + + • + Last-minute additions (March 22) + + □ + Old Irish + + □ + nothing else + +Current open topics + + • + Character / sound classes + + • + morph:baseConstraint + + • + Finite state terminology + + • + Issues with semitic? + + • + MP: Naming “paradigm” => morphological patterns? Inflection type? + Inflection Class? + + • + Wiki: requirements (see below) + + • + github/gdrive: data sets (see below) + + • + GitHub issues, comments from draft + + • + Matteo: generates to lexical forms + + • + Katerina: discuss the challenges from LDL if they are still challenges + or not, whether they fall under morph. => separate section + + • + Besim: Clitics, e.g., Macedonian + + □ + Max: idea was to treat them like separable prefixes in German, it’s + ok to have a space in a written representation of a form + + □ + TODO: find example data of lexical entries in 5 languages + + ☆ + Ranka: Samples for Serbian (link below) + + • + MWEs + + □ + As addressed in chapter + + □ + Gilles: describe decomposition of MWEs in DBnary w. Thierry + + • + Gilles: new derivational morphology for French + + □ + Technically, this is too late, but if there is a volunteer to take + a look, we can take that feedback into account + + • + Non-concatenative morphology + + □ + Hindi/Urdu example, e.g., https://en.wiktionary.org/wiki/ + %E0%A4%96%E0%A5%81%E0%A4%B2%E0%A4%A8%E0%A4%BE#Hindi (Fahad) + +Datasets to be addressed + +After the requirement freeze, this should be data previously discussed here or +in joint papers. + +NB: We don’t discuss the datasets per se, only the problems they pose to the +model as of now. So the process is the following: + + 1. 
+ People responsible for the dataset (possibly, with the help of others) + model the problematic part in OntoLex-Morph (example, not full dataset) + offline + + a. + For problems in modelling or finding information: reach out to + moderators or mailing list (or Nexus Slack) + + 2. + Deposit data in GitHub (or contact moderators) + + 3. + Readme in GitHub: describe problematic points (if any) + + a. + Problematic points can be: something you cannot model or something + you made a design decision that you’d like to discuss + + 4. + There should be a session devoted to each of the problematic datasets, + presented by a person responsible for the dataset; + + 5. + Datasets without problems (or readmes) are not presented, but can be + used as examples when writing up the specification. + + • + Clitics examples: find example data of lexical entries in ~5 languages + (TODO, partially on Besim) + + □ + E.g., Old Irish (TF). See here. The contents in the document and + accompanying spreadsheet are possibly equally (or more) pertinent + to “Current open topics” above. + + □ + Greek (Penny) + + □ + Serbian (Ranka) + + □ + Romanian (Elena & Ciprian) / Italian (?) + + ☆ + TODO@EA: particular examples for each clitic, not finished yet + + ☆ + Fahad on Italian: clitic example in Johanna and Maria Pia + Buono's poster@unidive, https://unidive.lisn.upsaclay.fr/lib/ + exe/fetch.php?media= + meetings:2023-saclay:abstracts:wg2-3-monti-dibuono-poster.pdf, + in Ontolex on the poster + + □ + Any non-IE? E.g., as used in Toolbox data (TODO@Max) + + ☆ + Max: looking into this, no problematic cases so far + + ☆ + Written representations can contain spaces + + ☆ + Elena: I actually hadn't been thinking about italian, but maybe + I have a suggestion about non-IE languages. 
Austronesian + languages have clitics (for example samoan), I'm looking for + dictionaries to check if clitics appear in lexical entries + + • + Maltese (Mike): sample + + • + Serbian (Ranka): paper for LDK - football use case, work in progress + ttl (work in progress) + + • + German (Petra Steiner)? + + □ + CELEX + + • + Old Irish (Dorus, illustrative examples needed, cf. here mentioned + under clitics above) + + □ + mutations + + □ + Stem allomorphy / stem alternation + + □ + Polysynthetic constructions + + • + https://github.com/ontolex/morph/tree/master/data + + □ + agglutinating/Turkish (Christian) + + □ + foma/quechua (to be skipped, suggested by Christian) + + □ + fusional/lexis (Mod. Greek, Katerina/Penny) + + □ + gdrive/Latin_Word_Formation (Matteo) + + □ + gdrive/Agglutinative_Sumerian.docx (Christian) + + □ + gdrive/Composition_Derivation_Old_High_German.docx (Christian) + + □ + gdrive/Italia_wordform_generation_Stefania_24062021.docx (Stefania + Raciotta) + + □ + gdrive/Italian.docx (?) + + □ + gdrive/Polysynthetic_Inuktitut.docx (Christian) + + □ + gdrive/Unimorph.docx (Christian) + + • + data on Latin inflection (Matteo Pellegrini) + + □ + TODO: link + + • + All data from our papers + + □ + LiLa: inflectional data + + □ + https://github.com/acoli-repo/acoli-morph (German, incl. UniMorph, + UniDer, FST) + + □ + ?Data for the 2017 paper — individual examples from lexicographic + resources, see paper (no additional data apart from what was in it) + + □ + Is there more? 
+ + ☆ + 2017: Ancient Greek, Spanish, one more + + ☆ + 2019: Thierry & Stefania + + • + other data mentioned in minutes on demand + + • + From latest minutes (there is much more, we dig out/add at demand): + + □ + Arabic: Morph Module semitic.odt (Khadija) + + ☆ + Other Semitic: not necessary + + ☆ + examples from Arabic and Hebrew from Ilan + + ○ + TODO@Max: model, and let specialists take a look + + □ + Non-concatenative morphology + + ☆ + Hindi/Urdu example, e.g., https://en.wiktionary.org/wiki/ + %E0%A4%96%E0%A5%81%E0%A4%B2%E0%A4%A8%E0%A4%BE#Hindi (Fahad) + + □ + Tbc: coverage on non-European/non-Indo-European + + ☆ + Wiktionary: which languages? (no description of morphology, but + programs to generate forms, LUA [!], cannot be extracted from + DBnary data) + + • + To be updated: https://github.com/ontolex/morph/blob/master/data/gdrive + /Vocabulary_Tests_Evaluation.xlsx + +Procedure + +the process for datasets is the following: + + 1. + People responsible for the dataset (possibly, with the help of others) + model the problematic part in OntoLex-Morph (example, not full dataset) + offline + + a. + For problems in modelling or finding information: reach out to + moderators or mailing list (or Nexus Slack) + + 2. + Deposit data in GitHub (or contact moderators) + + 3. + Readme in GitHub: describe problematic points (if any) + + a. + Problematic points can be: something you cannot model or something + you made a design decision that you’d like to discuss + + 4. + There should be a session devoted to each of the problematic datasets, + presented by a person responsible for the dataset; + + 5. + Datasets without problems (or readmes) are not presented, but can be + used as examples when writing up the specification. + +As for requirements: try to move as much into offline mode as possible + + • + Tbc. 
next time (suggestion following) + +FK: examples should reflect non-IE/non-European languages + +CC: maybe two documents: tech spec (minimal examples) and best practices +(prose, more languages) + +To be discussed + +Topics for next time + +Confirm procedure for discussing requirements + +morph:baseConstraint + +OLD: + + • + morph:baseConstraint: found only in one example with Inuktitut + generation. + + In a nutshell, morph:baseConstraint can be used to provide prerequisites + for a morph to be compatible. For example + + :m1 morph:baseConstraint [ :pos "v" ] . + + sets the requirement for the word that this morph can be added to. And + morph:grammaticalMeaning holds grammatical categories for the morph itself, + as before. But shouldn’t it be only for a Rule, not for a Morph? + +?Semitic languages + +DONE: figure out how to include missing categories in LexInfo + +=> via GitHub issues under https://github.com/ontolex/lexinfo/issues + +OLD: + + Khadija: data prepared for Arabic: + + • + Morph Module semitic.odt + + Necessary features: + + • + lexinfo:POS extensions (solved? See last call minutes for procedure) + + • + Modelling diacritics in Arabic (cf. call minutes last time) + + □ + Also cf. Umlaut in German and vowel harmony in Turkish for similar + challenges + + ☆ + Recommendation (< GS): NFD normalization in morph:Replacement + + ☆ + TBC: are we ok with modelling roots as Morphs (i.e., + LexicalEntries)? 
+ + □ + If not modelled as morphs, then they could be modelled as rules + (replacements) + + ☆ + DONE@CC: model updated + + ○ + morph:grammaticalMeaning and morph:baseConstraint as + properties of morph:Rule, the grammatical meaning is the + change in meaning or morphology of the word (root) + + ☆ + TO-BE-DONE@Khadija: Modelling examples for Arabic entries + + • + Not discussed yet: Circumfix + + □ + Morph:CircumfixParadigm + + □ + Prefix+suffix combination + +List of Requirements from Wiki + +See below + +Topics not for next time + +Character / sound classes [NOT YET] + +OLD: + + Max and Mike, LDK paper on morph on Maltese/Semitic + + Problem: /(K)([aeiou]{1,2})(K)([e]{1,2})(K)/\1\3i\5t/ + + • + Can be done without, but illegible + + Proposal: to add a class representing a character class (e.g. vowels, + consonants, or the Maltese “sun consonants”) + + consonants morph:SoundClass ; + + rdf:label "Sun" ; + + ?:contains "d", "n", "r", "s", "t", "x" . + + consonants morph:SoundClass ; + + rdf:label "Sun" ; + + ?:contains "[dnrstx]" . + + • + Data is the Maltese data discussed here before, will be partially + converted + + • + CC: separate module? Together with signs? + + • + MI: not about phonology, more about representation; workaround: + precompile within rule set + + • + GS: proposal contains two things: + + □ + Mechanism and terminology + + • + TF: what is the use case of mimicking finite state paradigms? + + • + CC: interoperability with dictionaries and “textbook rules” + + • + MI: rules motivated from minimizing the inventory (rules instead of + full forms). 
That was a requirement from the beginning + + • + Mike: question of what we want morph, necessary for languages where + orthography mapping is not 1:1 + + • + GS: strip out heuristics, focus on description how language changes + strings + + • + MI: not implementing FST, just using the means (a very small part) to + encode rules for generating wordforms + + • + GS: morphology describes transformations, so we need that + + • + CC: let the proposal sink in for a few weeks, if there are additional + use cases that really require this within morph, we could implement it, + the need and motivation is clear. Not fully convinced that modelling + should be restricted to morph + + • + Mike: will be discussed in paper + +LDL challenges [postponed] + +OLD + + • + TODO@Katerina: check draft versions + - Variation in inflexion/Flexemes :: use to dialectal, diachronic or + simply orthographic variation. Examples + + □ + In Latin, lavo `wash can be inflected according to either the 1st + (lavare) + + or the 3rd (lavere) conjugation. + + • + Suppletion, eg in Old English, the verb wesan `to be' whose infinitive + + represents one underlying root, whereas its indicative present singular + forms + + are based on two other roots (eom 1.sg. `(I) am'; bist 2.sg. `(you) are'). + + • + Modern Greek, `τραίνο' and `τρένο' have the same meaning and syntactic + behaviour, so they can be modeled as the same ontolex:LexicalEntry, + where the inflected forms of each are grouped together instead of all + of them being represented as simply ontolex:otherForm. The current + proposal is to introduce a new relation for orthographic variation (\ + onto{lexis:OrthVariant}) as a subclass of \onto + {vartrans:LexicalRelation} and relate the orthographic variants through + this relation. + + - Markers of morphological variation :: labels of style, dating, dialect, + etc. 
+ + • + Resolved by transferring the issue to LexInfo Vocabulary + + • + Note :: it would be desirable if the OntoLex-Morph vocabulary would + eventually be accompanied by best practice recommendations for the + assignment of markers and provenance. + + - Challenges in word formation :: + + • + not fully predictable phonological processes like assimilation or + apophony, which prevent the simple juxtaposition of formative elements + from generating the actual surface form of derivatives; + + • + formal and semantic constraints that make a word formation rule not + applicable to all the words in the lexicon with a specific part of + speech. + + (morph:grammaticalMeaning??) + +Requirements (from wiki) + + N1: Morph resources + + Description: In order to represent morphemic elements that do not fit + the restrictive definition of ontolex:Affix as being ontolex:LexicalEntry + resources, a distinct class morph:Morph is required as another top-level + class next to ontolex:LexicalEntry and ontolex:Form. Moreover, with regard + to a future etymology OntoLex module, it could serve as a means to + represent data that has been identified and should be pointed to, but about + which no further detailed knowledge exists yet, although it might be added later. + + Required vocabulary: owl:Class + Initial consensus: approved modeling: + + morph:Morph a owl:Class ; rdfs:subClassOf owl:Thing . + + Status Updates: as of 2021, we shifted towards modelling morph:Morphs as + subclasses of ontolex:LexicalEntry. This was done to eliminate redundancy + in morph-level form and sense attributes. + + • + + N2: Specific morph resources + + Description: Next to the main morph:Morph class, more specific morph resources + should be representable. For morphological representation, the elements + root and stem should be assignable to classes. Further, a morph:Affix class + is required in parallel to ontolex:Affix to enable the representation of + morphs that are not considered ontolex:LexicalEntry resources. 
Further, + more specific affix types such as transfix (a discontinuous affix), + simulfix (change or replacement of vowels or consonants (usually vowels) + which changes the meaning of a word) and zero morph (a morpheme that has a + morphological meaning that corresponds to no overt form) which are not + covered by other existing RDF vocabularies are required as well. + + Language example: + + English Simulfix: a-->e in man (singular) vs. men (plural) + + Arabic Transfix: grammatical information is encoded in a discontinuous vowel +pattern that is applied to a consonantal root pattern. E.g. the transfix a-a-a + (third person, singular, past) is inserted into the root k-t-b 'all concepts + revolving around writing' to render the word-form kataba 'he wrote'. + + German Zero Morph: case and gender are not overtly marked in the German noun + Herr 'master' and, thus, correspond to no overt form. The morpheme NOM.SG is +realized by the zero morph Ø (i.e. Herr-Ø (at morph level) vs. ‘master-NOM.SG’ + (at morpheme level)). + + Required vocabulary: owl:Class + + Initial consensus: approved modeling: + + current modeling with fixed set of classes + + morph:RootMorph, morph:StemMorph, morph:AffixMorph, morph:TransfixMorph, + morph:SimulfixMorph, morph:ZeroMorph rdfs:subClassOf morph:Morph . + + Status Updates: The need is agreed upon, but as of early 2022, we decided + to move the subclassification of morphs into LexInfo. This is because this + hierarchy is partially provided in LexInfo v. 3.0, already, and users + should not be confused with having multiple namespaces for information of + the same kind (e.g., lexinfo:Suffix alongside morph:Simulfix). + + • + + N3: Differentiation between derivational and inflectional morph resources + + Description: With regard to representing the morphological content of + lexical data the distinction between word-form forming (inflectional) and + lexeme-forming (derivational) morph:Morph resources should be expressible + and extractable. 
Concomitantly, the existing limitation of ontolex:Affix + resources to represent only the latter type of morphs (due to its subclass + relation to ontolex:LexicalEntry) will be overcome. + + Language example: + + German (homonym) suffixes: + + 1) -er: an inflectional affix forming comparative adjectives, e.g. schön + 'beautiful' --> schöner 'more beautiful' + + 2) -er: a derivational affix forming agent nouns from verbs, e.g. fahren 'to + drive' --> Fahrer 'driver' + + Required vocabulary: Explicit identification of morph:Morph resources as + being an inflectional or derivational morph. + + Initial consensus: initial modelling: + + morph:Morph morph:hasMorphStatus morph:Value . + + morph:derivational a morph:Value . + + morph:inflectional a morph:Value . + + Status Updates: 2021/2022: The need is agreed upon, but with the inclusion + of data from the LinkingLatin project, we shifted towards class-based + modelling, i.e., WordFormationRule (resp. WordFormationRelation) vs. + InflectionRule. Furthermore, we encode the difference between compounding + and derivation in subclasses of WordFormationRule, resp. (partially) + WordFormationRelation. + + current modelling: + + [a morph:WordFormationRule ] morph:involves [a morph:Morph ]. + + [a morph:CompoundingRule ] morph:involves [a morph:Morph ]. + + [a morph:DerivationRule ] morph:involves [a morph:Morph ]. + + [a morph:WordFormationRelation ] morph:wordFormationRule [ a + morph:DerivationRule; morph:involves [ a morph:Morph ]] . + + Note that here, we don't model the difference as a property of the morph, but + as a property of the analysis and via morph:WordFormationRelation + + • + + N4: Inflectional paradigm + + Description: Lexical data contains pointers to and/or tables of + inflectional paradigms or classes including the respective stem affixes or + the full word-forms. Both, the pointers to paradigms and the + interconnection of word-forms that belong to a paradigm, should be + representable. 
+ + Language example: + + Greek assignment of a lexical entry to an inflection class: λόγος: + + mounce-morphcat: n-2a + +Greek inflectional class paradigm: (with reconstructed underlying stem endings + and desinence) n-3e(3): + + NS: -ευς {-εϝ+ς} + + GS: -εως {-εϝ+ος} + + DS: -ει {-εϝ+ι} + + AS: -εα {-εϝ+α} + + VS: -ευ {-εϝ+} + + NP: -εις {-εϝ+ες} + + VP: -εις {-εϝ+ες} + + GP: -εων {-εϝ+ων} + + DP: -ευσι {-εϝ+σι} + + AP: -εις {-εϝ+ας} + +Examples for inflection tables with the inflectional paradigm structure and the + inflected word-form. Latin: https://en.wiktionary.org/wiki/ + Appendix:Latin_third_conjugation + + German: https://de.wiktionary.org/wiki/Flexion:jagen + + Required vocabulary: + +ontolex:LexicalEntry [object property] [morph:Paradigm] . ontolex:Form [object + property] [morph:Paradigm] . + + Tested on data: + + Status: agreed (version 4.16) + + ontolex:LexicalEntry lexinfo:morphologicalPattern morph:Paradigm . + + ontolex:Form morph:inflectionRule morph:InflectionRule . + + ontolex:InflectionRule morph:hasParadigm morph:Paradigm . + + • + + N5: Morphology crosses part-of-speech boundaries (derivation) + + Description: John (Issue derived from "Linguistic Fundamentals for Natural + Language Processing" by Emily Bender, Source: ) + + Language example: + + Morphological processes can turn one part-of-speech into another, effectively + creating a distinct LexicalEntry + + English + + • + "to play" (verb) => "played" (adjective) + + • + "to play" (verb) => "the playing" (noun) + + Required vocabulary: + + ontolex:LexicalEnty ontolex:lexicalForm ontolex:Form . + + ontolex:Form morph:consistsOf morph:ZeroMorph . + + Tested on data: + + Status: agreed modelling + + CC: This should include "zero derivation", where one word receives another +part-of-speech without any difference in form or meaning. 
As an example, every +German adjective can be used as adverb, most English prepositions also occur as + subordinating conjunctions (complementizers) and verbal particles, etc. For +"zero morphology", a distinct LexicalEntry is necessary only if differences in +sense can be established. The underlying issue is that OntoLex does not permit + more than one part-of-speech per LexicalEntry (which would be the natural + modeling here). + + Bettina: Derivation should be expressable at least as the underlying + word-formation process. Whether the three different types of derivation (i.e. + 1) zero derivation, 2) word-class changing derivation with no additional + meaning and 3) word-class changing derivation with additional meaning) should + be expressable depends on the needs of the lexicographers. + +Current draft: use established means for derivation to represent conversion and + specify zero morph, e.g. “play” (noun): + + descriptive/extensional modelling: + + ex:play_v_rel_play_n a morph:WordFormationRelation ; + + vartrans:source ex:lex_play_verb ; + + vartrans:target ex:lex_play_noun . + + ex:lex_play_noun ontolex:lexicalForm ex:form_play_noun_sg . + + ex:lex_play_noun rdfs:member|morph:consistsOf ex:lex_play_verb, [a + morph:ZeroMorph ]. + + or generative/intensional modelling: + + ex:play_v_rel_play_n a morph:WordFormationRelation ; + + vartrans:source ex:lex_play_verb ; + + vartrans:target ex:lex_play_noun . + + ex:lex_play_noun ontolex:lexicalForm ex:form_play_noun_sg . + +ex:play_v_rel_play_n morph:wordFormationRule/morph:involves [a morph:ZeroMorph + ]. + + • + + N6: Morphs linked to Lexical Entries + + Description: Many dictionaries contain information about the morphology of + a headword. This is typically given relative to the lemma. A possibility + should be provided that enables an explicit statement of word-forms or + morphemic elements that are given as part of the lexical entry. 
+ + Language example: + + German (from "Langenscheidt Taschenwörterbuch Deutsch als Fremdsprache"): + + • + Bedingung die; -, -en + + • + Bedürfnis das; -ses, -se + + • + Beitrag der; -(e)s, Beiträge + + Note that this does not cover all forms of the German noun, e.g., "Bedürfnissen", + "Beiträgen" + + It should be possible to model this information with two conditions: + + 1. + It is not necessary to materialize all forms of the word, instead only + the relevant stems and minimal set of inflected forms or inflectional + morphemes + + 2. + It is possible to generate any form in a programmatic manner + + JMC: question is if we can underspecify the morphological pattern + + Required vocabulary: 1. reuse vocabulary for automatic generation of + word-forms and 2. create new property with ontolex:LexicalEntry in its + domain to explicitly state which word-forms and/or morphs or grammatical + information are considered custom extensions of a lemma. + + Tested on data: + Status: unclear if this representation need should be kept + +Look up TEI representation: and https://www.tei-c.org/release/doc/tei-p5-doc/en/html/DI.html + + Telco 09.06.2021: + +Proposal: Object property morph:morphologicalForm could be created (in parallel +to ontolex:lexicalForm) with domain ontolex:LexicalEntry and range morph:Morph + +→ different positions on whether this should be representable in the module at +all because all information/data is already covered with the vocabulary and it + is a need of space-restricted print dictionaries - discuss again later! + + Status 07.09.2022: This can be done via morph:morphologicalPattern and + morph:paradigm. However, THERE IS NO DIRECT LINK between morph:InflectionRule +and morph:Morph, so this would be represented as string replacements, only, not + as morphs. 
+ + • + + N7: Multiple segmentation strategies + + Description: Way to allow more than 1 segmentation of a single ontolex:Form + + Language example: + + The segmentation of lexical entries or wordforms varies with different + granularity: + + German verb jagte "hunted" + + Complete segmentation: root-stem-suffix + + [[[jag]-t]-e] - [[[root]tense suffix]number suffix]wordform + + Contracted segmentation: stem-suffix + + [[jagt]-e] - [[past tense stem]number suffix]wordform + + Required vocabulary: + + Tested on data: + Status: to be discussed + +Christian: Does occur in Splett's Old High German dictionary (https://brill.com/view/journals/abag/42/1/article-p264_28.xml): Here, full morphological parses + (tree structures) are being used. The other (main) use case is in language + documentation (with Toolbox, from which dictionaries are being created): +Linguistic glossing can operate on a superficial level or on a deep level, cf. + German fressen ("to eat, of an animal") which superficially involves two + morphemes (fress- + -en), but on a deep level involves three (*ver- + ess- + + -en, *ver- contributing the derogatory [non-human] meaning as in verwerfen +"reject", lit. "cast away"). Normally, while one dictionary may choose one level + of depth, another dictionary may choose another. Admitting more than one level + of depth allows merging information from different sources in a coherent + representation. Wrt. morphological pattern: Isn't the idea that the +morphological pattern describes a context for one given morph(eme)? So if we have + more than one (-t- and -e-) here, how will we formalize their combination? + + Petra Steiner (7.9.2022): need for modelling derivation trees ((A B) C) + confirmed. + + Current recommendation: model with decomp, no designated vocabulary needed +HOWEVER: not clear whether this supports multiple concurrent segmentations in a + single data structure. + + • + + N11: Meanings of stems and roots + + Description: Link morphs and senses. 
For roots or stems with lexical senses + or lexical concepts, e.g., for semantic fields of roots , e.g., + reconstructed protoforms (resp., their meaning) [why is Morph not a Lexical + Entry?] + + Language example: + + The meaning of stems and roots differ in the former are language-specific and + the latter language-independent concepts. Stems have a word-class affiliation + and often also entail grammatical information like tense and number (inherent +inflectional meanings). As they function as the underlying semantic core of the + lexical entry they occur in, the meanings of stems could be treated as the +meanings of lexical entries. Roots, however, comprise very unspecific meanings + from which words of various wordforms can be built. + +Hebrew root k-t-b conveys the concept "anything related to writing". From this + root nouns and verbs can be build, e.g. to write, journalist, author. + + Required vocabulary: + + ` ‘sense’ property + + domain: ontolex:LexicalEntry, morph:StemMorph and morph:RootMorph ` + + range: 'sense' concept class + + Tested on data: + + Initial proposal: modeled as draft: + + Bettina: The description of meanings of stems and roots could be realized in + the same way as the description of meanings of lexical entries as given in + ontolex. For the representation of roots maybe external resources such as + Concepticon could be recommended or the possibility of a plain textual + definition could be established in addition. + + Discussed proposal: Extend domain of ontolex:sense with ontolex:LexicalEntry + and morph:StemMorph and morph:RootMorph. + +JMC: not in favour of extending ontolex:sense domain with morph:Morph, proposes + new property morph:sense with ontolex:LexicalSense and another Concept class. + + JBG: With the use of ontolex:LexicalSense we are assuming an ontological + reference, so we might run into the same problems as the ones we found when + converting dictionaries (which ontological references to point to?). 
Since in +the lexicog specification we opted to stick to ontolex:LexicalConcepts for the +meaning of lexical entries in the conversion of dict entries to LLD, why would + we want to point to LexicalSense in this case, instead of Concept? + + Current draft: property morph:sense with morph:Morph in domain and + ontolex:LexicalSense in range + + object property: morph:sense + + domain: morph:Morph + + range: ontolex:LexicalSense + + Status: solved: use OntoLex core vocabulary, as morph:Morph is now a + LexicalEntry + + • + + N12: Derivational Meanings + + Description: Issue derived from "Linguistic Fundamentals for Natural + Language Processing" by Emily Bender, Source: + + Language example: + + Diminutives create a new noun with a meaning of being smaller; this could be + modelled by means of adding a "small" class to the meaning of a noun. Three + types of derivational meanings should be considered: Conversion: word-class + change with no affixal marking and no additional meaning, e.g. play (v) → play + (n) + +Derivation 1: word-class change with affixal marking and no additional meaning, + e.g. play (v) → playing (n) + + Derivation 2: with or without word-class change with affixal marking and + additional meaning, e.g. book (n) → booklet (n), play (v) → player (n) + + Required vocabulary: class for representing derivational meanings, e.g. + morph:DerivationalConcept + + Tested on data: + Status: modeled as draft: + +Diminutives are not an ideal example because they are sometimes considered to + be inflectional rather than semantic features (a form of degree, such as + comparative). A better example might be the English morpheme "-er" which + attaches to a verb to form a noun that represents the agent. The classic + representation is by means of a rule: V + "-er" => N_ag (CC) + +John: Model derivational meanings as concepts and link morph instances to this + concept. + + Fahad: Ignore examples with lexicalized words (e.g. computer). 
We do not need + to model too deeply - just state “diminutive”. + +John: Proposes to have DerivationalConcept as subclass of ontolex:Concept (but + no need for InflectionalConcept subclass). + + Current draft: property morph:evokes with morph:Morph in domain and + ontolex:LexicalConcept in range + + object property: morph:evokes + + domain: morph:Morph + + range: ontolex:LexicalConcept + + morph:DerivationalConcept rdfs:subClassOf ontolex:LexicalConcept . + + Current status: NOT MODELLED: instead of morph:evokes, we can use + ontolex:evokes. TBC: what is the added value of morph:DerivationalConcept? + + • + + N13: “missing” part of the stem becomes a separate token + + Description: I think there is a need to allow for morphology to break up a + stem. I see John has raised a similar issue in N9, but what I am suggesting + is that some tokens represent reduced forms of the stem/headword, but that + the “missing” part of the stem becomes a separate token. + + Language example: + + E.g. Old Irish verbs like do-beir: + + 1. Prototonic form is tabair (a verb), with the ta- mapping to the do- of the + stem. 2. Deuterotonic form is do + beir (a particle + a verb). + +In this case, while the headword, do-beir, contains do-, the morphological form + does not, and do- exists as a separate particle token. Pronouns can come + between the particle and the verb and this is not considered tmesis. + + Required vocabulary: class for representing free and/or grammatical morphs + and an object property that allows statements to express that a free/grammatical + morph is part of an ontolex:Form or a complex morph:Morph + resource + + Tested on data: + Status: consensus on modelling: + + object property: morph:consistsOf + + domain: morph:Morph + + range: morph:Morph + + ontolex:Form morph:consistsOf morph:Morph . 
+ + • + + N15: Lexeme generation takes LexicalEntry and Form as input + + Description: The generation of ontolex:LexicalEntry resources should be able + to take resources of the type ontolex:LexicalEntry as well as ontolex:Form + as input sources. This is required for languages which form new lexemes + with inflected word-forms. One example is compounding in German, where the + modifier takes on inflected forms (e.g., Gäste+haus "guest house", lit. + "guests' house" [plural]). + + Language example: + Required vocabulary: morph:consistsOf range: ontolex:Form + Tested on data: + initial proposal: modeled as draft: + + The object properties vartrans:source and vartrans:target are reused and the + range of morph:consistsOf will not be extended to ontolex:Form. Any word-forms +involved in the source or target of a generated ontolex:LexicalEntry have to be + expressed by using morph:WordFormationRule. + + vartrans:source + + vartrans:target + + morph:WordFormationRule + + current status: to be dropped? no real data. extension of vartrans:source is + possible but beyond scope (in vartrans). We'd need to suggest a + vartrans:LexicalRelation between forms. + + In German linguistics, an alternative view on compounding with inflected +modifiers has been advocated, i.e., that the (diachronic) inflection now serves +as interfix. This is supported by the fact that these "inflections" lost their + grammatical meaning, so there is German Gästehaus (guest house) along with +Gasthaus (restaurant), but the difference in meaning has nothing to do with the + singular or plural morpheme that acts as interfix. + +Appendix: Organizational + +No discussion, just as a reminder, add other organizational things here + +Procedure for Feature freeze + +Not yet, but the following procedure was discussed before: + + • + MI: I think we achieved a level where we freeze everything and write it + up. 
I think that we can still have limitations unsolved for the final + module, we just need to explicitly decide + + • + Should we vote in mailing list since not everyone is in the call + + • + CC and MI should decide when we’re ready to run the vote, then send to + OntoLex chairs to approve + diff --git a/minutes_txt/09_04_2019.docx.txt b/minutes_txt/09_04_2019.docx.txt index 782f4fb..9817530 100644 --- a/minutes_txt/09_04_2019.docx.txt +++ b/minutes_txt/09_04_2019.docx.txt @@ -28,13 +28,13 @@ E2a: German compound Lungenentzündung ('pneumonia' literally 'lung inflammation'): full segmentation including 4 subterms and 7 Component constituents: -[image4] +[image3] E2b: German compound Lungenentzündung ('pneumonia' literally 'lung inflammation'): binary segmentation involving 8 Component constituents (subterm relations could be added) -[image3] +[image4] Components apply to the character string in the Lexical Entry diff --git a/minutes_txt/12_07_2023.docx.txt b/minutes_txt/12_07_2023.docx.txt new file mode 100644 index 0000000..0dc8fc5 --- /dev/null +++ b/minutes_txt/12_07_2023.docx.txt @@ -0,0 +1,364 @@ +Link: https://meet.google.com/nsj-tbcy-yop [CHECK HERE FOR UPDATED LINK(S)] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) (excused for being 10 min late) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.17) + +1. Datathon results + +2. Clitics + +3. Character/sound classes + + Maltese example + + Defining classes + + Other examples + +4. 
W3C day presentation + +Next time + +0. Module draft (4.17) + +[image1] + +1. Datathon results + +Two projects that successfully applied Morph for modelling derivation: + + 1. + PIE etymological root database (+ Old Irish, but no Morph was needed + there) + + 2. + Aspectual database for Serbian, Croatian and Bosnian + +Seemed to work really well, no issues there + +TODO@MI, CC, KG: Add links to project results + +2. Clitics + +See minutes from the last call, seemed to be a consensus on modelling them as +wordforms with a space. + +Still some todos: + +TODO@MI: add a pull request with an example of this to the documentation + +TODO: check what people think about cliticization + +3. Character/sound classes + +Brief reminder: MI and MR argue for adding an element for representing a group +of characters (or sounds) to use in replacement rules. This can be helpful for +many languages in a lot of situations, making rules reusable and +understandable. + +Maltese example + +kiteb → ktibt (PERF.1SG) + +With character/sound classes: + + a morph:InflectionRule ; + +morph:paradigm ; + +morph:involves ; + +morph:replacement [ + +a morph:Replacement ; + +morph:source "(C)(V)(C)(V)(C)" ; + +morph:target "\1\3i\5t" ; + + morph:replacementClass [ rdfs:label "V" ; + + rdfs:value "[aeiou]", "e", "i", "o", "u" . ], + + [] + +] . + +Modification Christian: + + a morph:InflectionRule ; + + morph:paradigm ; + + morph:involves ; + + morph:replacement [ + + a morph:Replacement ; + + morph:source "(C)(V)(C)(V)(C)" ; + + morph:target "\1\3i\5t" ; + + morph:replacementTable :r1 ] . + + :r1 a morph:ReplacementTable ; + + morph:replacements “{ ‘V’ : ‘[aeiou]’, ‘C’: ‘[bcdghjklmnpqrstvwxz]’ }“ . 
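Editorial sketch (not part of the minutes): how a consumer of such a replacement table could expand the class labels before applying a rule to the Maltese example. The names CLASSES, expand_classes and apply_rule are hypothetical, the table simply mirrors the one sketched above, and Python's re module is an assumption, since the minutes do not prescribe any implementation.

```python
import re

# Hypothetical class table mirroring the ReplacementTable sketched in the minutes.
# The consonant set is copied from that sketch; it is not a full Maltese inventory.
CLASSES = {
    "V": "[aeiou]",
    "C": "[bcdghjklmnpqrstvwxz]",
}


def expand_classes(pattern, classes):
    """Replace every class label in `pattern` by its character set, in a single pass."""
    # Longest labels first, so that a label such as FRONT_V would win over V.
    labels = sorted(classes, key=len, reverse=True)
    label_re = re.compile("|".join(re.escape(label) for label in labels))
    return label_re.sub(lambda m: classes[m.group(0)], pattern)


def apply_rule(form, source, target, classes):
    """Expand the class labels in `source`, then apply it as a regex replacement."""
    return re.sub(expand_classes(source, classes), target, form)


# Maltese example from the minutes: kiteb -> ktibt (PERF.1SG)
result = apply_rule("kiteb", "(C)(V)(C)(V)(C)", r"\1\3i\5t", CLASSES)
print(result)
```

Two limits of this sketch: any literal uppercase C or V in a source pattern would also be expanded, and multi-character units such as għ or ie cannot be expressed as single-character sets, so a real companion vocabulary would need its own escaping and alternation conventions.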
+ +Without character/sound classes: + + a morph:InflectionRule ; + +morph:paradigm ; + +morph:involves ; + +morph:replacement [ + +a morph:Replacement ; + +morph:source "(ċ|d|n|r|s|t|x|ż|z|b|f|ġ|g|għ|h|ħ|j|k|l|m|p|q|v|w)(a|e|i|o|u|ie) +(ċ|d|n|r|s|t|x|ż|z|b|f|ġ|g|għ|h|ħ|j|k|l|m|p|q|v|w)(a|e|i|o|u|ie)(ċ|d|n|r|s|t|x| +ż|z|b|f|ġ|g|għ|h|ħ|j|k|l|m|p|q|v|w)" ; + +morph:target "\1\3i\5t" ; + +] . + +Technically similar albeit bulky, visually much less clear — difficult to +interpret or to see differences between different rules. + +Defining classes + +Model change: adding one class + + a ; # this is not a part of the model, but a part of +the dataset + +rdfs:label "V" ; + +rdfs:member "a", "e", "i", "o", "u" . + +For form generation: adding a replace for each class. + +If decided not to use this in the model this will probably be used ad-hoc → +rules in the datasets will be not interoperable. + +Other examples + + • + Vowel harmony rules in Turkic and Finno-Ugric languages (e.g. Turkish, + Finnish) + + □ + Vowels in the replacement depend on vowels in the root + + □ + An affix can be -lla or -llä for a Finnish case + + morph:replacement [ + +a morph:Replacement ; + +morph:source "(.*FRONT_V.*)$" ; + +morph:target "\1llä" ; + +], + + a morph:Replacement ; + + morph:source “(.*BACK_V.*)$” ; + + morph:target “\1lla” ; + + ] . # Finnish Adessive case + + • + Rules like “If the stem of the noun ends in a vowel, the buffer + consonant y is added” + + morph:replacement [ + +a morph:Replacement ; + +morph:source "(V)$" ; + +morph:target "\1ya" ; + +], + + a morph:Replacement ; + + morph:source “(C)$” ; + + morph:target “\1a” ; + + ] . 
# Turkish Accusative and Dative + + • + Umlaut mutations in German (provided by CC, motivated by historical + process description) + + rule:umlaut a morph:WordFormationRule; + + morph:replacement + + [ a morph:Replacement; + + morph:source "a([^aeiouöü ]*)$"; + + morph:target "ä\1" ]; + + [ a morph:Replacement; + + morph:source "o([^aeiouöü ]*)$"; + + morph:target "ö\1" ]; + + [ a morph:Replacement; + + morph:source "u([^aeiouöü ]*)$"; + + morph:target "ü\1" ] . + +vs. + + rule:umlaut a morph:WordFormationRule; + + morph:replacement + + [ a morph:Replacement; + + morph:source "a(C*)$"; + + morph:target "ä\1" ]; + + [ a morph:Replacement; + + morph:source "o(C*)$"; + + morph:target "ö\1" ]; + + [ a morph:Replacement; + + morph:source "u(C*)$"; + + morph:target "ü\1" ] . + +Note: in all these cases we will apply two alternative regexes in the same rule + +Pros and cons of extending the model: + +Pro: rules are more readable but at the same time interoperable + +Pro: easier conversion from resources providing rules + +Con: these properties are not connected to any other elements in the model + +Con: there is no morphological meaning behind this class (but it won’t be the +first time) + +4. W3C day presentation + + • + https://www.w3.org/community/ontolex/wiki/W3c_community_day_@_LDK2023 + + • + Last time we presented the status of the model back then. Should we + present the way it is again, focusing on the latest additions and + changes? + + • + ~40 minutes, but how long should be the discussion? 
Last time there was + not that many questions + + • + Usually people who care are in our calls + + • + But: this time we can also mention that the model works for Semitic + languages + +FK: Be clear about the state of the model: are we happy, if there still +something we are unhappy with + + • + Also, choose if we are going to say if we still have some time to + implement new things or just say if it is what it is + +CC: We can’t come up with a list of requirements right now because we still +have open questions + + • + TODO@MI: go through all the requirements and give an overview of what + is left + + • + TODO@CC, MI: discuss what is left + +Chat + +Next time + + • + Probably the last call before September. + + • + Going through the requirements? + + • + Looking at one last topic / dataset before going on vacation? Anything + urgent? + + • + Paradigm discussion? + + • + Finite state terminology? + + • + Open questions from LDL? + diff --git a/minutes_txt/13_07_2022.docx.txt b/minutes_txt/13_07_2022.docx.txt index 1ecedb0..9bae987 100644 --- a/minutes_txt/13_07_2022.docx.txt +++ b/minutes_txt/13_07_2022.docx.txt @@ -458,23 +458,23 @@ equivalent to • Comparing alternatives: - □ current model +current model - 1. - Form -inflectionType-> InflectionType + 1. + Form -inflectionType-> InflectionType - 2. - Paradigm <-paradigm- InflectionType + 2. + Paradigm <-paradigm- InflectionType - 3. - InflectionType -inflectionRule-> InflectionRule + 3. + InflectionType -inflectionRule-> InflectionRule - 4. - InflectionType -next-> InflectionType + 4. + InflectionType -next-> InflectionType - □ - alternative 0: keep current model, one inflection type per paradigm - and rule + • + alternative 0: keep current model, one inflection type per paradigm and + rule pro: backward-compatible @@ -483,19 +483,19 @@ equivalent to con: still contradicts current definition - • alternative 1: detach InflectionType +alternative 1: detach InflectionType - 1. 
- Form -inflectionRule-> InflectionRule + 1. + Form -inflectionRule-> InflectionRule - 2. - Paradigm <-paradigm- InflectionRule + 2. + Paradigm <-paradigm- InflectionRule - 3. - InflectionRule -inflectionType-> InflectionType + 3. + InflectionRule -inflectionType-> InflectionType - 4. - InflectionType -next-> InflectionType + 4. + InflectionType -next-> InflectionType pro: we basically keep all the information we have, incl. finite state modelling and agglutination @@ -510,19 +510,19 @@ equivalent to traditional usage of “paradigm”. in inflection tables, it normally includes allomorphic variants. - • alternative 2: replace InflectionType by GrammaticalMeaning +alternative 2: replace InflectionType by GrammaticalMeaning - a. - Form -inflectionRule-> InflectionRule + a. + Form -inflectionRule-> InflectionRule - b. - Paradigm <-paradigm- InflectionRule + b. + Paradigm <-paradigm- InflectionRule - c. - InflectionRule -grammaticalMeaning-> GrammaticalMeaning + c. + InflectionRule -grammaticalMeaning-> GrammaticalMeaning - d. - GrammaticalMeaning -next-> GrammaticalMeaning + d. + GrammaticalMeaning -next-> GrammaticalMeaning pro: we basically keep all the information we have, incl. finite state modelling and agglutination @@ -538,19 +538,19 @@ equivalent to con: for FST, this is very opaque, a better name? => we could introduce a designated subclass “FiniteState” of GrammaticalMeaning !? - • alternative 3: merge InflectionType with InflectionRule +alternative 3: merge InflectionType with InflectionRule - a. - Form -inflectionRule-> InflectionRule + a. + Form -inflectionRule-> InflectionRule - b. - Paradigm <-paradigm- InflectionRule + b. + Paradigm <-paradigm- InflectionRule - c. - InflectionRule -grammaticalMeaning-> GrammaticalMeaning + c. + InflectionRule -grammaticalMeaning-> GrammaticalMeaning - d. - InflectionRule -next-> InflectionRule + d. + InflectionRule -next-> InflectionRule pro: we keep all the information we have, incl. 
finite state modelling and agglutination diff --git a/minutes_txt/15_11_2023.docx.txt b/minutes_txt/15_11_2023.docx.txt new file mode 100644 index 0000000..9d97c16 --- /dev/null +++ b/minutes_txt/15_11_2023.docx.txt @@ -0,0 +1,410 @@ +Link: https://meet.google.com/nsj-tbcy-yop [CHECK HERE FOR UPDATED LINK(S)] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.18) + +1. CC: Inuktitut data and problems with LexicalEntries + +2. Grammatical meaning connection for InflectionType + +3. Next call + +0. Module draft (4.18) + +[image1] + + • + Removed ??? from morph:grammaticalMeaning. We can support this property + for morph:InflectionType (we might discuss this today if we have time) + + • + Made the round arrow pretty :) + +1. CC: Inuktitut data and problems with LexicalEntries + +https://github.com/acoli-repo/morph-addenda/tree/master/data/polysynthetic + +The problem: ambiguity in word formation, see https://github.com/acoli-repo/ +morph-addenda/tree/master/data/polysynthetic#5-encode-ambiguity-in-derivation + +Discussion: + + • + Example atausiulugu + + □ + This is an inflected word, but the inflection part is + unproblematic. The word formation part is challenging, and that + part would also be relevant for a dictionary. + + • + Can a morph:Morph have more than one form? + + □ + CC: the Inuktitut morphological segmentation provides the canonical form of an affix + along with its current string realization. How to model that?
+ (proposal: multiple forms of the same morph) + + □ + If yes, this could be used to represent (some cases of) allomorphy + + □ + MI: original consensus to not require morph:Morphs to be morphemes, + morpheme handling within Mmoon + + • + Allomorphy + + □ + MI: one morph:Morph per form, relate allomorphs indirectly by + repurposing grammatical meaning (i.e., one Morpheme=one grammatical + meaning) + + □ + CC: redundancy and information loss, incompatible with otherForms + for other lexical entries. + + □ + MI: if morphophonological, then just provide rules instead of + segments + + □ + CC: we cannot require everyone who wants to give a morphological + segmentation to write a morphological analyzer + + • + CC: Instead of forms being a sequence of morphs, model them as a + sequence of forms (example below), otherwise we lose information about + the realization of a morph(eme) at a particular position + + □ + If allomorphs are modelled as alternative forms of the same + underlying morph(eme), this allows us to provide information about + the specific form variant of a morph used as a particular position + in a particular word without having to write full-fledged + assimilation rules + + □ + This means to allows one ontolex:Form to be a rdfs:Seq of + ontolex:Forms (rather morph:Morphs). No changes to morph:contains + + □ + MI: leads to problems with downward compatibility + + □ + CC: if we don’t arrive at a consensus here, we could skip the + rdfs:Seq part of forms, because this is nothing defined in morph + namespace anyway + + • + MI: Next meeting supposed to be the last regular morph meeting + + □ + FK: really? MI: There may be irregular meetings + +POST-CALL ADDENDA: Excerpt from https://github.com/acoli-repo/morph-addenda/ +tree/master/data/polysynthetic (apologies for not being able to find them on +the spot) + +Original analysis: + +{atausi:atausiq/1n}{u:u/1nv}{lugu:lugu/tv-part-1s-3s-fut} {one}{existence; is} +{part. 
future: while I \...him/her/it} + +As IGT (with morpheme boundaries added, these are indirectly expressed from the +fourth row with grammaticalMeaning/baseConstraints) + + atausiulugu +atausi- -u- -lugu +atausiq- -u- -lugu + 1n 1nv tv-part-1s-3s-fut + one existence; Is part. future: while I \...him/her/it + +Original dictionary: + +{atausi:atausiq/1n} {one} + +{u:u/1nv} {existence; is} + +{lugu:lugu/tv-part-1s-3s-fut} {part. future: while I \...him/her/it} + +In OntoLex-Morph: + +# CURRENT Seq with morphs + +# FAIL 1: we loose information about the actual forms + +:atausiulugu_le a ontolex:Word; + +ontolex:canonicalForm :atausiulugu_atausiulugu_f. + +:atausiulugu_atausiulugu_f a ontolex:Form; + +ontolex:writtenRep "atausiulugu"@iu-Latn. + +:atausiulugu_atausiulugu_f a rdfs:Seq; + +rdf:_1 :atausiq_1_le; + +rdf:_2 :u_1_le; + +rdf:_3 :lugu_tv_part_le. + +# REVISION Seq with forms + +# FAIL 1: SOLVED + +formation rules, see data) + +:atausiulugu_le a ontolex:Word; + +ontolex:canonicalForm :atausiulugu_atausiulugu_f. + +:atausiulugu_atausiulugu_f a ontolex:Form; + +ontolex:writtenRep "atausiulugu"@iu-Latn. + +:atausiulugu_atausiulugu_f a rdfs:Seq; + +rdf:_1 :atausiq_atausi_f; + +rdf:_2 :u_u_f; + +rdf:_3 :lugu_lugu_f. + +Affix/base dictionary: + +:atausiq_1_le a lexinfo:RootMorph, ontolex:LexicalEntry; + +ontolex:canonicalForm :atausiq_atausiq_f; + +ontolex:otherForm :atausiq_atausi_f; + +ontolex:sense :atausiq_1n; + +lexinfo:partOfSpeech lexinfo:noun. # SAME: "type: nominal root" + +:atausiq_atausiq_f a ontolex:Form; + +ontolex:writtenRep "atausiq"@iu-Latn, "ᐊᑕᐅᓯᖅ"@iu-Cans. + +:atausiq_atausi_f a ontolex:Form; + +ontolex:writtenRep "atausi"@iu-Latn, "ᐊᑕᐅᓯ"@iu-Cans. + +:atausiq_1n a ontolex:LexicalSense; + +skos:definition "one"@en; # SAME + +ontolex:concept :number_quantity. 
# NEW + +# this shows allomorphy for a root morph, it’s the same for affixes, though + +# {u:u/1nv} {existence; is} + +:u_1_le a ontolex:Affix; + +ontolex:canonicalForm :u_u_f; + +# we have more than one u-form, so the ids have to be more specific, + +# as these forms differ in their phonological context + +ontolex:sense :u_1nv; + +morph:grammaticalMeaning :verb; # ../1n*v* + +morph:baseConstraint :noun. # ../1*n*v + +:u_1nv a ontolex:LexicalSense; + +skos:definition "existence; is"@en. + +:u_u_f a ontolex:Form; + +ontolex:writtenRep "u". + +# -lugu is somewhat ambiguous: + +# {lugu:lugu/tv-part-1d-3s-fut} {part. future: while we (two) \...him/her/it} + +# {lugu:lugu/tv-part-1p-3s-fut} {part. future: while we (many) \...him/her/it} + +# {lugu:lugu/tv-part-1s-3s-fut} {part. future: while I \...him/her/it} + +# {lugu:lugu/tv-part-2d-3s-fut} {part. future: while you (two) \...him/her/it} + +# {lugu:lugu/tv-part-2p-3s-fut} {part. future: while you (many) \...him/her/it} + +# {lugu:lugu/tv-part-2s-3s-fut} {part. future: while you \...him/her/it} + +# {lugu:lugu/tv-part-4d-3s-fut} {part. future: while they (two) \...him/her/it} + +# {lugu:lugu/tv-part-4p-3s-fut} {part. future: while they (many) \...him/her/ +it} + +# {lugu:lugu/tv-part-4s-3s-fut} {part. future: while he/she/it \...him/her/it} + +:lugu_tv_le a ontolex:Affix; + +ontolex:canonicalForm :lugu_lugu_f; + +ontolex:sense :lugu_tv_part_fut; + +ontolex:baseConstraint :verb; # we're doing verbal inflection here, so we can +attach to a verbal base, only + +ontolex:grammaticalMeaning + +:tv_1d_3s, :tv_1p_3s, :tv_1s_3s, # these are alternative meanings + +:tv_2d_3s, :tv_2p_3s, :tv_2s_3s, + +:tv_4d_3s, :tv_4p_3s, :tv_4s_3s. + +# Note: In this way, we cannot disambiguate forms for their different +grammatical meanings + +# If that would be intended, we would need to create one lexical entry per +feature combination. + +lexinfo:mood :verbal_participle; + +lexinfo:tense :future. 
+ +# Note: Inuktitut does not inflect for grammatical tense, but only for mood. +Some moods have future readings, though. + +:lugu_tv_part_fut a ontolex:LexicalSense; + +skos:definition "part. future: while s.o. does something to s.t. (object)"@en. + +:lugu_lugu_f a ontolex:Form; + +ontolex:writtenRep "lugu"@iu-Latn. + +Note on incorporating verbs like :u_1_le + + • + Morphologically, these behave like affixes, but semantically, they are + lexical verbs. These really need to be lexical entries. + +Note on allomorphy: + + • + Except for the stem atausi-, all morphemes in this example happen to + take the canonical form, but they have other allomorphic variants. By + allowing ontolex:otherForm for these variants, adding a new variant + requires two triples (ontolex:otherForm and ontolex:writtenRep; the + others can be inferred). Creating a separate lexical entry for atausi + means that it and its sense have to be completely duplicated. Also, + there is no direct link between the allomorphic variants. Same for -ut- + and its variant -u-, for -uq- and its variant -u-, for -liq- and its + variants -siq-, -si- and -li-, etc. (depending on the following morpheme, + final consonants can be assimilated; this is described with the + following morpheme, see https://github.com/acoli-repo/morph-addenda/ + blob/master/data/polysynthetic/atausiulugu.morphs.ttl). + + • + As for how that would be presented in a dictionary, see https:// + www.inuktitutcomputing.ca/DataBase/index.php?lang=en&c= + DefinitionDeSuffixe&m=liq%2F2nv or https://uqausiit.ca/sites/default/ + files/2020-04/Affix-Dictionary-V21.pdf. + +Post-call addendum + + • + As requested by Max, a comparison of modelling forms as an + rdfs:Seq of forms vs. of morphs is available under https://github.com/acoli-repo/ + morph-addenda/tree/master/data/polysynthetic + + □ + directory one-morph-with-multiple-forms/ + + ☆ + incl.
linking of contextual variant with canonical form (via + otherForm) + + ☆ + 260 triples (all data) + + ☆ + Morph dictionary (atausiulugu.morphs.ttl): 202 triples + + □ + directory every-form-one-morph/ + + ☆ + Information loss: no linking between different form variants + yet + + ☆ + 312 triples (all data, +20%) + + ☆ + Morph dictionary (atausiulugu.morphs.ttl): 254 triples (+25%) + + □ + +2. Grammatical meaning connection for InflectionType + +What is InflectionType right now? https://github.com/ontolex/morph/issues/11 +Option 2? + +3. Next call + +29.11.2023 + +Agenda: + + • + Confirm that all the datasets are compatible + + • + Go through all the GH issues + + • + A quick look at the companion vocabulary (for regex rules and sound + classes) + diff --git a/minutes_txt/17_05_2023.docx.txt b/minutes_txt/17_05_2023.docx.txt new file mode 100644 index 0000000..5a8beb2 --- /dev/null +++ b/minutes_txt/17_05_2023.docx.txt @@ -0,0 +1,343 @@ +Link: https://meet.google.com/nsj-tbcy-yop [CHECK HERE FOR UPDATED LINK(S)] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) (excused for being 10 min late) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Thierry Declerck (DFKI) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.17) + +1. Old Irish + +2. Clitics + +3. Character/sound classes + + Maltese example + + Defining classes + + Other examples + +Next time + +0. Module draft (4.17) + +[image1] + +1. 
Old Irish + + • + [short data presentation], mainly based on spreadsheet + + • + Data currently on hand is expressible in Ontolex-core + Ontolex-morph + + □ + Qs TF: + + ☆ + Morph vs word, compare UD + + ○ + Interlinking/interoperability issues? + + ☆ + Can/should we segment a stem into morphs? + + ○ + MI: Should we talk about stems as Morphs? Does it mean that + any morph is subdividable? Technically, it is allowed since + they are all subclasses of a LexicalEntry which can + consists of Morphs + + • + Generation rules (when they are there) are too complex to handle in + Ontolex-morph + + • + Can be expressed with FSTs + + □ + FST functionality within next modules? + + □ + Should FST be a part of Ontolex or a completely separate + vocabulary? What are the use-case for having it as a part of + Ontolex? + + □ + If as an OntoLex module: + + ☆ + Together with diachrony? + + ☆ + Together with sound classes? + +2. Clitics + +Spanish reflexive clitic se: + + • + Llamar — to call, llamarse — be named + + □ + No problems here, a separate lexical entry + + • + When conjugated, clitic goes to the left and separated + orthographically: ¿Cómo te llamas? — what is your name? + + □ + Still, it is one phonetic word (e.g. single stress) + +:llamarse a ontolex:LexicalEntry ; + +ontolex:canonicalForm :llamarse_form_inf ; + +ontolex:otherForm :llamarse_form_2sg . + +:llamarse_form_inf a ontolex:Form ; + +ontolex:writtenRep “llamarse” ; + +lexinfo:... . + +:llamarse_form_2sg a ontolex:Form ; + +ontolex:writtenRep “te llamas” ; + +lexinfo:number lexinfo:singular ; + +lexinfo:person lexinfo:second ; + +… . + +No problems for generation as well (rules) as well: + + a morph:InflectionRule ; + +morph:paradigm ; + +morph:involves , ; + +morph:replacement [ + +a morph:Replacement ; + +morph:source "^(.*)arse$" ; + +morph:target "te \1as" ; + +] . + +Inflection: + +:llamar a ontolex:LexicalEntry ; + +ontolex:canonicalForm :llamar_form_inf ; + +ontolex:otherForm :llamarse_form_2sg . 
+ +:llamar_form_inf a ontolex:Form ; + +ontolex:writtenRep "llamar" ; + +lexinfo:... . + +:llamar_form_2sg a ontolex:Form ; + +... + +:llamarse_form_2sg a ontolex:Form ; + +ontolex:writtenRep "te llamas" ; + +lexinfo:number lexinfo:singular ; + +lexinfo:person lexinfo:second ; + +lexinfo:reflexivity?? lexinfo:reflexive?? + +… . + +No problems for generation (rules) either: + + a morph:InflectionRule ; + +morph:paradigm ; + +morph:involves , ; + +morph:replacement [ + +a morph:Replacement ; + +morph:source "^(.*)arse$" ; + +morph:target "te \1as" ; + +] . + +TODO: add a pull request with an example of this to the documentation + +TODO: check what people think about cliticization + +3. Character/sound classes + +Maltese example + +kiteb → ktibt (PERF.1SG) + +With character/sound classes: + + a morph:InflectionRule ; + +morph:paradigm ; + +morph:involves ; + +morph:replacement [ + +a morph:Replacement ; + +morph:source "(C)(V)(C)(V)(C)" ; + +morph:target "\1\3i\5t" ; + +] . + +Without character/sound classes: + + a morph:InflectionRule ; + +morph:paradigm ; + +morph:involves ; + +morph:replacement [ + +a morph:Replacement ; + +morph:source "(ċ|d|n|r|s|t|x|ż|z|b|f|ġ|g|għ|h|ħ|j|k|l|m|p|q|v|w)(a|e|i|o|u|ie) +(ċ|d|n|r|s|t|x|ż|z|b|f|ġ|g|għ|h|ħ|j|k|l|m|p|q|v|w)(a|e|i|o|u|ie)(ċ|d|n|r|s|t|x| +ż|z|b|f|ġ|g|għ|h|ħ|j|k|l|m|p|q|v|w)" ; + +morph:target "\1\3i\5t" ; + +] . + +Technically equivalent but bulky and visually much less clear: difficult to +interpret or to see differences between different rules. + +Defining classes + +Model change: adding one class + + a ; + +rdfs:label "V" ; + +rdfs:member "a", "e", "i", "o", "u" . + +For form generation: adding a replace for each class. + +If we decide not to include this in the model, it will probably be used ad hoc, and +rules across datasets will not be interoperable. + +Other examples + + • + Vowel harmony rules in Turkic and Finno-Ugric languages (e.g.
Turkish, + Finnish) + + □ + Vowels in the replacement depend on vowels in the root + + □ + An affix can be -lla or -llä for a Finnish case + + morph:replacement [ + +a morph:Replacement ; + +morph:source "(.*FRONT_V.*)$" ; + +morph:target "\1llä" ; + +], + + a morph:Replacement ; + + morph:source “(.*BACK_V.*)$” ; + + morph:target “\1lla” ; + + ] . # Finnish Adessive case + + • + Rules like “If the stem of the noun ends in a vowel, the buffer + consonant y is added” + + morph:replacement [ + +a morph:Replacement ; + +morph:source "(V)$" ; + +morph:target "\1ya" ; + +], + + a morph:Replacement ; + + morph:source “(C)$” ; + + morph:target “\1a” ; + + ] . # Turkish Accusative and Dative + +Note: in both cases we will apply two alternative regexes in the same rule + +Next time + diff --git a/minutes_txt/18_05_2022.docx.txt b/minutes_txt/18_05_2022.docx.txt index c5d4d64..cc55de9 100644 --- a/minutes_txt/18_05_2022.docx.txt +++ b/minutes_txt/18_05_2022.docx.txt @@ -384,23 +384,23 @@ Model draft 4.15 updates (to be discussed) & open issues: □ TODO@Katerina+Penny: example fusional - ☆ current model +current model - 1. - Form -inflectionType-> InflectionType + 1. + Form -inflectionType-> InflectionType - 2. - Paradigm <-paradigm- InflectionType + 2. + Paradigm <-paradigm- InflectionType - 3. - InflectionType -inflectionRule-> InflectionRule + 3. + InflectionType -inflectionRule-> InflectionRule - 4. - InflectionType -next-> InflectionType + 4. + InflectionType -next-> InflectionType - ☆ - alternative 0: keep current model, one inflection type per - paradigm and rule + • + alternative 0: keep current model, one inflection type per paradigm and + rule pro: backward-compatible @@ -579,19 +579,19 @@ lexinfo:number lexinfo:singular ; lexinfo:case lexinfo:genitive . - • alternative 1: detach InflectionType +alternative 1: detach InflectionType - 1. - Form -inflectionRule-> InflectionRule + 1. + Form -inflectionRule-> InflectionRule - 2. - Paradigm <-paradigm- InflectionRule + 2. 
+ Paradigm <-paradigm- InflectionRule - 3. - InflectionRule -inflectionType-> InflectionType + 3. + InflectionRule -inflectionType-> InflectionType - 4. - InflectionType -next-> InflectionType + 4. + InflectionType -next-> InflectionType pro: we basically keep all the information we have, incl. finite state modelling and agglutination @@ -754,7 +754,7 @@ lexinfo:number lexinfo:singular ; lexinfo:case lexinfo:genitive . - • alternative 2: +alternative 2: replace InflectionType by GrammaticalMeaning @@ -789,19 +789,19 @@ Example for fusional language Notes: for fusional languages, this is the same as for alternative 1, since the inflection type is not used. - • alternative 3: merge InflectionType with InflectionRule +alternative 3: merge InflectionType with InflectionRule - a. - Form -inflectionRule-> InflectionRule + a. + Form -inflectionRule-> InflectionRule - b. - Paradigm <-paradigm- InflectionRule + b. + Paradigm <-paradigm- InflectionRule - c. - InflectionRule -grammaticalMeaning-> GrammaticalMeaning + c. + InflectionRule -grammaticalMeaning-> GrammaticalMeaning - d. - InflectionRule -next-> InflectionRule + d. + InflectionRule -next-> InflectionRule pro: we keep all the information we have, incl. 
finite state modelling and agglutination diff --git a/minutes_txt/18_10_2023.docx.txt b/minutes_txt/18_10_2023.docx.txt new file mode 100644 index 0000000..ff0e419 --- /dev/null +++ b/minutes_txt/18_10_2023.docx.txt @@ -0,0 +1,95 @@ +Link: https://meet.google.com/nsj-tbcy-yop [CHECK HERE FOR UPDATED LINK(S)] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) [late] + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.17) + +1. MP: Another standard for morphology representation + +2. Quick vote: Replacement → RegexReplacement + +3. CC: Inuktitut data and problems with LexicalEntries + +0. Module draft (4.17) + +[image1] + +1. MP presents Paralex: Another standard for morphology representation + +MP presents Paralex, a new standard for morphology, and its compatibility with Morph. + +https://www.paralex-standard.org/standard/ + +Based on Unimorph, with the hope of overcoming Unimorph's limitations + +Complementarity with Morph: + + • + This is much more specific than Morph + + • + The idea is to be able to automatically convert this to Morph + + • + Paralex makes it interoperable with UD-family resources + +2. Quick vote: Replacement → RegexReplacement + +Suggestion: leave morph:Replacement underdefined and introduce a new subclass +morph:RegexReplacement that takes over the properties morph:Replacement has now. + +Conclusion: removing morph:source and morph:target.
Moving RegexReplacement as +a subclass to a companion vocabulary + +NOTE: in the guidelines, we can recommend using rdf:value for string +representations of replacements, e.g., in the style of Perl or sed: s/SOURCE/TARGET/ + +3. CC: Inuktitut data and problems with LexicalEntries + +https://github.com/acoli-repo/morph-addenda/tree/master/data/polysynthetic + +The problem: ambiguity in derivation, see https://github.com/acoli-repo/ +morph-addenda/tree/master/data/polysynthetic#5-encode-ambiguity-in-derivation + +CC will explain next time (01.11.2023) + diff --git a/minutes_txt/19_04_2023.docx.txt b/minutes_txt/19_04_2023.docx.txt new file mode 100644 index 0000000..848b19e --- /dev/null +++ b/minutes_txt/19_04_2023.docx.txt @@ -0,0 +1,428 @@ +Link: https://meet.google.com/nsj-tbcy-yop + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) (20 min late) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Thierry Declerck (DFKI) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.17) + +1. Semitic Roots + +2. morph:baseConstraint + +0. Module draft (4.17) + +[image1] + +1. Semitic Roots + +MI: + + • + Suggestion: Roots are morphs (lexinfo:RootMorph), canonical forms + specified for a lexical entry consist of them.
+ + • + Approved by participants + +Note by CC: lexinfo:RootMorph is suggested for addition to Lexinfo (https:// +github.com/ontolex/lexinfo/issues/21), then, it entails ontolex:LexicalEntry +[that was also in one of our papers and the reason to abandon the morph +taxonomy more than one year ago] + +:k-t-b a lexinfo:RootMorph, ontolex:LexicalEntry ; + +ontolex:evokes :k-t-b_meaning; + +rdfs:label "k-t-b" . + +# CC: clarification question: is the suggestion to not provide a lexical form? + +# MI: for this particular lexical entry? + +# CC: yes + +:k-t-b_meaning a ontolex:LexicalConcept. + +# not: LexicalSense + +:kiteb a ontolex:Word ; + +lexinfo:partOfSpeech lexinfo:verb ; + +morph:morphologicalPattern ; + +ontolex:canonicalForm ; + +morph:baseForm . + + a ontolex:Form ; + +morph:consistsOf roots:k-t-b ; + +ontolex:writtenRep "kiteb"@mlt ; + +ontolex:phoneticRep "/kɪtɛp/" . + +NB: lexinfo:etymologicalRoot: lexical entry -> lexical entry; “Morpheme that +has a particular status with regards to the word's etymology.” + +2. Morph ordering + +One of the requirements. + +morph:consistsOf , + +(this is RDF, unordered) + +Diagram: diamond-symbol operator (that’s standing for rdfs:Seq) + +:x morph:consistsOf :m1, :m2. + +:x a rdfs:Seq; rdf:_1 :m1; rdf:_2 :m2. + +We could debate whether to use Seq (= previously preferred) or List (= also +discussed) + +(for circumflexes, we either need to decompose or just to provide +morph:consistsOf) + +3. AoB + +We move the call by 15 minutes! Next call: 03.05.2023, 13:15 CEST. + +4. Chat (restructured, and filtered but not changed) + +4.1 Arabic (RootMorph) + +Fahad Khan + +13:13 + +https://en.wiktionary.org/wiki/%D9%83_%D8%AA_%D8%A8#Arabic + +Fahad Khan + +13:30 + +I also don't think consonantal root should be a lexicalconcept, since it also +plays a role in derivational morphology + +Christian Chiarcos + +13:31 + +what is the practice in dictionaries? organization by root or individual words? +I guess in Malti, the latter. 
What about Arabic? + +(@Gilles: yes, lexical entries in a dictionary can be used for a family of +related words also in english) + +Gilles Sérasset + +13:32 + +@Christian, but ontolex:LexicalEntry has a more narrow sense. lexicog has the +mean to represent this. + +But sorry, I do have to leave now. Sorry sorry sorry, this discussion is very +interesting. + +Fahad Khan + +13:33 + +arabic dictionaries are organised on the basis of roots: https://lingualism.com +/modern-standard-arabic/using-an-arabic-dictionary-tips-for-learners/ + +Gilles Sérasset + +13:34 + +@Fahad: yes they are ! and chinese ones on the basis of ideographic keys... + +4.2 other vowel alterrnations (under the same entry or two allomorphs?) + +Christian Chiarcos + +13:26 + +as for vowelalternations without 1:1 correspondence to meaning/function, we +actually have something similar in IE, too, ablaut. it also doesn't have a +transparent function. English fly, flew, flown + += German fliegen - flog -geflogen + +but it has other functions, e.g., abstractiuve: Berg (mount) - Gebirge +(mountain) + +Fahad Khan + +13:27 + +but it isn't systematic enough that dictionaries that aren't etymological +dictionaries are organised in terms as roots + +in IE languages + +Fahad Khan + +13:28 + +*in terms of roots + +Christian Chiarcos + +13:28 + +@Fahad: ablaut: no, not in English, of course. + +Christian Chiarcos + +13:36 + +cf. gabaurths in http://www.wulfila.be/lib/streitberg/1910/html/B041.html, +contains a MWE als sub-entry + +Besim Kabashi + +13:35 + +@Christian: German "fliegen - flog -geflogen" are forms from different +"stem-allomorphs" … + +Christian Chiarcos + +13:36 + +@Besim: this is one way of describing them + +Christian Chiarcos + +13:49 + +allomorphs are variation (=> vartrans) + +Christian Chiarcos + +13:50 + +(differtent modelling strategies, because source documents are too +heterogeneous wrt. 
allomorphs, som we must supoport both morph-level and +morpheme-level organization; hence Morph instead of morpheme) + +NB: RootMorph entails that :ktb is a lexical entry + +Fahad Khan + +13:50 + +yes we discussed this earlier + +because Morph is a subclass of LexicalEntry + +so every morph is a lexical entry by default + +4.3 MI’s suggested RootMorph modelling + +Fahad Khan + +13:36 + +is rootmorph already in lexinfo? + +Matteo Pellegrini + +13:37 + +I don't think so + +Christian Chiarcos + +13:37 + +but it's suggested for addition + +You + +13:37 + +This is a lexical entry, so we can connect it to the lexical concept + +Christian Chiarcos + +13:38 + +modelling: +1 + +Fahad Khan + +13:39 + +maybe LexicalConcept can be related to the root + +Christian Chiarcos + +13:39 + +lexical concept: +1 + +4.4 lexinfo;etymologicalRoot (suggested by Fahad) + +Christian Chiarcos + +13:42 + +etymological root is not morphological, I think. can be multi-morpheme + +moment + +Fahad Khan + +13:43 + +i assume etymological root is a kind of root so it should be a morpheme + +etymon is more general + +Christian Chiarcos + +13:45 + +lexinfo:etymology (=> skos:definition) + +etymnologivcalroot: in lexinfo 3.0, it is indeed a morpheme! + +Christian Chiarcos + +13:46 + +with lexinfo:etymologicalRoot, we can connect root and derived words as lexical +entries + +Fahad Khan + +13:49 + +mmoon seems to be down + +https://github.com/MMoOn-Project/MMoOn + +4.4 lexinfo:RootMorph => ontolex:LexicalEntry + +Christian Chiarcos + +13:51 + +let's add that exoplicitly to the exam,ple + +Besim Kabashi + +13:51 + ++1 + +Fahad Khan + +13:52 + +we need to change the wording of Lexical Entry in the report + +Fahad Khan + +13:52 + +"A lexical entry represents a unit of analysis of the lexicon that consists of +a set of forms that are grammatically related and a set of base meanings that +are associated with all of these forms. 
Thus, a lexical entry is a word, +multiword expression or affix with a single part-of-speech, morphological +pattern, etymology and set of senses." + +4.5 order of morphemes + +Christian Chiarcos + +13:52 + +actually, we did + +Christian Chiarcos + +13:52 + +aggregation + +hence the diamond in the diagram + +what isa not decided is whether to use rdfs:seQ or rdf:List + +Christian Chiarcos + +13:54 + +ordering + +consists of and the second edge + +rdfs:Seq + +order can also be indiredctly expressed in the wordformation rules + +Christian Chiarcos + +13:59 + +in generation, in the replacement we can match beginning or end of an +expression + +(i would thus decouple tjhe generation discussion from morph ordering) + +4.6 more data on Arabic? + +Christian Chiarcos + +14:00 + +(I thiunk we can include examples if they help to address requirements we +identified before and if the data we have is not sufficient) + diff --git a/minutes_txt/20_09_2023.docx.txt b/minutes_txt/20_09_2023.docx.txt new file mode 100644 index 0000000..5f34f33 --- /dev/null +++ b/minutes_txt/20_09_2023.docx.txt @@ -0,0 +1,88 @@ +Link: https://meet.google.com/nsj-tbcy-yop [CHECK HERE FOR UPDATED LINK(S)] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.17) + +1. Requirements + +2. Datasets + +3. AoB + +0. Module draft (4.17) + +[image1] + +1. 
Replacement → RegexReplacement + +FK: morph:source and morph:target being string literals can be too strictly +defined given that the model will not be able to change after publication. What +if we relax this restriction, move specifics about regular expressions into +addenda. + +… some discussion … + +MI: Suggestion — having morph:Replacement underdefined, a new subclass +morph:RegexReplacement which has the properties morph:Replacement has now. + +2. Plans + +Next time (04.10): + + • + Everyone whose issues are on GH are either present or we know the + status + + • + Going through unmet requirements + + • + Going through problematic datasets + +Time after next (18.10): + + • + MP presents the new standard for morphology, its compatibility with + Morph. + diff --git a/minutes_txt/20_10_2021.docx.txt b/minutes_txt/20_10_2021.docx.txt index 689b3f4..06b2706 100644 --- a/minutes_txt/20_10_2021.docx.txt +++ b/minutes_txt/20_10_2021.docx.txt @@ -13,7 +13,7 @@ Christian Chiarcos CC 1. Module draft 4.5 -[image2] +[image3] Included adaptations: @@ -81,9 +81,9 @@ BK proposal: [image4] -[image1] +[image2] -[image3] +[image1] :MorphPattern → morph:InflectionType, morph:WordformationRelation diff --git a/minutes_txt/22_03_2023.docx.txt b/minutes_txt/22_03_2023.docx.txt index 608746b..3d9200f 100644 --- a/minutes_txt/22_03_2023.docx.txt +++ b/minutes_txt/22_03_2023.docx.txt @@ -99,7 +99,8 @@ MWE Chapter UniDive • - (no publication, indirectly related to MWE Chapter) + (no publication, indirectly related to MWE Chapter: the MWE editors are + working group leads for corpus-lexicon interface) • Christian has been asked to co-lead a task on modelling MWE @@ -257,7 +258,9 @@ in joint papers TODO, partially on Besim) □ - E.g., Old Irish (TF) + E.g., Old Irish (TF). See here. The contents in the document and + accompanying spreadsheet are possibly equally (or more) pertinent + to “Current open topics” above. 
□ Greek (Penny) @@ -276,6 +279,7 @@ in joint papers • Serbian (Ranka): paper for LDK - football use case, work in progress + ttl (work in progress) • German (Petra Steiner)? diff --git a/minutes_txt/24_09_2019.docx.txt b/minutes_txt/24_09_2019.docx.txt index 7de7d97..3562e44 100644 --- a/minutes_txt/24_09_2019.docx.txt +++ b/minutes_txt/24_09_2019.docx.txt @@ -66,7 +66,7 @@ morph:paradigm <#finnish_noun_type_9> ; morph:example "kissojen"@fi . -[image1] +[image3] • SPARQL for forms generation: @@ -151,7 +151,7 @@ BIND(REPLACE(?w_midJ, ?s_end, ?t_end) as ?w_end) order by ?start ?end -[image3] +[image2] <#finnish_noun_type_kala_nom> a morph:Rule ; @@ -192,7 +192,7 @@ Morphology Module Draft 3.0 InflectionType denotes a step that adds a certain grammatical meaning for the group of words which add the same affixes. It is a declension type without -allomorphy.[image2] +allomorphy.[image1] Bettina: create new diagram which illustrates the generation process as a whole (input - transformation - output) diff --git a/minutes_txt/26_07_2023.docx.txt b/minutes_txt/26_07_2023.docx.txt new file mode 100644 index 0000000..118049a --- /dev/null +++ b/minutes_txt/26_07_2023.docx.txt @@ -0,0 +1,252 @@ +Link: https://meet.google.com/nsj-tbcy-yop [CHECK HERE FOR UPDATED LINK(S)] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Nexus: https://nexuslinguarum.eu/the-action/join-us + +Participants [please add yourself]: + +Christian Chiarcos (CC) (excused for being 10 min late) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Besim Kabashi (BK) + +Fahad Khan (FK) + +Khadija Ait ElFqih (KAE) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner (PS) + +Theodorus Fransen (TF) + +Ranka Stanković (RS) + +Gilles Sérasset (GS) + +Mike Rosner (MR) + +Table of Contents + +0. Module draft (4.17) + +1. Paradigm class + +2. Finite State terminology + +3. 
Open questions from the LDL paper + +4. W3C day presentation + +Next time + +0. Module draft (4.17) + +[image1] + +0. MWE paper + +TODO@MI, CC: Finish the changes, submit the chapter until the 31.07 + +1. Paradigm class + +MP: Naming “paradigm” may evoke incorrect associations, definition of paradigm +in linguistics as a set of word forms + + Note: we also need to rephrase the “paradigm” property, then + + • + morphological pattern + + • + Inflection type? + + • + Inflection class? + + □ + MP: +1 + + □ + GS: sounds like owl class + + • + Inflection? + + => hard to get to a consensus, vote? + +MP: paradigm → rules instead of the other way around + +option 1: + +morph:Paradigm > morph:InflectionClass, morph:paradigm > morph:inflectionClass + +option 2: + +morph:Paradigm > morph:MorphologicalPattern, morph:paradigm > +morph:isInflectionRuleOf (or morph:InflectionRule, changing the direction of +the arrow (i.e., going from morph:MorphologicalPattern to morph:InflectionRule +rather than the opposite) + +MI: This was exactly the idea behind the current class. Maybe just rephrase the +definition? (also, see MP remarks) + +Cf. requirement N4: Lexical data contains pointers to and/or tables of +inflectional paradigms or classes including the respective stem affixes or the +full word-forms. Both, the pointers to paradigms and the interconnection of +word-forms that belong to a paradigm, should be representable. + +MI: No problem with option 2, we need to find a way to see if everyone else is +on board. + +MP: Slight preference with option 1, but option 2 is also okay. + +MI: Maybe rethink InflectionType name (InflectionSlot may be more intuitive) +and then go for option 1 + +TODO@MI: discuss this CC, how to best come to a decision within the community + +2. Finite State terminology + + • + Some of the aspects of the module allow it to model FST, albeit it + might feel like a misuse (e.g. 
InflectionType as a state) + + • + Finite state terminology + + □ + Options: either within morph or within the scope of a new module? + + ☆ + If the latter: + + ○ + Together with diachrony? + + ○ + Together with sound classes? + + • + MI: Probably best keep it the way it is and FSTs can be implemented in + a different module or another vocabulary outside of OntoLex, since it’s + out of scope kinda + + □ + KG: +1 + + • + CC: Treat it like the sound classes/character sounds, i.e., a (small) + vocabulary out of morph, out of scope but motivated by morph + +3. Open questions from the LDL paper + + • + - Variation in inflexion/Flexemes :: due to dialectal, diachronic or + simply orthographic variation. Examples + + □ + In Latin, lavo `wash' can be inflected according to either the 1st + (lavare) + + or the 3rd (lavere) conjugation. + + • + Suppletion, e.g., in Old English, the verb wesan `to be' whose infinitive + + represents one underlying root, whereas its indicative present singular + forms + + are based on two other roots (eom 1.sg. `(I) am'; bist 2.sg. `(you) are'). + + • + Modern Greek, `τραίνο' and `τρένο' have the same meaning and syntactic + behaviour, so they can be modeled as the same ontolex:LexicalEntry, + where the inflected forms of each are grouped together instead of all + of them being represented as simply ontolex:otherForm. The current + proposal is to introduce a new relation for orthographic variation + (lexis:OrthVariant) as a subclass of vartrans:LexicalRelation + and relate the orthographic variants through + this relation. + + • + MP: we can treat it as resolved. MP used Flexemes in his modeling but + that was dataset-specific, and not necessary for the Morph module + + See the slides here: https://zenodo.org/record/8036909 + + - Markers of morphological variation :: labels of style, dating, dialect, + etc.
+ + • + Resolved by transferring the issue to LexInfo Vocabulary + + • + Note :: it would be desirable if the OntoLex-Morph vocabulary would + eventually be accompanied by best practice recommendations for the + assignment of markers and provenance. + + - Challenges in word formation :: + + • + not fully predictable phonological processes like assimilation or + apophony, which prevent the simple juxtaposition of formative elements + from generating the actual surface form of derivatives; + + • + formal and semantic constraints that make a word formation rule not + applicable to all the words in the lexicon with a specific part of + speech. + + (morph:GrammaticalMeaning??) + + • + CC: morph:baseConstraint should have resolved this + +4. W3C day presentation + + • + https://www.w3.org/community/ontolex/wiki/W3c_community_day_@_LDK2023 + +CC: We can’t come up with a list of requirements right now because we still +have open questions + + • + TODO@MI: go through all the requirements and give an overview of what + is left + + • + TODO@CC, MI: discuss what is left + + • + Still TODO + +Next time + + • + Summer break + + • + Close open TODOs, especially the ones from the last section: CC&MI to + discuss what’s left and compile a list + + • + Restarting before LDK? 6.09? 20.09?
+ + □ + FK: We have Nexus plenary, so not 6.09 + diff --git a/minutes_txt/27_09_2022.docx.txt b/minutes_txt/27_09_2022.docx.txt new file mode 100644 index 0000000..af20178 --- /dev/null +++ b/minutes_txt/27_09_2022.docx.txt @@ -0,0 +1,990 @@ +Link: https://meet.google.com/nsj-tbcy-yop [check here for link updates if it +doesn’t work] + +Latest Definitions: https://github.com/ontolex/morph/blob/master/draft.md + +Latest Paper (LDL-2022): https://www.overleaf.com/4868363189kczjzdndgxwc +(folder submission/) + +Participants [please add yourself]: + +Christian Chiarcos (CC) (30 min late, excused) + +Max Ionov (MI) + +Katerina Gkirtzou (KG) + +Fahad Khan (FK) + +Matteo Pellegrini (MP) + +Ciprian-Octavian Truică (CT) + +Penny Labropoulou (PL) + +Elena Simona Apostol (ESA) + +Sina Ahmadi (SA) + +Elena Benzoni (EB) + +Petra Steiner + +Agenda (please add, but do not edit table of contents directly, but add +sections below and then update here): + +copied from last time + +0. Module draft 3 + +1. Publications 3 + +2. definition consolidation 4 + + 2.2 overview 5 + + 2.3 replacement (wrapup) 5 + + 2.4 InflectionType 6 + + current model 6 + + alternative 1: detach InflectionType 6 + + alternative 2: replace InflectionType by GrammaticalMeaning 6 + + alternative 3: merge InflectionType with InflectionRule 7 + + 2.5 related standards 8 + +3 open problems/other data 8 + + 3.0 extend documentation / draft 8 + + 3.1 Comparison with MMoOn (Mod. Greek, Hebrew, other; unassigned) 9 + + 3.2 baseConstraint + grammatical Meaning + baseType 9 + + 3.3 Samples to be modelled (all) 10 + + 3.4 inflection tables (Fahad, others?) 10 + + 3.5 semitic consonantal roots (unassigned) 11 + +3. AOB 11 + +0. Module draft + + draft 4.15 (no updates) + +[image1] + +Model draft 4.16 updates (to be discussed) & open issues: + + • + inflection type to be discussed + + • + overall, it seems people have least issues with alternative 1 + + □ + CC: alternative 1 or 2 for agg. 
languages (0 and 3 => combinatorial + explosion) + + □ + MP: alternative 1 (or 3) (0 fails, 2 has terminological issues) + + □ + KG: need link with GrammaticalMeaning + + □ + on-going discussion: rename GrammaticalMeaning (esp., if used for + slots or finite states)? + + • + after telco: + + □ + CC: alternative 1 with KG-requested additions suggested as 4.16 + (not confirmed) + + ☆ + https://github.com/ontolex/morph/tree/master/doc/diagrams + + ☆ + InflectionType: alternative 1 + + ☆ + grammaticalMeaning: linked with inflection rule and inflection + type (with question marks, link with inflection type requested + by Penny&Katerina) + + ☆ + Rule resurrected (only to simplify diagram: holds properties + replacement, involves and examples, inherited by inflection + rule and word formation rule) + +1. Publications + + • + LLODREAM https://easychair.org/cfp/llodream2022 + + □ + Conference site : http://llodapproaches2022.mruni.eu/ + + □ + accepted (abstract), final paper for postproceedings only + + ☆ + https://docs.google.com/document/d/ + 1a7OWCgcD6qDYPta0shiIh6CzWTw7E1ptnwQKXVA4xDo/edit + + ☆ + main feedback: abstract doesn’t show use cases + + ☆ + presentation: duration tbd. + + ☆ + conference: Sep. 20,21 (tbc) => discuss at next meeting + + ☆ + Full paper submission by December 1st, 2022 + + ☆ + Camera-ready version to upload until Friday, 29th, 2022. + TODO@CC + + • + MWE volume + + □ + Accepted + + □ + multi-word expressions, see FrAC minutes (https://docs.google.com/ + document/d/1N2w_r6WLhFGESSMSUkG5FSROorXscDMQuB77qg9uDIA/edit# + heading=h.i84zrrbp06oy) + + □ + deadline January (full paper) + + □ + expression of interest and short abstract handed in: + + ☆ + describe and compare modelling of MWEs in OntoLex core, decomp, + FrAC, *and morph* (~compounding) + + ☆ + primarily designed as a FrAC paper, but input from morph + contributors would be welcome (LiLa?) + + ○ + Matteo&Elena: in principle interested, will look into that + + ☆ + cf. 
open issues from earlier minutes + + ○ + describe the relation between decomp and CompoundRelation + + ■ + suggestion: do this as part of writing a designated + paper [venue?] + + ■ + TODO@unassigned: document relation between both modules + in appendix + + ★ + there is an alternative reification with + decomp:Component, but this is less well-suited for + compound analysis, because it doesn’t relate to + lexicosemantic relations. + + ★ + the current modelling of decomp is oriented towards + an analysis of synsem (semantic) roles within a + compound. in morphology, we normally don’t have + that, what we have, instead, are relations between + lexemes and morphemes. + + • + future publications + + □ + any “natural choices” for a venue? + + ☆ + Petra: Derimo-2023 Workshop, Prague + + ○ + deadlines to be confirmed + + □ + ideas + + ☆ + paper on word formation? + + ○ + idea for novel paper: word formation in OntoLex-Lemon + + ■ + not original content, but more like a survey and + documentation of best practices? + + ■ + can be helpful to consolidate/revise word formation + part of the module + + ■ + possible input from LiLa + + ■ + TODO@all: think about possible venues + + ☆ + general OntoLex overview + + ○ + ?ESWC: Deadline? + + ■ + ESWC: 2 Dec 2021 for ESWC2022 -> there are no dates for + the ESWC2023 CFP as the ESWC2022 is between 29th May + and 2nd June + + ○ + update of OntoLex, incl. FrAC, Morph, MModality + + □ + later journal paper + + ☆ + After the final publication + + ☆ + Or: an overview of the current state. Frac + Morph or Frac + separately, Morph separately? + + ☆ + (at some point) a book? + +2. 
definition consolidation + + • + standing TODO@all: provide/refine/review definitions + + □ + under https://github.com/ontolex/morph/blob/master/draft.md + + □ + technical definitions, but linguistic explanation (“definition”) in + text + + □ + @all: you can contribute suggestions by creating issues (https:// + github.com/ontolex/morph/issues), via pull requests, or by direct + editing (share your GitHub username) + + □ + procedure: for definition refinement: + + ☆ + open an issue + + ☆ + pull request + close the issue + + • + status: + + □ + pull requests by Matteo and Penny merged + + □ + draft contains now a number of notes + + □ + more consolidation needed + + □ + todo: revise/confirm morpheme typology in lexinfo (suggested by + Sina) + +2.2 overview + + • + continue model overview for Elena B., Sina and Petra: Inflection part + + □ + Max: morphs are not morphemes, but can also be extrapolated from + fullform dictionary (~allomorphs) + + □ + Petra: this is like Harris-Firth approach? Distributionalism? + + □ + CC: definitions are technical (= self-contained), not linguistic. + we try to avoid commitments to any specific theory, concepts should + still be intuitively comprehensible by linguists + + □ + Max Ionov: In this specific use case, almost. But we try to stay + approach-independent. 
We just want to provide means to model _any_ + data + + □ + CC: two main scenarios + + ☆ + model an existing morpheme inventory or rule set (e.g., for + generation, then, morphs are not automatically created) + + ☆ + induce morphs/rules and store them (and then use them as in the + first scenario for morphological generation) + + □ + inflection type may be need to be revised + +2.3 replacement (wrapup) + +conventions for replacement correspond to those of pattern matching/replacement +in SPARQL, as formally defined + + • + in SPARQL 1.1 (https://www.w3.org/TR/sparql11-query/), which points to + + □ + the XPath function replace (https://www.w3.org/TR/xpath-functions/# + func-replace), and + + □ + the XPath regex syntax (https://www.w3.org/TR/xpath-functions/# + regex-syntax) + +A more readable, informal description under + + • + https://en.wikibooks.org/wiki/SPARQL/Expressions_and_Functions#REGEX + +Note that in the formal syntax definition, “\” is used to mark special +characters. However, as most SPARQL engines are Java-based and Java uses “\” as +an internal escape symbol, you actually have to write “\\” instead of “\” as +defined in the syntax. A literal single “\”-character in a regex must thus be +double escaped (i.e., “\\\\”). + +Note: This syntax originates from regular expressions in Perl (https:// +perldoc.perl.org/perlre). + +Except for minor differences in escaping and special characters, this is +equivalent to + + • + the syntax of regular expressions in Java (https://www.w3schools.com/ + java/java_regex.asp) + + • + the syntax of regular expressions in Sed (https://www.gnu.org/software/ + sed/manual/html_node/Regular-Expressions.html) (and other Unix + command-line tools, e.g., grep) + +2.4 InflectionType + + • + current definitions: + + □ + Class morph:InflectionType represents a single slot for a single + grammatical category for all its possible values (e.g. 
all the + cases) + + ☆ + Book analogy: a column from a paradigm table without allomorphy + /alternative variants for just a single morpheme + + □ + property morph:inflectionType assigns an inflectional pattern of a + form as belonging to a morphological pattern of a lexical entry + + • + CC (offline): this definition does not work for the current diagram, if + one inflection type represents the position for *all* cases, we cannot + associate the form for, say, dative with the rule for dative via + inflection type (thanks to Matteo for pointing that out). + + □ + https://github.com/ontolex/morph/issues/11 + + • + Comparing alternatives: + +current model + + 1. + Form -inflectionType-> InflectionType + + 2. + Paradigm <-paradigm- InflectionType + + 3. + InflectionType -inflectionRule-> InflectionRule + + 4. + InflectionType -next-> InflectionType + + • + alternative 0: keep current model, one inflection type per paradigm and + rule + + pro: backward-compatible + + con: unneccessarily verbose: what is the difference to inflection rule + then? + + con: still contradicts current definition + +alternative 1: detach InflectionType + + 1. + Form -inflectionRule-> InflectionRule + + 2. + Paradigm <-paradigm- InflectionRule + + 3. + InflectionRule -inflectionType-> InflectionType + + 4. + InflectionType -next-> InflectionType + + pro: we basically keep all the information we have, incl. finite state + modelling and agglutination + + con: inflection type won’t be used for fusional languages and probably fall + out of use + + con: terminologically, the finite state use case is still a bit of a + stretch, a better name? + + note: paradigms should be allomorphy-free, then (this is at odds with + traditional usage of “paradigm”. in inflection tables, it normally includes + allomorphic variants. + +alternative 2: replace InflectionType by GrammaticalMeaning + + a. + Form -inflectionRule-> InflectionRule + + b. + Paradigm <-paradigm- InflectionRule + + c. 
+ InflectionRule -grammaticalMeaning-> GrammaticalMeaning + + d. + GrammaticalMeaning -next-> GrammaticalMeaning + + pro: we basically keep all the information we have, incl. finite state + modelling and agglutination + + pro: we eliminate one class and we address a feature request by Penny + + pro: slot information can be plausibly a part of grammatical meaning (or, + better, structure) + + con: no explicit data structures for slots, researchers would need to + “discover” that from comments => rename next to nextSlot? + + con: for FST, this is very opaque, a better name? => we could introduce a + designated subclass “FiniteState” of GrammaticalMeaning !? + +alternative 3: merge InflectionType with InflectionRule + + a. + Form -inflectionRule-> InflectionRule + + b. + Paradigm <-paradigm- InflectionRule + + c. + InflectionRule -grammaticalMeaning-> GrammaticalMeaning + + d. + InflectionRule -next-> InflectionRule + + pro: we keep all the information we have, incl. finite state modelling and + agglutination + + pro: we eliminate one class and address a feature request + + pro: “rule” is more relatable to what a finite state does than “inflection + type” (which sounds static) + + con: no explicit data structures for slots, researchers would need to + “discover” that from comments + + con: in agglutinating languages, the sequence is not over replacement + rules, but classes of morphemes, so we lack a formal data structure for + slots + + con: for FST, this conflates states and replacements, normally one state + can have different replacements (“rules”) + +Penny+Katerina (summary of last call, tests for fusional language): + + • + all alternatives express the neccessary information (if a direct link + with grammatical meaning is added) + + • + prefer alternative 2 + + □ + alternatives 1 and 3 are equivalent if a direct link with + grammatical meaning is added + + □ + alternatives 1-3 preferred over current model in terms of verbosity + + • + CC: that corresponds to my 
personal preference, too + + • + CC: minor refinements (to be discussed after applicability to + agglutinative language has been shown) + + • + rename GrammaticalMeaning to “Features” (or “FeatureBundle”; a “slot” + is described as a bundle of features, so that makes sense, and finite + states are informally associated with some kind of function, but + typically not a specific grammatical meaning, esp. for + morphophonological processes) + + • + introduce a subclass FiniteState of FeatureBundle (we would informally + capture the finite state itself as a feature, and the bundle would + consist of exactly one such feature) + +confirm on agglutinative languages + + • + sample data (Turkish) from Christian on GitHub under data/agglutinating + /turkish.md + + □ + to be discussed in detail next time + + □ + preference (in terms of verbosity) for alternatives 1 or 2 + + □ + alternatives 0 and 3 lead to combinatoric explosion + + • + Matteo: + + □ + strong preference against alternative 0 + + □ + others are unproblematic + + □ + not happy with slots as “grammatical meaning”, could mismatch + + ☆ + Christian: we can rename, see remarks from last time + + □ + preference to alternative 1 (or 3) + + • + after discussion: + + □ + CC: possible workaround would be to create a class FeatureBundle + with sub-classes GrammaticalMeaning, Slot and FiniteState + + ☆ + “inflection type” caused a lot of misunderstandings, so maybe + use “Slot” instead, and really only for slots. 
+ + ☆ + “next” formally defined for Feature bundle, but is relevant for + Slot and FiniteState only + + ☆ + property “grammaticalMeaning” needs to be renamed then, too, + maybe “morph:feats”, subproperties grammaticalMeaning (range is + GrammaticalMeaning) and inflectionType (range is Slot) + +2.5 related standards + + • + Matteo: paralex standard for morphological lexicons (currently under + development by Sacha Beniamine and Erich Round) + + □ + to be discussed ASAP + +3 open problems/other data [postponed] + +3.0 extend documentation / draft + + • + OPEN: define cardinality restrictions: https://github.com/ontolex/morph + /issues/12 + + □ + suggestion: when finalizing the vocabulary + + • + CHECK STATUS: define morph subclasses in LexInfo rather than + OntoLex-Morph, also add equivalence axioms (lexinfo:Prefix subclassOf [ + lexinfo:termElement lexinfo:prefix ]) + + □ + https://github.com/ontolex/lexinfo/pull/29 + + □ + not merged yet + + • + describe grouping of lexical (sub-) entries + + □ + LiLa: “flexeme”, sub-entries with different paradigms, but + identical in meaning, etc. + + ☆ + suggestion: model the grouping by lexicog, have both the + overarching lexical entry and the flexemes as separate lexical + entries, no vocabulary extension needed, but a usage note in + the report + + ☆ + tbc: by LiLa + + □ + Penny: sub-entries of the same lexical entry to mark contracted and + non-contracted versions of the same paradigm + + ☆ + can be partially modelled by means of “markers”, i.e., lexinfo + usage properties, instead + + ☆ + todo@Penny: tbc. whether lexinfo needs to be extended for that + + ○ + domain: LexicalSense + + ○ + TODO: ask John + + ○ + if these properties are added, no sub-groups necessary + + • + @all: think about metadata properties for LexInfo (hypothetical/ + unattested form, etc.) => tentative consensus, but details to be + discussed + + □ + Penny: could work, but domain is ontolex:LexicalSense. Can this be + changed? 
+ + □ + TODO: ask John + + • + Sample data for cliticization + + □ + OLD_TODO@Sina: provide sample data, maybe we can come up with a + recommendation + + □ + cf. Italian: https://en.wiktionary.org/wiki/andiamoci, https:// + en.wiktionary.org/wiki/andarsene + + □ + farcela: https://dizionario.internazionale.it/parola/farcela + + • + Sample data for reduplication? + + □ + mentioned by Sina last time + + □ + tentative consensus: no special vocabulary needed, but should be + confirmed on sample data + +3.1 Comparison with MMoOn (Mod. Greek, Hebrew, other; unassigned) + + • + Greek : https://link.springer.com/chapter/10.1007/978-3-030-98876-0_34 + + • + Bettina’s data (link?) + + • + not directly comparable, current Gk. data is inflectional + +3.2 baseConstraint + grammatical Meaning + baseType + + • + Sina (sample date from Central Kurdish) + + :ish_morph a clitic / endoclitic; # lexinfo:Clitic ??? + + ontolex:sense [ skos:definition “too, also” ]; + + ontolex:canonicalForm [ ontolex:writtenRep “îş” ]; + + morph:baseConstraint [ :pos "v", "n", "a" ] . + + :sh_morph a ontolex:Affix, lexinfo:Suffix . # allomorph of “îş” + + • + Christian: sample data from Inuktitut (GDrive) (https://github.com/ + ontolex/morph/raw/master/data/gdrive/Polysynthetic_Inuktitut.docx) + + atausiulugu + + {atausi:atausiq/1n}{u:u/1nv}{lugu:lugu/tv-part-1s-3s-fut} + + {one}{existence; is}{part. future: while I ...him/her/it} + + :atausiq_le a ontolex:LexicalEntry; + + ontolex:sense [ skos:definition “one” ]; + + ontolex:canonicalForm [ a ontolex:Form; ontolex:writtenRep “atausiq” ]; + + ontolex:baseForm [ a ontolex:Form; ontolex:writtenRep “atausi”; + morph:grammaticalMeaning “n” ]. + + :u_morph a ontolex:Affix, lexinfo:Suffix; + + ontolex:sense [ skos:definition “existence; is” ]; + + ontolex:canonicalForm [ ontolex:writtenRep “u” ]; + + morph:baseConstraint [ :pos “n” ]; # from “1nv” + + morph:grammaticalMeaning [ :pos “v” ]. 
# from “1nv”: convert noun into verb + + # note: this generates a stem, not a word; it only becomes a proper word with inflection: + + :lugu_morph a ontolex:Affix, lexinfo:Suffix; + + ontolex:sense [ skos:definition "while …" ]; + + ontolex:canonicalForm [ ontolex:writtenRep "lugu" ]; + + morph:baseConstraint [ :pos "v" ]; # from “tv-” + + morph:grammaticalMeaning [ :pos "v-part-1s-3s-fut" ]. + + :atausiu_le a lexinfo:Stem, ontolex:LexicalEntry; + + ontolex:sense [ skos:definition "so. will be sth." ]; + + ontolex:baseForm [ ontolex:writtenRep "atausiu" ]; + + # this is not a free-standing lexeme, and there is no canonical form that is a word + + ontolex:lexicalForm [ ontolex:writtenRep "atausiulugu" ; + + rdf:_1 :atausiq_le; + + rdf:_2 :u_morph; + + rdf:_3 :lugu_morph ]. + +This is purely descriptive: no explicit rules are written, but baseConstraint and +grammaticalMeaning make it possible to check consistency conditions. + +Note that Inuktitut inflection involves some level of assimilation; this is +modelled here by means of baseForm (dropping of -q), but the contexts are not +marked explicitly. + +We could easily model that if morph:baseType is in morph:GrammaticalMeaning +rather than in ontolex:Form. That would be necessary if a particular base form +is only generated by a morpheme rather than given for a root/stem. + +3.3 Samples to be modelled (all) + + • + most sample data originally on GDrive (where is the link?) + + □ + now (also) on GitHub: https://github.com/ontolex/morph/tree/master/data/gdrive + + ☆ + CC: can we fully move there? + + • + samples @ GitHub + + □ + Latin (word formation variants?) + + • + postponed until Fahad has some progress on modelling + +3.5 Semitic consonantal roots (unassigned) + + • + from the same consonant cluster, we can generate different POSes + + • + cf. 
https://en.wikipedia.org/wiki/K-T-B, https://en.wiktionary.org/wiki/%D9%83_%D8%AA_%D8%A8 + + □ + this cannot (always) be modelled as inflection, as OntoLex requires + (at most) one POS per lexical entry + + □ + note that this page describes vowelized words as “derivatives”: can + we model this as derivation? (but the process occurs in + inflection, too) + + ☆ + given a real dictionary, can we easily distinguish derivation + and inflection? + + • + cf. Arabic example from https://en.wikipedia.org/wiki/Dictionary_of_Modern_Written_Arabic (from Max) + + □ + dictionary organized by roots, but root is not made explicit + + □ + todo@unassigned: put an example into GitHub + + • + discussion postponed until we have a Semitic speaker + + □ + Ilan? + + ☆ + but first, check Bettina’s conversion of KDictionaries’ Hebrew + dictionary + +3. Agenda and TODOs for September + + 1. + Get sample Arabic data and try to model it; also look into Bettina’s Hebrew + data (Sina + Max) + + 2. + Clitics in Kurdish data + + 3. + Clitics in Italian: https://dizionario.internazionale.it/parola/farcela + + reflexives in Spanish, prefixes in German: use as a single form, + writtenRep with a space + + 4. + Model agglutinative language data with different alternatives (Max) + + 5. + Ask Khadija if she will join us in September (Fahad) + +3. AOB + +next call in September + +7.09 +
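Illustration for 3.2 (consistency conditions): a minimal sketch of how baseConstraint and grammaticalMeaning could be used to check a morph sequence such as rdf:_1 .. rdf:_3 in the Inuktitut sample. The Morph class, pos_of and check_sequence are hypothetical helpers, not part of OntoLex-Morph, and the sketch assumes that the POS is the first hyphen-separated part of a tag like "v-part-1s-3s-fut".

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


def pos_of(tag: str) -> str:
    """Assumption: the POS is the first hyphen-separated part of the tag."""
    return tag.split("-")[0]


@dataclass(frozen=True)
class Morph:
    """Hypothetical stand-in for one element of the rdf:_n sequence."""
    label: str
    base_pos: Optional[FrozenSet[str]]  # morph:baseConstraint; None for a root/stem
    meaning: str                        # morph:grammaticalMeaning tag


def check_sequence(morphs: List[Morph]) -> List[str]:
    """Return a list of violations; an empty list means the sequence is consistent."""
    errors: List[str] = []
    if not morphs:
        return errors
    current = pos_of(morphs[0].meaning)
    for m in morphs[1:]:
        # Each morph must accept the POS produced by everything to its left.
        if m.base_pos is not None and current not in m.base_pos:
            errors.append(
                f"{m.label}: expects base POS {sorted(m.base_pos)}, got {current!r}"
            )
        current = pos_of(m.meaning)
    return errors


# Values taken from the sample above: atausi (n), u (n -> v), lugu (v -> v-part-...)
atausi = Morph("atausi", None, "n")
u = Morph("u", frozenset({"n"}), "v")
lugu = Morph("lugu", frozenset({"v"}), "v-part-1s-3s-fut")

print(check_sequence([atausi, u, lugu]))  # consistent sequence
print(check_sequence([atausi, lugu]))     # lugu attached directly to a noun
```

The first call reports no violations; the second reports one, because lugu requires a verbal base. Assimilation contexts (the dropping of -q) are not covered by this sketch.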