Grammar notes update for Swift implementation #176

Merged on Nov 3, 2017 (2 commits)
21 changes: 17 additions & 4 deletions Grammar.md
@@ -15,7 +15,7 @@ The quite universal and simplier solution is the changing street names with the
The required grammatical case should be specified right in instruction's substitution variables:

- `{way_name}` and `{rotary_name}` variables in translated instructions should be appended with required grammar case name after colon: `{way_name:accusative}` for example
- [languages/grammar](languages/grammar/) folder should contain language-specific JSON file with regular expressions for specified grammar case:
- [languages/grammar](languages/grammar/) folder should contain language-specific JSON file with regular expressions for specified grammatical case:
```json
{
"v5": {
@@ -28,9 +28,21 @@ The required grammatical case should be specified right in instruction's substit
- Instruction text formatter ([index.js](index.js) in this module) should:
- check `{way_name}` and `{rotary_name}` variables for optional grammar case after colon: `{way_name:accusative}`
- find appropriate regular expressions block for target language and specified grammar case
- call standard [string replace with regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace) for each expression in block passing result from previous call to the next; the first call should enclose original street name with whitespaces to make parsing words in names a bit simpler.
- call standard [string replace with regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace) for each expression in block passing result from previous call to the next; the first call should enclose original street name with whitespaces to make parsing several words inside name a bit simpler.

Review comment (Member):

Is this still a requirement, now that grammarize is called before capitalization and insertion (#170)?

- Strings replacement with regular expression is available in almost all other programming language and so this should not be the problem for other code used OSRM Text Instructions' data only.
- If there is no regular expression matched source name (that's for names from foreign country for example), original name is returned without changes. This is also expected behavior of standard [string replace with regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace). And the same behavior is expected in case of missing grammar JSON file or grammar case inside it.
- Grammar JSON could have [regular expression flags in JS notation](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp):
```json
{
"meta": {
"regExpFlags": "ig"
},
```
- Please note that not all JS regular expression flags are supported in other languages.
For example, [OSRM Text Instructions for Swift](https://github.com/Project-OSRM/osrm-text-instructions.swift/) doesn't support non-global matching and so always assumes the `g` flag is turned on.
So if a regular expression is meant to stop after its first match, please include `^` and/or `$` in the pattern for exact matching, or return a "finished" string in the replace expression without enclosing whitespaces.
- If there is no regular expression matched source name (that's for names from foreign country for example), original name is returned without changes.
This is also expected behavior of standard [string replace with regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace).
And the same behavior is expected in case of missing grammar JSON file or grammar case inside it.
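
The formatter steps and fallback behavior described above can be sketched roughly as follows. This is only an illustration, not the actual [index.js](index.js) code: the `grammarize` function, its signature, the `v5` → case nesting of the sample data, and the final `trim()` are assumptions, and the two genitive rules are copied from `ru.json` purely as sample input.

```javascript
// Hypothetical grammar data shaped like languages/grammar/ru.json.
// Only two genitive rules are included here for illustration.
const grammar = {
    meta: { regExpFlags: 'ig' },
    v5: {
        genitive: [
            ['^ (\\S+)ая [Аа]ллея ', ' $1ой аллеи '],
            [' ([Тт])ретого ', ' $1ретьего ']
        ]
    }
};

// Hypothetical helper: applies every rule of the requested grammar case
// in order, feeding the result of each replace into the next one.
function grammarize(name, grammarCase, data) {
    const rules = data && data.v5 && data.v5[grammarCase];
    // Missing grammar file or missing case: return the name unchanged.
    if (!rules) return name;

    const flags = (data.meta && data.meta.regExpFlags) || '';
    // Enclose the name in spaces so patterns can rely on word boundaries.
    let result = ' ' + name + ' ';
    for (const [pattern, replacement] of rules) {
        result = result.replace(new RegExp(pattern, flags), replacement);
    }
    // Assumption: the enclosing spaces are stripped before returning.
    return result.trim();
}

const declined = grammarize('Тенистая аллея', 'genitive', grammar);
const foreign = grammarize('Main Street', 'genitive', grammar);
const noCase = grammarize('Тенистая аллея', 'dative', grammar);
const noFile = grammarize('Тенистая аллея', 'genitive', undefined);
```

Here `declined` becomes `Тенистой аллеи`, while `foreign`, `noCase` and `noFile` keep their original names because no rule matches or no rule set exists.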

### Example

@@ -50,4 +62,5 @@ Russian _"Большая Монетная улица"_ street from St Petersburg

- Russian regular expressions are based on [Garmin Russian TTS voices update](https://github.com/yuryleb/garmin-russian-tts-voices) project; see [file with regular expressions to apply to source text before pronouncing by TTS](https://github.com/yuryleb/garmin-russian-tts-voices/blob/master/src/Pycckuu__Milena%202.10/RULESET.TXT).
- There is another grammar-supporting module - [jquery.i18n](https://github.com/wikimedia/jquery.i18n) - but unfortunately it has very poor implementation in part of grammatical case applying and is supposed to work with single words only.
- Actually it would be great to get street names also in target language not from default OSM `name` only - there are several multi-lingual countries supporting several `name:<lang>` names for streets. But this is a subject to address to [OSRM engine](https://github.com/Project-OSRM/osrm-backend) first.
- Actually it would be great to get street names also in target language not from default OSM `name` only - there are several multi-lingual countries supporting several `name:<lang>` names for streets.
But this is a subject to address to [OSRM engine](https://github.com/Project-OSRM/osrm-backend).
37 changes: 15 additions & 22 deletions languages/grammar/ru.json
@@ -370,12 +370,9 @@
["^ (\\d+)-е (\\S+[ео])е [Шш]оссе ", " $1му $2му шоссе "],
["^ (\\d+)-е (\\S+ье) [Шш]оссе ", " $1му $2му шоссе "],

[" Третому ", " Третьему "],
[" третому ", " третьему "],
["жому ", "жьему "],
["жой ", "жей "],
["чому ", "чьему "],
["чой ", "чей "]
[" ([Тт])ретому ", " $1ретьему "],
["([жч])ому ", "$1ьему "],
["([жч])ой ", "$1ей "]
],
"genitive": [
["^ (\\S+)ая [Аа]ллея ", " $1ой аллеи "],
@@ -652,21 +649,19 @@
["^ (\\S+ье) ([Пп]олу)?[Кк]ольцо ", " $1го $2кольца "],
["^ (\\S+[ео])е (\\S+[ео])е ([Пп]олу)?[Кк]ольцо ", " $1го $2го $3кольца "],
["^ (\\S+ье) (\\S+[ео])е ([Пп]олу)?[Кк]ольцо ", " $1го $2го $3кольца "],
["^ (\\d+)-е (\\S+[ео])е ([Пп]олу)?[Кк]ольцо ", " $1го $2го $3кольца "],
["^ (\\d+)-е (\\S+ье) ([Пп]олу)?[Кк]ольцо ", " $1го $2го $3кольца "],
["^ (\\d+)-е (\\S+[ео])е ([Пп]олу)?[Кк]ольцо ", " $1-го $2го $3кольца "],
["^ (\\d+)-е (\\S+ье) ([Пп]олу)?[Кк]ольцо ", " $1-го $2го $3кольца "],
["^ ([Пп]олу)?[Кк]ольцо ", " $1кольца "],

["^ (\\S+[ео])е [Шш]оссе ", " $1го шоссе "],
["^ (\\S+ье) [Шш]оссе ", " $1го шоссе "],
["^ (\\S+[ео])е (\\S+[ео])е [Шш]оссе ", " $1го $2го шоссе "],
["^ (\\S+ье) (\\S+[ео])е [Шш]оссе ", " $1го $2го шоссе "],
["^ (\\d+)-е (\\S+[ео])е [Шш]оссе ", " $1го $2го шоссе "],
["^ (\\d+)-е (\\S+ье) [Шш]оссе ", " $1го $2го шоссе "],
["^ (\\d+)-е (\\S+[ео])е [Шш]оссе ", " $1-го $2го шоссе "],
["^ (\\d+)-е (\\S+ье) [Шш]оссе ", " $1-го $2го шоссе "],

[" Третого ", " Третьего "],
[" третого ", " третьего "],
["жого ", "жьего "],
["чого ", "чьего "]
[" ([Тт])ретого ", " $1ретьего "],
["([жч])ого ", "$1ьего "]
],
"prepositional": [
["^ (\\S+)ая [Аа]ллея ", " $1ой аллее "],
@@ -943,21 +938,19 @@
["^ (\\S+ье) ([Пп]олу)?[Кк]ольцо ", " $1м $2кольце "],
["^ (\\S+[ео])е (\\S+[ео])е ([Пп]олу)?[Кк]ольцо ", " $1м $2м $3кольце "],
["^ (\\S+ье) (\\S+[ео])е ([Пп]олу)?[Кк]ольцо ", " $1м $2м $3кольце "],
["^ (\\d+)-е (\\S+[ео])е ([Пп]олу)?[Кк]ольцо ", " $ $2м $3кольце "],
["^ (\\d+)-е (\\S+ье) ([Пп]олу)?[Кк]ольцо ", " $ $2м $3кольце "],
["^ (\\d+)-е (\\S+[ео])е ([Пп]олу)?[Кк]ольцо ", " $1-м $2м $3кольце "],
["^ (\\d+)-е (\\S+ье) ([Пп]олу)?[Кк]ольцо ", " $1-м $2м $3кольце "],
["^ ([Пп]олу)?[Кк]ольцо ", " $1кольце "],

["^ (\\S+[ео])е [Шш]оссе ", " $1м шоссе "],
["^ (\\S+ье) [Шш]оссе ", " $1м шоссе "],
["^ (\\S+[ео])е (\\S+[ео])е [Шш]оссе ", " $1м $2м шоссе "],
["^ (\\S+ье) (\\S+[ео])е [Шш]оссе ", " $1м $2м шоссе "],
["^ (\\d+)-е (\\S+[ео])е [Шш]оссе ", " $ $2м шоссе "],
["^ (\\d+)-е (\\S+ье) [Шш]оссе ", " $ $2м шоссе "],
["^ (\\d+)-е (\\S+[ео])е [Шш]оссе ", " $1-м $2м шоссе "],
["^ (\\d+)-е (\\S+ье) [Шш]оссе ", " $1-м $2м шоссе "],

[" Третом ", " Третьем "],
[" третом ", " третьем "],
["жом ", "жьем "],
["чом ", "чьем "]
[" ([Тт])ретом ", " $1ретьем "],
["([жч])ом ", "$1ьем "]
]
}
}
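
The clean-up rules merged in this diff (for example `[" ([Тт])ретого ", " $1ретьего "]` replacing the separate capitalized and lowercase rules) can be checked for equivalence with a small script. This is a sketch under assumptions: the `applyRules` helper, the `g` flag, and the sample strings are hypothetical; the samples imitate intermediate forms that the earlier genitive rules might produce and are not taken from real data.

```javascript
// Genitive clean-up rules as they were before this change.
const oldRules = [
    [' Третого ', ' Третьего '],
    [' третого ', ' третьего '],
    ['жого ', 'жьего '],
    ['чого ', 'чьего ']
];

// The same rules after merging with capture groups.
const newRules = [
    [' ([Тт])ретого ', ' $1ретьего '],
    ['([жч])ого ', '$1ьего ']
];

// Hypothetical helper: applies the rules in order with the `g` flag,
// passing each result on to the next rule.
function applyRules(text, rules) {
    return rules.reduce(
        (acc, [pattern, replacement]) => acc.replace(new RegExp(pattern, 'g'), replacement),
        text
    );
}

// Hypothetical intermediate strings, already enclosed in spaces.
const samples = [
    ' Третого проезда ',
    ' третого проезда ',
    ' Медвежого переулка ',
    ' Казачого переулка ',
    ' Невского проспекта '
];

// Each entry holds [result with old rules, result with new rules].
const results = samples.map((s) => [applyRules(s, oldRules), applyRules(s, newRules)]);
const allEqual = results.every(([before, after]) => before === after);
```

For these samples `allEqual` is `true`, and the last sample is left unchanged by both rule sets because none of the patterns match it.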