-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apply grammar rules to token replacement values #49
base: master
Are you sure you want to change the base?
Conversation
Added support for the “grammaticalization” system in osrm-text-instructions. The Russian grammar file is now converted into a plist and used to inflect road names according to the cases specified in the instructions.
b0438a8
to
34a9e51
Compare
|
||
for rule in rules { | ||
let regularExpression = try! NSRegularExpression(pattern: rule[0], options: regularExpressionOptions) | ||
grammaticalReplacement = regularExpression.stringByReplacingMatches(in: grammaticalReplacement, options: [], range: NSRange(location: 0, length: grammaticalReplacement.characters.count), withTemplate: rule[1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this method replace all matches not the first only (JS RegExp g
flag is always on)? I hope this will not cause problems but this should be documented in OSRMTI Grammar feature description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the NSRegularExpression.stringByReplacingMatches(in:options:range:withTemplate:)
method always replaces all matches. Replacing just one occurrence would be a more manual affair, either finding the first matching substring and replacing it, or enumerating the matches but stopping the enumeration after the first iteration.
I was thinking that the presence of ^
in each of the regular expression patterns would avoid issues where a rule unexpected matches multiple times, but that would be a brittle assumption unless we remove the ^
anchors from the JSON file and add them to the patterns at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, adding ^
at runtime is bad idea. So we just need to warn developers in Grammar description about implicitly turned on g
flag in Swift and perhaps add some tests for this
Upgraded to osrm-text-instructions v0.9.0. The only substantive change is that Russian grammar rules are now applied to a token replacement value before modification by the client and insertion into the overall instruction string.
The sole unit test for this feature doesn’t run by default: it only runs when the environment is in Russian, since that’s the only language for which grammar rules have been supplied so far.
Fixes #48.
/cc @bsudekum @frederoni @yuryleb