Allow localised/translated street type search #2944

Open
richlv opened this issue Jan 10, 2023 · 4 comments

richlv commented Jan 10, 2023

Nominatim special phrases functionality allows translating object types (for example, "atm").

It would be great to also support translation of street types.
This would make the search more accessible, and reduce the incentive to translate all street names in all the languages.

For example, "Brīvības street" would find "Brīvības iela" - "iela" being Latvian for "street".

Perhaps some consideration might be needed for offset or circular words. For example:
"street" in Latvian is "iela"; in Lithuanian it is "gatve".
In Latvian, "gatve" means "avenue". The Lithuanian word for "avenue" seems to be "alėja".
But then Latvian has "aleja", which also seems to translate to Lithuanian "alėja".

These might not be a concern at all; I have no clue how the implementation might work.

lonvia (Member) commented Jan 17, 2023

This problem exists not only for streets but for pretty much all names in OSM. Another prominent example is the translation of administrative boundary names, e.g. in Poland.

So I'm thinking about replacing the special phrases (which are mainly for searching a POI by its type) with more generic category words. This is a rather vague idea at the moment, but it would work roughly like this:

  • Somewhere there is a list which states which tag combination (potentially in which language and/or country) triggers a category word. For example, a Latvian name with highway=* has category words "iela", "gatve", "aleja".
  • When Nominatim imports data and parses an object that matches the tags in the list, it matches the name against category words and creates an additional artificial name with the category removed. So "Brīvības iela" would also get indexed as "Brīvības".
  • When searching, category words are detected as well and handled as optional. So the search for "Brīvības street" would match against the indexed "Brīvības" because 'street' is listed as a category word for highway=* in English.
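
The three steps above could be sketched roughly as follows. This is only an illustration of the idea, not Nominatim code: the table layout, the language codes and every function name (`words_for`, `strip_category`, `index_names`, `matches`) are assumptions made up for the example, and only the category words mentioned in this thread are filled in.

```python
# Hypothetical category-word table: (tag key, language) -> category words.
# The words come from this discussion; the structure is an assumption.
CATEGORY_WORDS = {
    ("highway", "lv"): {"iela", "gatve", "aleja"},
    ("highway", "en"): {"street", "avenue"},
}


def words_for(tag_key, language=None):
    """Category words for a tag key, in one language or across all of them."""
    result = set()
    for (key, lang), words in CATEGORY_WORDS.items():
        if key == tag_key and (language is None or lang == language):
            result |= words
    return result


def strip_category(name, category_words):
    """Return the name without category words.

    Returns None if no word was removed or if nothing would remain.
    """
    tokens = name.split()
    kept = [t for t in tokens if t.lower() not in category_words]
    if not kept or len(kept) == len(tokens):
        return None
    return " ".join(kept)


def index_names(name, tag_key, language):
    """Import side: keep the original name and add an artificial one."""
    names = [name]
    stripped = strip_category(name, words_for(tag_key, language))
    if stripped is not None:
        names.append(stripped)
    return names


def matches(query, indexed_names, tag_key):
    """Search side: category words in any language are treated as optional."""
    candidates = {query.lower()}
    stripped = strip_category(query, words_for(tag_key))
    if stripped is not None:
        candidates.add(stripped.lower())
    return any(n.lower() in candidates for n in indexed_names)
```

With this sketch, `index_names("Brīvības iela", "highway", "lv")` yields both "Brīvības iela" and "Brīvības", and a query for "Brīvības street" matches the second entry. Plain word removal like this is exactly what breaks down in inflected languages.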

This solution would solve some issues with translation, with the question of whether a category word is part of the name or not ('Hotel Bellevue' vs. 'Bellevue'), and also issues in Spain, where the street category is preferably left out.

However, there is a whole can of unsolved questions about the implementation. Not least is the problem that in some inflected languages you can't expect to just remove a word and have a sensible new name (look at the translation in the Polish example).

richlv (Author) commented Jan 17, 2023

That sounds great - Estonians also leave out street category.
Not quite sure about the word removal resulting in a less descriptive name. Didn't grasp it from the Polish example. What would be an example of it being a major problem?

i-ky commented Jan 17, 2023

> in some inflected languages you can't expect to just remove a word and have a sensible new name

Would it be possible for Nominatim to run short code snippets (in Lua, Python, JavaScript, or whatever easily embeddable language) to come up with an "artificial name" for indexing? I am pretty sure that the Polish example ("powiat zgierski" → "Zgierz County") can be coded relatively easily.
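
As a rough illustration of what such a snippet might look like, here is a minimal Python sketch for the Polish case. It does not use any Nominatim API: the function `artificial_name` and the table `POWIAT_BASE_NAMES` are hypothetical, and the table holds only the one example from this thread. It uses a lookup rather than word removal because dropping "powiat" alone would leave the adjective "zgierski", not the place name "Zgierz".

```python
# Hypothetical rule table: adjective form used after "powiat" -> base place name.
# Only the example from this discussion is filled in; a real table (or a
# morphological rule set) would need proper linguistic input.
POWIAT_BASE_NAMES = {
    "zgierski": "Zgierz",
}


def artificial_name(name):
    """Derive a category-free name for 'powiat <adjective>' style names.

    Returns None when the name does not follow that pattern or the
    adjective is unknown, so the caller can simply index nothing extra.
    """
    parts = name.split()
    if len(parts) == 2 and parts[0].lower() == "powiat":
        return POWIAT_BASE_NAMES.get(parts[1].lower())
    return None
```

Here `artificial_name("powiat zgierski")` returns "Zgierz", while a name outside the pattern, such as "Brīvības iela", returns `None`.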

lonvia (Member) commented Jan 17, 2023

Nominatim has a framework now for plugging in Python code to do arbitrary preprocessing of names. This would go somewhere in there.
