Triage OSM tag errors #25

Open
3 of 6 tasks
newsch opened this issue Aug 9, 2023 · 3 comments

@newsch
Collaborator

newsch commented Aug 9, 2023

There are around 600 wikipedia/wikidata tags in the full planet dump that cannot be parsed by our current setup.
See #19 and #23 for more details.

They should all be values that don't conform to the expected format.
Some of these we can fix on OSM ourselves; for the others we can leave notes.

  • Add an error for titles longer than 255 bytes of UTF-8
  • Add an error for language codes longer than some limit (check the standard)
  • Add a subcommand to dump the errors to disk in a structured way (started in #23, "Add osm tag file parsing").
  • Categorize them by solution
  • Add any new issues for parsing problems
  • Contribute fixes/notes to OSM
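
As an illustration of the first task, here is a minimal sketch of a title length check. It uses only the standard library; `TitleError`, `MAX_TITLE_BYTES`, and `check_title_length` are hypothetical names chosen for this example and are not taken from the project's code.

```rust
use std::fmt;

/// Limit from the task list above: 255 bytes of UTF-8.
const MAX_TITLE_BYTES: usize = 255;

/// Hypothetical error type for this sketch; the project's real types may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum TitleError {
    /// The title exceeds `MAX_TITLE_BYTES` when encoded as UTF-8.
    TooLong { bytes: usize },
}

impl fmt::Display for TitleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TitleError::TooLong { bytes } => write!(
                f,
                "title is {} bytes of UTF-8, limit is {}",
                bytes, MAX_TITLE_BYTES
            ),
        }
    }
}

impl std::error::Error for TitleError {}

/// Rejects titles longer than `MAX_TITLE_BYTES` bytes.
pub fn check_title_length(title: &str) -> Result<(), TitleError> {
    // `str::len` returns the length in bytes, not in characters,
    // so multi-byte characters count for more than one.
    let bytes = title.len();
    if bytes > MAX_TITLE_BYTES {
        Err(TitleError::TooLong { bytes })
    } else {
        Ok(())
    }
}
```

Because the limit is in bytes, a title of 128 two-byte characters such as `é` is rejected even though it has fewer than 255 characters.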
newsch added this to the v0.2 milestone Aug 9, 2023
@biodranik
Member

You can create an issue and share a file with the OSM community.

@newsch
Collaborator Author

newsch commented Aug 9, 2023

Do you mean this forum? I haven't used it before, but I will add a note: https://community.openstreetmap.org

newsch added a commit that referenced this issue Aug 9, 2023
- Add custom error types with `thiserror` crate in preparation for #25.
- Parsing errors are captured instead of logged to `warn` by default.
    - All parsing errors are still logged to `debug` level.
    - If >= 0.02% of tags can't be parsed, an error is logged.
    - TSV line errors are always logged as errors.
    - I/O errors now cause a failure instead of being logged.

Signed-off-by: Evan Lloyd New-Schmidt <[email protected]>
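
The 0.02% threshold mentioned in the commit message can be sketched as follows. This is an assumption about how such a check might look, not the code from the commit; `error_rate_too_high` is a hypothetical name.

```rust
/// Returns true when at least 0.02% of tags failed to parse.
/// Hypothetical helper for illustration only.
pub fn error_rate_too_high(failed: u64, total: u64) -> bool {
    // With no tags at all there is nothing to report.
    if total == 0 {
        return false;
    }
    // 0.02% is 2 / 10_000. Cross-multiplying keeps the comparison in
    // integers and avoids floating-point rounding; u128 avoids overflow.
    (failed as u128) * 10_000 >= (total as u128) * 2
}
```

For example, 2 failures out of 10,000 tags meets the threshold, while 1 out of 10,000 does not.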
@biodranik
Copy link
Member

It's better to ask on a chat channel like IRC or Telegram. Or post a question in our chat; there are some OSMers there who may help.

newsch added a commit that referenced this issue Aug 10, 2023