Automatic detection of style drifts and translation mismatches #354

Open
rousik opened this issue Apr 18, 2020 · 3 comments
Labels
enhancement (New feature or request), ops (operational needs), translation

Comments

@rousik
Collaborator

rousik commented Apr 18, 2020

When we make style changes or rework the layout of the site, the translated content can drift from the head. Header depths may differ, or some CSS styling may not be applied. I have also encountered German translations where markdown markers were omitted entirely.

While we can probably catch some of these things manually, it might be worth looking into whether we can detect structural differences between the EN and translated content, so that we can flag the affected strings for verification/correction.

I have looked into whether syntactic/structural analysis of markdown itself can be done, but haven't found anything promising. Given that the md files are converted into HTML before rendering, we might instead compare the element trees of the rendered markdown fragments.
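A minimal sketch of that element-tree comparison, for illustration only. It assumes each markdown fragment has already been rendered to HTML by whatever renderer the site uses, and it relies only on the Python standard library. The names `TagOutline`, `outline`, and `structure_diff` are hypothetical and are not taken from any existing script.

```python
from difflib import unified_diff
from html.parser import HTMLParser


class TagOutline(HTMLParser):
    """Collects opening tags, indented by nesting depth, and ignores all text."""

    # Void elements have no closing tag, so they must not change the depth.
    VOID = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append("  " * self.depth + tag)
        if tag not in self.VOID:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in self.VOID:
            self.depth = max(0, self.depth - 1)


def outline(html):
    """Returns the tag outline of a rendered HTML fragment as a list of lines."""
    parser = TagOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


def structure_diff(source_html, translated_html):
    """Returns unified-diff lines; an empty list means the structures match."""
    return list(unified_diff(outline(source_html), outline(translated_html),
                             "en", "translation", lineterm=""))
```

Because the text content is discarded, two fragments compare equal whenever their tags nest the same way, regardless of language. A translation that turns an `h2` into an `h3`, or drops a `strong` because the `**` markers were omitted, would show up as a non-empty diff.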

I know Lokalise has some features to highlight structural inconsistencies, but it seems that it only captures HTML tags and doesn't fully take markdown syntax into account.

rousik added the enhancement, ops, and translation labels Apr 18, 2020
@rousik
Collaborator Author

rousik commented Apr 18, 2020

@emersonthis and @nditada for visibility

@rousik
Collaborator Author

rousik commented Apr 20, 2020

A prototype of this script has been pushed to the md-syntax-compare branch:
https://github.com/flattenthecurve/guide/blob/md-syntax-compare/_scripts/compare_md_structure.py

The next steps are:

  1. Run this script and figure out which tags to exclude/include
  2. Identify structural differences in translation files that need to be fixed
  3. See if this script could be run on language-preview branches to identify problems early
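To make the first two steps concrete, here is a rough sketch of how a tag exclusion set and per-string flagging might fit together. It is not the code in the prototype script: the exclusion set, the `flag_mismatches` helper, and the dict-of-strings input shape are all assumptions made for illustration, and the inputs are assumed to be already-rendered HTML fragments keyed by string ID.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical exclusion set: inline tags translators may legitimately add or drop.
EXCLUDED_TAGS = frozenset({"em", "br"})


class TagCounter(HTMLParser):
    """Counts opening tags, skipping any tag in the exclusion set."""

    def __init__(self, excluded):
        super().__init__()
        self.excluded = excluded
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag not in self.excluded:
            self.counts[tag] += 1


def tag_counts(html, excluded=EXCLUDED_TAGS):
    parser = TagCounter(excluded)
    parser.feed(html)
    parser.close()
    return parser.counts


def flag_mismatches(source, translated, excluded=EXCLUDED_TAGS):
    """Maps each string key to (missing, extra) tag counts where they differ."""
    flagged = {}
    for key, source_html in source.items():
        if key not in translated:
            continue  # untranslated strings are a separate problem
        want = tag_counts(source_html, excluded)
        have = tag_counts(translated[key], excluded)
        if want != have:
            # Counter subtraction keeps only the positive differences.
            flagged[key] = (dict(want - have), dict(have - want))
    return flagged
```

For the third step, a wrapper could exit with a non-zero status whenever `flag_mismatches` returns anything, which would let a check on a language-preview branch fail early. Whether that fits the existing branch workflow is an open question.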

@rousik
Collaborator Author

rousik commented May 5, 2020

I have been using this script to spot check markdown issues in newly launched languages and it seems to be helpful. I will need to document the best practices around using it in our processes and, ideally, see if we could also automatically integrate it with language PRs.
