🚚 Stop copying yaml content when a new language is added (#5823)
This is a draft PR meant to collect feedback for a solution of issue #5187.

**Description of the problem**
When a new language is added in Weblate, the prefill add-on copies the English content into the new language's yaml files. As a result, whenever the English content changes, we have to manually apply the change to every language that has not translated that content. Instead, we would like the yaml files to contain only content that has actually been translated.

**Yaml files**
Newly added content yaml files are no longer prefilled with the current English text; they contain only the values that have been translated into the corresponding language. Currently, only the om language has partial content, to illustrate the solution. When a yaml file is loaded, its content is merged with that of its en.yaml counterpart.
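
The merge-on-load step could look roughly like the sketch below. It is only an illustration of the idea, not the code in this PR: the function name `merge_with_english` and the sample data are hypothetical, and it assumes the yaml files have already been parsed into plain dictionaries, lists and scalars.

```python
from copy import deepcopy


def merge_with_english(lang, en):
    """Overlay translated values from `lang` on top of the English content `en`.

    Hypothetical helper: the English structure drives the walk, so anything
    missing from the language file is filled in from English.
    """
    if isinstance(en, dict):
        if not isinstance(lang, dict):
            return deepcopy(en)  # nothing usable in the language file
        return {key: merge_with_english(lang[key], value) if key in lang else deepcopy(value)
                for key, value in en.items()}
    if isinstance(en, list):
        if not isinstance(lang, list):
            return deepcopy(en)
        # Arrays are merged by position; missing tail elements come from English.
        return [merge_with_english(lang[i], value) if i < len(lang) else deepcopy(value)
                for i, value in enumerate(en)]
    # Scalars: the translated value wins when there is one.
    if lang is None or isinstance(lang, (dict, list)):
        return en
    return lang


# Made-up sample data, not taken from the repository.
en = {'title': 'Hello', 'adventures': {'story': {'name': 'Story', 'levels': ['a', 'b']}}}
om = {'adventures': {'story': {'name': 'Seenaa', 'levels': ['x']}}}
merged = merge_with_english(om, en)
```
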
- Schema validation. The schema definitions have explicit required fields and only the en.yaml files are checked against them. A python script generates new schema definitions (*.generated.schema.json) that lift the required fields constraints. The generation is part of the doit backend task and the schemas are gitignored. The generated schemas are used for all non-en.yaml files.
- Correctness check added in #4020. We have a script that checks whether array lengths differ between English and the other languages. This check is amended to ensure that the structure of each language yaml file is a subset of the English file. It now checks that:
  - Corresponding nodes in the two yaml files are of the same type, specifically arrays and dictionaries.
  - An array in a language file should not have more elements than the English file.
  - A dictionary in the language file should not have keys that are not present in the English file.
- Note that the merging mechanism silently discards the content of the language file if any of the correctness rules is not met. The assumption is that the English file has the correct structure. This is meant to be in line with the existing behavior of silently returning an empty dictionary if an IOError occurs while loading the yaml file. However, we could take a different approach here: fail and just load the fallback? Or at least log that something went wrong?
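
The three correctness rules above can be sketched as a recursive subset check. This is a minimal sketch under stated assumptions, not the script from #4020: `is_structural_subset` and `usable_language_content` are hypothetical names, and the inputs are assumed to be already-parsed yaml.

```python
def is_structural_subset(lang, en):
    """Return True when `lang` has no structure that `en` lacks (hypothetical helper)."""
    if isinstance(lang, dict) or isinstance(en, dict):
        # Rule 1: both nodes must be dictionaries.
        if not (isinstance(lang, dict) and isinstance(en, dict)):
            return False
        # Rule 3: no keys that are absent from the English file.
        return all(key in en and is_structural_subset(value, en[key])
                   for key, value in lang.items())
    if isinstance(lang, list) or isinstance(en, list):
        # Rule 1 again: both nodes must be arrays.
        if not (isinstance(lang, list) and isinstance(en, list)):
            return False
        # Rule 2: no more elements than the English array.
        if len(lang) > len(en):
            return False
        return all(is_structural_subset(item, en_item)
                   for item, en_item in zip(lang, en))
    # Scalars carry no structure to compare.
    return True


def usable_language_content(lang, en):
    """Return `lang` if it passes the check, otherwise an empty dict.

    Mirrors the 'silently discard' behavior described above; logging or
    raising here would be the alternative raised in the open question.
    """
    return lang if is_structural_subset(lang, en) else {}
```

Returning an empty dictionary means a later merge with English yields the English content unchanged, which is the silent-fallback behavior in question.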

**Po files**
Just like the yaml files, all newly added .po files will not be prefilled with English and, again, the om language is used as an example. The merging of every language with English happens through a custom function that defines a fallback. The regular gettext function is substituted with the custom one.
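
As a rough illustration of the fallback idea, the sketch below uses only the standard library `gettext` module and in-memory catalogs. It is not the `gettext_with_fallback` implementation from this PR: `DictTranslations`, `translate_with_fallback`, the message ids and the sample strings are all hypothetical, and it assumes message ids are keys rather than English text.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog standing in for a compiled .po file."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        translated = self._catalog.get(message)
        if translated:  # missing or empty entries count as untranslated
            return translated
        # NullTranslations.gettext defers to the fallback catalog, if one is set,
        # and otherwise returns the message id itself.
        return super().gettext(message)


# Placeholder catalogs; the strings are made up for the example.
english = DictTranslations({'hello': 'Hello!', 'bye': 'Goodbye!'})
oromo = DictTranslations({'hello': 'Akkam!'})
oromo.add_fallback(english)


def translate_with_fallback(message_id):
    """Look the message up in the language catalog first, then in English."""
    return oromo.gettext(message_id)
```

With these catalogs, `translate_with_fallback('hello')` comes from the om catalog, while `translate_with_fallback('bye')` falls through to English.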

**Deduplication of existing languages**
The om language is added to this PR for illustration purposes and is not an officially supported language. Since the yaml files of all current languages already contain a copy of the English content, the merging will not yield immediate results; it will take effect only when a new language is added. Perhaps it is wise to run a basic form of deduplication on the current files and test immediately. Note that complete deduplication of the yaml files is currently very hard to achieve because every yaml file copied a different version of the English content. However, we could simplify the problem by removing only the duplication that matches the current English version.
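
The simplified deduplication (dropping only values identical to the current English version) might be sketched as follows. This is a hypothetical helper, not part of the PR; it assumes parsed yaml and deliberately treats arrays as a single unit, because removing individual elements would shift the positions that a positional merge relies on.

```python
def strip_english_duplicates(lang, en):
    """Return the part of `lang` that differs from `en`, or None if nothing differs."""
    if isinstance(lang, dict) and isinstance(en, dict):
        pruned = {}
        for key, value in lang.items():
            if key not in en:
                pruned[key] = value  # unknown key: keep it, validation is a separate step
                continue
            kept = strip_english_duplicates(value, en[key])
            if kept is not None:
                pruned[key] = kept
        return pruned or None
    # Arrays and scalars are compared as a whole.
    return None if lang == en else lang
```

Content copied from an older English version would survive this pass, which matches the limitation described above.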

Co-Authored-By: Languages add-on <[email protected]>
Co-Authored-By: weblate <[email protected]>
3 people authored Oct 29, 2024
1 parent a6c2040 commit 5a9b70d
Showing 29 changed files with 335 additions and 109 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -189,4 +189,7 @@ file_logger.json
# We submitted a PR to the package, but it's still wating to be approved
# Until then, we can include it in .gitignore
# More info in #4540
-foo.txt
+foo.txt
+
+# Generated schema json files
+*.generated.schema.json
8 changes: 6 additions & 2 deletions app.py
@@ -24,7 +24,8 @@
from flask import (Flask, Response, abort, after_this_request, g, jsonify, make_response,
                   redirect, request, send_file, url_for,
                   send_from_directory, session)
-from flask_babel import Babel, gettext
+from flask_babel import Babel
+from website.flask_helpers import gettext_with_fallback as gettext
from website.flask_commonmark import Commonmark
from flask_compress import Compress
from urllib.parse import quote_plus
@@ -443,6 +444,10 @@ def before_request_https():
Compress(app)
Commonmark(app)

+# Explicitly substitute the flask gettext function with our custom definition which uses fallback languages
+app.jinja_env.globals.update(_=gettext)
+
+
# We don't need to log in offline mode
if utils.is_offline_mode():
    parse_logger = s3_logger.NullLogger()
@@ -2313,7 +2318,6 @@ def favicon():
@app.route('/index.html')
def main_page():
    sections = hedyweb.PageTranslations('start').get_page_translations(g.lang)['home-sections']
-
    sections = sections[:]

    # Sections have 'title', 'text'
18 changes: 14 additions & 4 deletions build-tools/github/validate-yaml
@@ -23,16 +23,26 @@ echo "------> Validating YAML"
# 'npx pajv validate' just hangs. Running the 'pajv' binary directly without the use of
# 'npx' does work... so we're just going to ¯\_(ツ)_/¯ and do that.

-all_schemas=$(find content -name \*.schema.json)
+schemas=$(find content -name \*.schema.json)

failures=false

-for schema in $all_schemas; do
+for schema in $schemas; do
    dir=$(dirname $schema)
-    echo "------> Validating $(basename $dir)"

+    # The non-generated schema files that have required fields. They should be
+    # used for 'en.yaml' files. The generated schema files allow all fields to be
+    # optional, so they should be used to validate the rest of the yaml files.
+    if [[ $schema == *".generated."* ]]; then
+        files="*";
+        echo "------> Validating with optional fields $(basename $dir)/*.yaml"
+    else
+        files="en";
+        echo "------> Validating with required fields $(basename $dir)/en.yaml"
+    fi
+
    # Run the validator.
-    if ! check-jsonschema -o text --schemafile $schema $dir/*.yaml > validate.txt; then
+    if ! check-jsonschema -o text --schemafile $schema $dir/$files.yaml > validate.txt; then
        cat validate.txt || true
        failures=true
    fi
9 changes: 1 addition & 8 deletions content/adventures/adventures.schema.json
@@ -3,20 +3,13 @@
  "type": "object",
  "additionalProperties": false,
  "properties": {
-    "title": {
-      "type": "string",
-      "description": "Short title of the adventure"
-    },
-    "subtitle": {
-      "type": "string",
-      "description": "Slightly longer introductory description of the adventure"
-    },
    "adventures": {
      "type": "object",
      "description": "Individual adventures, key/value map",
      "additionalProperties": { "$ref": "#/definitions/Adventure" }
    }
  },
+  "required": ["adventures"],
  "definitions": {
    "Adventure": {
      "type": "object",
40 changes: 20 additions & 20 deletions content/slides/id.yaml
@@ -130,7 +130,7 @@ levels:
{print} What song would you like to hear?
{ask} I like that song too!
{print} Next up... {echo}
-debug: true
+debug: 'True'
13:
header: Let the programming fun begin!
text: Enjoy the adventures in level 1!
@@ -199,7 +199,7 @@ levels:
{print} I'll go get your donut. {sleep}
{print} Here you go! A filling donut with toping!
{ask} Have a nice day!
-debug: true
+debug: 'True'
8:
header: Biarkan kesenangan pemrograman dimulai!
text: Nikmati petualangan di level 2!
@@ -264,7 +264,7 @@ levels:
{print} or do you prefer... second_choice {at} {random}
{remove} second_choice {to} music_genres
{print} I like music_genre {random} best!
-debug: true
+debug: 'True'
8:
header: Ayo mulai bekerja!
text: Nikmati petualangan di level 3!
@@ -279,7 +279,7 @@ levels:
code: |-
name {is} Sophie
{print} My name is name
-debug: true
+debug: 'True'
3:
header: Memperbaikinya dengan tanda kutip
text: |-
@@ -317,7 +317,7 @@ levels:
Silakan coba mencetak kontraksi seperti "Anda" atau "Saya" pada layar di bawah ini dan lihat apa yang terjadi....
code: '{print} ''This won''t work!'''
-debug: true
+debug: 'True'
9:
header: Jelas
text: |-
@@ -346,7 +346,7 @@ levels:
colors {is} 'orange, silver, white, brown'
{print} 'I love the colors {at} {random} one!'
choice {is} {ask} Which one do you like?
-debug: true
+debug: 'True'
11:
header: Siap, Bersiap, Ayo!
text: Nikmati petualangan di level 4!
@@ -367,7 +367,7 @@ levels:
header: Jangan lupa untuk mencetak
text: Saat menggunakan perintah `{if}`, jangan lupa untuk menggunakan perintah `{print}`.
code: '{if} name {is} Hedy ''nice'''
-debug: true
+debug: 'True'
4:
header: pula
text: |-
@@ -413,7 +413,7 @@ levels:
item_to_declare {is} {ask} 'What would you like to declare'
{else} Alright
{print} Thank you. Please head to gate A22.'
-debug: true
+debug: 'True'
8:
header: Ayo pergi!
text: Nikmati petualangan di level 5!
@@ -478,7 +478,7 @@ levels:
{if} day {is} monday
total_price = total_price * 0.25
{print} 'That will be total_price please'
-debug: true
+debug: 'True'
10:
header: Ayo mulai bekerja!
text: Nikmati petualangan di level 6!
@@ -501,7 +501,7 @@ levels:
header: Jangan lupa print command
text: Saat menggunakan perintah ulangi, jangan lupa perintah `{print}`.
code: '{repeat} 5 {times} ''Help!'''
-debug: true
+debug: 'True'
4:
header: Ulangi perintah tanya
text: Anda juga dapat mengulangi perintah `{ask}`, `{if}`, atau `{else}` beberapa kali.
@@ -523,7 +523,7 @@ levels:
{if} yes
{print} 'Hurray!
{else} 'That's a shame... Oh well... time to build a shelter and find some food.'
-debug: true
+debug: 'True'
6:
header: Siap Berangkat!
text: Nikmati petualangan di level 7!
@@ -538,7 +538,7 @@ levels:
Anda hanya dapat mengulang satu baris kode.
code: '{repeat} 5 {times} {print} ''Help!'''
-debug: true
+debug: 'True'
3:
header: '{repeat} perintah sebelumnya'
text: |-
@@ -567,7 +567,7 @@ levels:
code: |-
{if} name {is} Hedy {print} 'nice'
{else} {print} 'boo!'
-debug: true
+debug: 'True'
6:
header: jika dan yang lain sekarang
text: |-
@@ -610,7 +610,7 @@ levels:
{print} You chose a round trip ticket'
price * 2
{print} 'That will be ' price ' euros please'
-debug: true
+debug: 'True'
10:
header: Mari kita lihat petualangannya!
text: Nikmati petualangan di level 8!
@@ -702,7 +702,7 @@ levels:
{else}
{print} 'Fun!'
{print} 'Thanks for filling in the safety questions everyone. Enjoy your jump!'
-debug: true
+debug: 'True'
9:
header: Ayo pergi!
text: Nikmati petualangan di level 9!
@@ -739,7 +739,7 @@ levels:
{add} chosen_person {from} people
{print} 'Come and watch our show tonight!'
{print} 'Tickets are only available at the counter
-debug: true
+debug: 'True'
5:
header: Saatnya memprogram!
text: Nikmati petualangan di level 10!
@@ -776,7 +776,7 @@ levels:
{repeat} {for} numbers {in} {range} 1 {to} 10 {times}
{print} This is the table of multiplications for factor
{print} number ' x ' factor ' = ' i * factor
-debug: true
+debug: 'True'
5:
header: Mari kita mulai pemrograman!
text: Nikmati petualangan di level 11!
@@ -841,7 +841,7 @@ levels:
{call} new member
{else}
password = {ask} 'Please enter password'
-debug: true
+debug: 'True'
8:
header: Siap mencobanya?
text: Nikmati petualangan di level 12!
@@ -951,7 +951,7 @@ levels:
{call} happiness {with} person
{else} mood = sad
{define} sadness {to} name
-debug: true
+debug: 'True'
9:
header: Ayo!
text: Nikmati petualangan di level 13!
@@ -1033,7 +1033,7 @@ levels:
{print} 'Shame.. I wont buy it'
{else}
{print} 'I will buy it! Thank you!'
-debug: true
+debug: 'True'
7:
header: Ayo mulai bekerja!
text: Nikmati petualangan di level 14!
50 changes: 50 additions & 0 deletions content/slides/slides.schema.json
@@ -0,0 +1,50 @@
+{
+  "title": "JSON Schema for Hedy Slides",
+  "type": "object",
+  "additionalProperties": false,
+  "properties": {
+    "levels": {
+      "type": "object",
+      "description": "Levels with exercise for the Hedy Slides",
+      "additionalProperties": {
+        "$ref": "#/definitions/Level"
+      }
+    }
+  },
+  "required": [
+    "levels"
+  ],
+  "definitions": {
+    "Level": {
+      "type": "object",
+      "additionalProperties": {
+        "$ref": "#/definitions/Exercise"
+      }
+    },
+    "Exercise": {
+      "type": "object",
+      "properties": {
+        "header": {
+          "type": "string"
+        },
+        "text": {
+          "type": "string"
+        },
+        "editor": {
+          "type": "string"
+        },
+        "code": {
+          "type": "string"
+        },
+        "debug": {
+          "type": "string"
+        }
+      },
+      "required": [
+        "header",
+        "text"
+      ],
+      "additionalProperties": false
+    }
+  }
+}
1 change: 1 addition & 0 deletions content/tutorials/tutorials.schema.json
@@ -6,6 +6,7 @@
    "intro": { "$ref": "#/definitions/Tutorial" },
    "teacher": { "$ref": "#/definitions/Tutorial" }
  },
+  "required": ["intro", "teacher"],
  "definitions": {
    "Tutorial": {
      "type": "object",
19 changes: 19 additions & 0 deletions dodo.py
@@ -318,6 +318,24 @@ def task_extract():
    )


+def task_generate_optional_yaml_schemas():
+    """
+    Generate yaml schemas with all fields optional
+    """
+    schemas = glob('content/*/*.schema.json')
+
+    return dict(
+        title=lambda _: 'Generate optional yaml schemas',
+        file_dep=[
+            'tools/generate-yaml-schemas.py',
+            *schemas
+        ],
+        actions=[
+            [python3, 'tools/generate-yaml-schemas.py']
+        ]
+    )
+
+
def task_devserver():
    """Run a copy of the development server.
@@ -372,6 +390,7 @@ def task_backend():
    return dict(
        actions=None,
        task_dep=[
+            'generate_optional_yaml_schemas',
            'compile_babel',
            'generate_static_babel_content',
            'lark',
2 changes: 1 addition & 1 deletion hedy.py
@@ -2,7 +2,7 @@
from functools import lru_cache

import lark
-from flask_babel import gettext
+from website.flask_helpers import gettext_with_fallback as gettext
from lark import Lark
from lark.exceptions import UnexpectedEOF, UnexpectedCharacters, VisitError
from lark import Tree, Transformer, visitors, v_args
2 changes: 1 addition & 1 deletion hedy_error.py
@@ -1,7 +1,7 @@
import hedy
import hedy_translation
import re
-from flask_babel import gettext
+from website.flask_helpers import gettext_with_fallback as gettext


# TODO: we should not maintain a list like this. Translation of exception arguments should happen when the exception