Translation Workflow
1. panicbutton.io (jekyll site)
2. PanicButton (Android app)
[Status: Existing in the panicbutton.io repository]
The markdown files in the panicbutton.io Jekyll site's _posts directory serve as the source files for the translation process. Authors use Prose to edit the markdown source files, in English, triggering a rebuild of the site on update of a markdown post.
Files that move from the panicbutton.io repository to the PanicButton Android app through the Transifex translation process are located in _posts/mobile and _posts/help. These relevant markdown files contain the yaml "categories" assignment "mobile" or "help".
[Status: script completed in panicbutton.io/_locales/lib/tx_md2jsonKV.py]
Because the markdown files for panicbutton.io contain extensive YAML front matter, they must be converted to JSON before being sent to Transifex, which does not handle the mixed YAML/markdown source well. A push to the panicbutton.io repository triggers travis.yaml to launch the Python script tx_md2jsonKV.py, which converts all markdown files in _posts into JSON Key-Value files. tx_md2jsonKV.py builds these new JSON Key-Value files in the _locales directory.
In this conversion process, there is one JSON file for each language, with each master JSON file containing yaml front-matter and content of that language's markdown files. Each item of yaml front-matter represents a new key, and the markdown content also comprises one key, "content". For example, if the folder _posts/mobile/ has ten markdown files, the resulting JSON file, mobile.json, will comprise all ten of the markdown files, each broken down into its yaml and markdown key-value pairs.
To run `tx_md2jsonKV.py`, navigate into `_locales/lib` and run `python tx_md2jsonKV.py`. Dependencies: JQ.
`tx_md2jsonKV.py` proceeds through the following steps:
All translated markdown files, when returned from Transifex, are contained in language-specific directories within the `_posts` directory. Because we only want to translate the English language files within `_posts`, we must exclude directories of translated files from the JSON KV generation process. `tx_md2jsonKV.py` first creates an array of the existing language directories, taken from `config.py`.
Note: The language array in `config.py` is, at present, manually populated. In the future, it should be populated using some type of link to the panicbutton.io Transifex interface, which lists all available languages for translation. This is to be addressed in future iterations of the script.
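The exclusion of translated directories can be sketched as follows. This is a minimal illustration, not the code in `tx_md2jsonKV.py`: the `LANGUAGES` list and the function name `english_markdown_files` are hypothetical stand-ins for the language array in `config.py` and whatever the script actually does.

```python
import os

# Hypothetical stand-in for the manually populated language array in config.py.
LANGUAGES = ["es", "fr"]


def english_markdown_files(posts_dir, languages=LANGUAGES):
    """Yield markdown file paths under posts_dir, skipping language directories."""
    skip = set(languages)
    for root, dirs, files in os.walk(posts_dir):
        # Pruning dirs in place stops os.walk from descending into them.
        dirs[:] = sorted(d for d in dirs if d not in skip)
        for name in sorted(files):
            if name.endswith(".md"):
                yield os.path.join(root, name)
```

Because directories are pruned by name at any depth, the sketch skips a language directory whether it sits directly under `_posts` or under a category directory.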
####Processing each markdown file
When it is run, `tx_md2jsonKV.py` walks through all directories in `_posts` not containing translated content, performing the following steps on each file:
- First, `tx_md2jsonKV.py` locates the second "---" YAML demarcation, splitting the markdown "content" of the file from the YAML front-matter. The script writes the markdown text to `_locales/lib/temp/{FILENAME}.md.txt`.
- Next, the script opens the original markdown file again, this time to harvest the YAML front-matter. The script takes everything above the second "---" YAML demarcation. The script assigns each piece of YAML front matter a unique key, which is used to retain the original order of the YAML front-matter at the close of the Transifex process. This key also serves as an ID that allows for easy replacement of the original, untranslated YAML front-matter by the returned, translated JSON content.

  The key is attached at the beginning of each front-matter item, using the format `K{NUM}-{FRONTMATTERITEM}`.

  Each file's YAML front matter, with the key attached, is written to `_locales/lib/temp/{FILENAME}.md.yml`.
- Each YAML file in `_locales/lib/temp/{FILENAME}.md.yml` is opened. The YAML front-matter, complete with the keys assigned in the previous step, is converted into JSON using the module `yaml2json.py`, imported into `tx_md2jsonKV.py`. The resulting JSON file is written to `_locales/lib/temp/{FILENAME}.md.json.orig`.
- Next, each `.json.orig` file (`_locales/lib/temp/{FILENAME}.md.json.orig`) is fed into the script `pre-tx-push.jq`. This script uses JQ, a command-line JSON processor.
First, the JQ file imports white-listed YAML keys, contained in the file `jqconfig.json`. These keys are white-listed because they are assigned to translatable content in the markdown files. The `jqconfig.json` file is currently maintained by hand. TODO: make the file accessible to content editors/site administrators without full backend access, and automate the process of populating it.
`jqconfig.json` specifies `keys` and `subkeys`. `keys` are YAML front-matter items that are paired directly with translatable content. `subkeys` contain arrays of keys, some of which are linked to translatable content.
`pre-tx-push.jq` first cycles through each of the `keys`. If a key is linked to a top-level YAML item, it creates a new JSON KV pair. The key is `{FILENAME}---{KEYNAME}` and the value is the content of that key.
Next, `pre-tx-push.jq` cycles through each of the `subkeys`. If a subkey's arrays contain translatable keys, the script performs a similar key processing step as above, except that all enclosing subkeys are included in the resulting JSON key. The key is `{FILENAME}---{SUBKEY-1}---{SUBKEY-2}...---{SUBKEYN}---{KEYNAME}` and the value is the content of the key buried inside the subkeys.
The resulting flat JSON KV array for each `.json.orig` file is written to `_locales/lib/temp/` as `{FILENAME}.md.json.tmp`.
- Next, each `.txt` file (`_locales/lib/temp/{FILENAME}.md.txt`) is opened, and its contents are stored as a JSON KV pair, with the key following the format `{FILENAME}---content`. This new JSON KV pair is appended to the JSON in the corresponding `_locales/lib/temp/{FILENAME}.md.json.tmp` file. The resulting JSON is written as `_locales/lib/temp/{FILENAME}.json`.
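The first two steps above (splitting at the second "---" and attaching the `K{NUM}-` key) can be sketched as below. This is an illustration under stated assumptions, not the script's actual code: the function names are hypothetical, and the sketch assumes that only unindented, top-level front-matter items receive a key, which the description above does not confirm.

```python
def split_front_matter(text):
    """Split a Jekyll post into (front_matter_lines, markdown_body).

    Assumes the file starts with a '---' line and that the second '---'
    line closes the front matter.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file does not start with YAML front matter")
    # Index of the second '---' demarcation, or None if it is missing.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("closing '---' demarcation not found")
    return lines[1:end], "\n".join(lines[end + 1:])


def add_order_keys(front_matter_lines):
    """Prefix each top-level front-matter item with K{NUM}- to record its order."""
    out, num = [], 0
    for line in front_matter_lines:
        # Assumption: top-level items are unindented 'name: value' lines;
        # nested lines and list entries are left untouched.
        is_top_level = (
            line and not line[0].isspace() and not line.startswith("-") and ":" in line
        )
        if is_top_level:
            num += 1
            line = "K{}-{}".format(num, line)
        out.append(line)
    return out
```

For a post whose front matter begins with `title: Hello`, the first item becomes `K1-title: Hello`, and the body after the second "---" is returned separately so it can be written to the `.md.txt` file.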
####Final steps
Once this processing is completed, the contents of the `_locales/lib/temp/{FILENAME}.json` files are combined into a single file, `tx_mdtoJSON.en.json`. This is the file that is fed into Transifex to start the translation process.
Although Transifex accepts nested JSON, per their documentation, we have opted to create a flat JSON KV file in this manner in order to retain more control over the assignment of content keys, which must work with our very specific translation process.
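To make the flat key scheme concrete, here is a rough Python sketch of the flattening and combining steps. The real pipeline does the flattening in `pre-tx-push.jq`, so this is a stand-in technique rather than the actual implementation. The function names, the `translatable` set (standing in for the white-list in `jqconfig.json`), and the use of list positions as path segments are all assumptions made for illustration.

```python
import json
import re

SEP = "---"
ORDER_PREFIX = re.compile(r"^K\d+-")  # the K{NUM}- ordering key


def flatten(filename, data, translatable):
    """Flatten nested front matter into {FILENAME}---{SUBKEY}...---{KEYNAME} pairs."""
    flat = {}

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, path + [key])
        elif isinstance(node, list):
            # Assumption: list positions become path segments so siblings stay distinct.
            for index, value in enumerate(node):
                walk(value, path + [str(index)])
        elif path and ORDER_PREFIX.sub("", path[-1]) in translatable:
            # Keep only white-listed keys; the ordering prefix is ignored when matching.
            flat[SEP.join([filename] + path)] = node

    walk(data, [])
    return flat


def build_source_file(posts, translatable):
    """Combine every post's flat pairs, plus its markdown content, into one JSON string."""
    combined = {}
    for filename, (front_matter, content) in posts.items():
        combined.update(flatten(filename, front_matter, translatable))
        combined[SEP.join([filename, "content"])] = content
    return json.dumps(combined, indent=2, sort_keys=True, ensure_ascii=False)
```

Because every key starts with the filename, pairs from many posts can share one file without colliding, which is what allows the returned translations to be split back into per-file content later.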
[Status: .tx config creation in progress]
Transifex is hooked into panicbutton.io's root in a .tx folder that contains the Transifex config file (.tx/config). The config file alerts the Transifex client to files in the repository that are subject to translation in Transifex; dictates the naming convention for the returned, translated files; and sets the file format (here, Key-Value JSON).
[Status: TXGH Integration TODO]
If the newly built JSON Key-Value files (in _locales, from tx_md2jsonKV.py) differ from the ones that they replace, TXGH, a Sinatra-based transifex-github integration/server, registers this change to the files in _locales. It updates Transifex to show that each language's repository is no longer 100% translated. This cues translators to update each language's translations.
[Status: TXGH integration TODO, script completed in panicbutton.io repo: _locales/lib/tx_jsonKV2md.py ]
When a translator completes a language's translation, and the Transifex repository registers 100% completion, TXGH pushes the newly translated JSON Key-Value files into panicbutton.io's git repository at _locales/<lang>/, overwriting the files that exist in this repository (the old translations).
This push triggers another Python script launched by travis.yaml, tx_jsonKV2md.py, which converts the JSON Key-Value files in _locales/<lang>/ back into markdown. The new markdown files are placed in _posts/<lang>/, replacing the markdown files from the previous round of translations.
####1. Running the script
To run the tx_jsonKV2md.py script, navigate into the `_locales/lib` directory and run `python tx_jsonKV2md.py`.
####2. What the script does
When translated files are returned from Transifex, they are processed by the `tx_jsonKV2md.py` script according to the following steps.

Transifex returns one JSON KV file for each translated language. These files are returned to `_locales/{LANG}/` and they are named `tx_jsonKV_{LANG}.json`.

Like the JSON KV file that was sent to Transifex, these returned files are flat JSON KV files. Each `key` identifies the translated markdown filename and the corresponding YAML front-matter or markdown content that has been translated.
In the case of nested YAML front-matter, the subkeys that contain the translated key are also included in the JSON KV key name. For more information about this, see the pre-tx-push.jq step in the markdown to JSON KV conversion documentation above.
`tx_jsonKV2md.py` cycles through each language in `config.py`, performing the following actions for each.
Note: The language array in `config.py` is, at present, manually populated. In the future, it should be populated using some type of link to the panicbutton.io Transifex interface, which lists all available languages for translation. This is to be addressed in future iterations of the script.
- First, it creates a JSON dictionary from the file `_locales/{LANG}/tx_jsonKV_{LANG}.json`.
- Next, it splits this JSON dictionary into objects, each of which represents the translated content of one of the markdown files that were fed into the JSON KV file during the markdown -> JSON KV conversion process.

  Each JSON key is added as an attribute of the object.

  Nested JSON keys (i.e. those contained within subkeys) are stored as attributes nested inside objects, with one object for each level of subkey. For example, if there is a key "title" inside a subkey "action", which is itself inside a subkey "checklist", then a "checklist" object is created and appended to the markdown file object, an "action" object is appended to the "checklist" object, and "title" is added as a key to the "action" object. This allows the objects to grow into the tree-like structure of the YAML front-matter that will eventually result.
Note: this process is difficult to describe in prose. To see it in detail, see the function `breakfiles` and the helper functions that branch off of it.
Once all attributes have been assigned and sub-objects created, each markdown file's object is placed into a dictionary (`jsonObj.obj_dict`), the keys of which are the markdown filenames.
- For each markdown file object in the dictionary of objects `jsonObj.obj_dict`, if the object contains the top-level attribute "content", this attribute represents the markdown text of the file, not a piece of YAML front-matter. In this case, the "content" value is written to a text document, `_locales/{LANG}/temp/{FILENAME}.md_translated.txt`.
- Each markdown file's object is merged with the contents of the "original", untranslated JSON, located in `_locales/lib/temp/{FILENAME}.md.json.orig`. Translated content replaces untranslated content; this is possible because the keys of each set of translated and untranslated content are identical.

  The resulting JSON dictionary, in which all translated content has replaced the original English content, is written as JSON to `_locales/{LANG}/temp/{FILENAME}.md.trans.json`. Importantly, this file contains only YAML front matter; the markdown content, which was written out separately in the preceding step and is not included in the `.json.orig` files, is excluded.
- Each `.trans.json` file is piped through the command line tool `json2yaml`, converting the JSON into YAML. This YAML content is written to `_locales/{LANG}/temp/{FILENAME}.md.trans.yaml`.
- Finally, the markdown file is created at `_posts/{LANG}/{FILENAME}.md`.
- The YAML front matter at `_locales/{LANG}/temp/{FILENAME}.md.trans.yaml` is processed to (a) correctly order the contents, per the alphanumeric key; (b) remove the alphanumeric key from the YAML front matter item name; and (c) adjust the spacing, which is inflated by the `json2yaml` command line tool in the earlier conversion step.
- The processed YAML front matter is written into the markdown file created at `_posts/{LANG}/{FILENAME}.md`, between the necessary YAML demarcations, "---".
- The markdown text (`_locales/{LANG}/temp/{FILENAME}.md_translated.txt`) is opened and appended to `_posts/{LANG}/{FILENAME}.md`, completing the conversion process.
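The tree-building, merging, and re-ordering steps above may be easier to follow in code. The sketch below uses plain dictionaries instead of the objects built by `breakfiles`, and it works on parsed data rather than on YAML text, so it is an approximation of the idea and not the script's implementation. All function names are hypothetical. It also sorts the `K{NUM}-` keys numerically, which is an assumption about the intended order.

```python
import re

SEP = "---"
ORDER_KEY = re.compile(r"^K(\d+)-(.*)$")


def unflatten(flat):
    """Group flat {FILENAME}---...---{KEYNAME} pairs into one nested dict per file."""
    files = {}
    for flat_key, value in flat.items():
        filename, *path = flat_key.split(SEP)
        if not path:
            continue  # not a {FILENAME}---{KEYNAME} style key
        node = files.setdefault(filename, {})
        for part in path[:-1]:
            node = node.setdefault(part, {})  # one nested dict per subkey level
        node[path[-1]] = value
    return files


def merge(original, translated):
    """Return a copy of original with translated values replacing matching keys."""
    merged = dict(original)
    for key, value in translated.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def restore_order(front_matter):
    """Sort top-level items by their K{NUM}- key, then strip the key from the name."""
    def position(name):
        match = ORDER_KEY.match(name)
        return int(match.group(1)) if match else float("inf")

    ordered = {}
    for name in sorted(front_matter, key=position):
        match = ORDER_KEY.match(name)
        ordered[match.group(2) if match else name] = front_matter[name]
    return ordered
```

Untranslated items such as a layout setting survive the merge because only keys present in the translated data are replaced; everything else is carried over from the original.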
[Status: travis.yaml cloning and pushing to Android application TODO]
On each push to the panicbutton.io dev branch, travis.yaml performs a process to transform and push relevant information to the PanicButton Android application repository. First, travis.yaml git clones the PanicButton Android application's asset repository. The script then converts all of the markdown files in _posts/mobile and _posts/help (including all files in the _posts/mobile/<lang> and _posts/help/<lang> directories) into custom, PanicButton-specific JSON following templates located in /api/help.json and /api/mobile.json. These files are named mobile_<lang>.json and help_<lang>.json.
Upon creation, the files are copied into the newly cloned PanicButton App assets directory, over-writing the now-outdated, equivalently-named files.
Next, travis.yaml calls `git add -A && git commit -m "Updating translations to the PanicButton Application" && git push origin dev`. If the newly created JSON files, mobile_<lang>.json, differ from the previous versions (e.g. if the `git push` that called the script represents an updated translation), then the changes will be committed and pushed to the PanicButton Android Application's repository. This triggers a rebuild of the app. If the JSON files, mobile_<lang>.json, have not changed, then the cloned directory will not register any change on `git add -A`, so `git commit` has nothing to commit and the chained `git push origin dev` does not run. No error is triggered.
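The same "commit and push only if something changed" behaviour can be made explicit rather than relying on the `&&` chain stopping. The following is a hypothetical alternative written in Python, not what travis.yaml does today; the function name `push_if_changed` and the injectable `run` parameter (included so the logic can be exercised without a real repository) are assumptions.

```python
import subprocess


def push_if_changed(repo_dir, message, branch="dev", run=subprocess.run):
    """Stage everything, then commit and push only when something changed."""
    run(["git", "add", "-A"], cwd=repo_dir, check=True)
    # 'git diff --cached --quiet' exits with 1 when staged changes exist, 0 otherwise.
    staged = run(["git", "diff", "--cached", "--quiet"], cwd=repo_dir)
    if staged.returncode == 0:
        return False  # nothing to commit, so skip the commit and the push
    run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    run(["git", "push", "origin", branch], cwd=repo_dir, check=True)
    return True
```

Checking the staged diff first distinguishes "no changes" from a genuine commit or push failure, which the chained shell command cannot do on its own.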
[Status: .tx config creation in progress. TXGH integration TODO]
The PanicButton Android app contains an additional Transifex workflow. The app has translatable material independent of the panicbutton.io source files in the /src/main/res/values/strings.xml file.
Thus, the PanicButton Android Application repository also houses a .tx folder with its own config file, which dictates the translation source, resulting file names, and format for strings.xml.
A TXGH integration in the PanicButton app repository will register a pushed change in the strings.xml file, updating strings.xml in the Transifex repository to show that translation is no longer complete and alerting translators that an update to the translation is necessary.
Upon completion of the translation, TXGH will return the translated files to the PanicButton repository, placing them in the directories /src/main/res/values-<lang>, with each language's file named strings.xml. Existing translations in each of these directories will be overwritten by this action from TXGH. The app will be rebuilt upon return of new files.