From 46fb95c817db7a03e07055a5d6c163d08323dd0a Mon Sep 17 00:00:00 2001 From: "M. Neumayr" Date: Wed, 1 Aug 2018 20:09:22 +0200 Subject: [PATCH] add DeepL Pro artificial intelligence translation service (#294) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * add [DeepL Pro](https://www.deepl.com/pro) artificial intelligence translation service ✨ * DeepL trains artificial intelligence to understand and translate texts. * 🇬🇧 🇺🇸 🇩🇪 🇫🇷 🇪🇸 🇮🇹 🇳🇱 🇵🇱 * [TechCrunch: DeepL schools other online translators with clever machine learning](https://techcrunch.com/2017/08/29/deepl-schools-other-online-translators-with-clever-machine-learning/) * use [`deepl-rb`](https://github.com/wikiti/deepl-rb) gem to query the API 💎 * Kudos to @wikiti for helping with the latest `v2.1.0` release that adds `ignore_tags` 💪 --- CHANGES.md | 2 + README.md | 31 +++++- config/locales/en.yml | 11 +- config/locales/ru.yml | 11 +- i18n-tasks.gemspec | 1 + lib/i18n/tasks/base_task.rb | 2 + lib/i18n/tasks/command/commands/missing.rb | 9 +- lib/i18n/tasks/command/options/locales.rb | 8 ++ lib/i18n/tasks/configuration.rb | 3 +- lib/i18n/tasks/deepl_translation.rb | 124 +++++++++++++++++++++ lib/i18n/tasks/google_translation.rb | 27 +++-- spec/deepl_translate_spec.rb | 67 +++++++++++ templates/config/i18n-tasks.yml | 8 +- 13 files changed, 279 insertions(+), 25 deletions(-) create mode 100644 lib/i18n/tasks/deepl_translation.rb create mode 100644 spec/deepl_translate_spec.rb diff --git a/CHANGES.md b/CHANGES.md index 96115897..53666cdd 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,3 +1,5 @@ +Add [DeepL Pro](https://www.deepl.com/pro) AI Translation service. + ## v0.9.21 Relaxes the `rainbow` dependency version restriction. diff --git a/README.md b/README.md index 9f134d3b..092298a2 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ i18n-tasks helps you find and manage missing and unused translations. 
This gem analyses code statically for key usages, such as `I18n.t('some.key')`, in order to: * Report keys that are missing or unused. -* Pre-fill missing keys, optionally from Google Translate. +* Pre-fill missing keys, optionally from Google Translate or DeepL Pro. * Remove unused keys. Thus addressing the two main problems of [i18n gem][i18n-gem] design: @@ -83,7 +83,7 @@ Usage: i18n-tasks add-missing [options] [locale ...] ### Google Translate missing keys -Translate missing values with Google Translate ([more below on the API key](#translation-config)). +Translate missing values with Google Translate ([more below on the API key](#google-translation-config)). ```console $ i18n-tasks translate-missing @@ -91,6 +91,16 @@ $ i18n-tasks translate-missing $ i18n-tasks translate-missing --from base es fr ``` +### DeepL Pro Translate missing keys + +Translate missing values with DeepL Pro Translate ([more below on the API key](#deepl-translation-config)). + +```console +$ i18n-tasks translate-missing --backend deepl +# accepts from and locales options: +$ i18n-tasks translate-missing --backend deepl --from en +``` + ### Find usages See where the keys are used with `i18n-tasks find`: @@ -295,7 +305,7 @@ data: - 'config/locales/%{locale}.yml' ``` -If you want to have i18n-tasks reorganize your existing keys using `data.write`, either set the router to +If you want to have i18n-tasks reorganize your existing keys using `data.write`, either set the router to `pattern_router` as above, or run `i18n-tasks normalize -p` (forcing the use of the pattern router for that run). ##### Key pattern syntax @@ -340,7 +350,7 @@ For more complex cases, you can implement a [custom scanner][custom-scanner-docs See the [config file][config] to find out more. -<a name="translation-config"></a> +<a name="google-translation-config"></a> ### Google Translate `i18n-tasks translate-missing` requires a Google Translate API key, get it at [Google API Console](https://code.google.com/apis/console). 
@@ -357,7 +367,18 @@ Put the key in `GOOGLE_TRANSLATE_API_KEY` environment variable or in the config ```yaml # config/i18n-tasks.yml translation: - api_key: + google_translate_api_key: +``` + +<a name="deepl-translation-config"></a> +### DeepL Pro Translate + +`i18n-tasks translate-missing` requires a DeepL Pro API key, get it at [DeepL](https://www.deepl.com/pro). + +```yaml +# config/i18n-tasks.yml +translation: + deepl_api_key: ``` ## Interactive console diff --git a/config/locales/en.yml b/config/locales/en.yml index ef00fc17..c257278a 100644 --- a/config/locales/en.yml +++ b/config/locales/en.yml @@ -26,6 +26,7 @@ en: strict: >- Avoid inferring dynamic key usages such as t("cats.#{cat}.name"). Takes precedence over the config setting if set. + translation_backend: Translation backend (google or deepl) value: >- Value. Interpolates: %{value}, %{human_key}, %{key}, %{default}, %{value_or_human_key}, %{value_or_default_or_human_key} @@ -47,7 +48,7 @@ en: normalize: 'normalize translation data: sort and move to the right files' remove_unused: remove unused keys rm: remove the keys in locale data that match the given pattern - translate_missing: translate missing keys with Google Translate + translate_missing: translate missing keys with Google Translate or DeepL Pro tree_convert: convert tree between formats tree_filter: filter tree by key pattern tree_merge: merge trees @@ -89,10 +90,16 @@ en: has %{key_count} keys in total. On average, values are %{value_chars_avg} characters long, keys have %{key_segments_avg} segments. title: Forest (%{locales}) + deepl_translate: + errors: + no_api_key: >- + Setup DeepL Pro API key via DEEPL_AUTH_KEY environment variable or translation.deepl_api_key + in config/i18n-tasks.yml. Get the key at https://www.deepl.com/pro. + no_results: DeepL returned no results. 
google_translate: errors: no_api_key: >- - Set Google API key via GOOGLE_TRANSLATE_API_KEY environment variable or translation.api_key + Set Google API key via GOOGLE_TRANSLATE_API_KEY environment variable or translation.google_translate_api_key in config/i18n-tasks.yml. Get the key at https://code.google.com/apis/console. no_results: >- Google Translate returned no results. Make sure billing information is set at https://code.google.com/apis/console. diff --git a/config/locales/ru.yml b/config/locales/ru.yml index 7026653c..8a78d179 100644 --- a/config/locales/ru.yml +++ b/config/locales/ru.yml @@ -23,6 +23,7 @@ ru: out_format: 'Формат вывода: %{valid_text}. %{default_text}.' pattern_router: 'Использовать pattern_router: ключи распределятся по файлам согласно data.write' strict: Не угадывать динамические использования ключей, например `t("category.#{category.key}")` + translation_backend: Перевод backend (google или deepl) value: >- Значение, интерполируется с %{value}, %{human_key}, %{key}, %{default}, %{value_or_human_key}, %{value_or_default_or_human_key} @@ -44,7 +45,7 @@ ru: normalize: нормализовать файлы переводов (сортировка и распределение) remove_unused: удалить неиспользуемые ключи rm: удалить ключи, которые соответствуют заданному шаблону - translate_missing: перевести недостающие переводы с Google Translate + translate_missing: перевести недостающие переводы с Google Translate / DeepL Pro tree_convert: преобразовать дерево между форматами tree_filter: фильтровать дерево по ключу tree_merge: объединенить деревья @@ -85,10 +86,16 @@ ru: text_single_locale: >- %{key_count} ключей. В среднем, длина строки: %{value_chars_avg}, сегменты ключей: %{key_segments_avg}. title: 'Данные (%{locales}):' + deepl_translate: + errors: + no_api_key: >- + Задайте ключ API DeepL через переменную окружения DEEPL_AUTH_KEY или translation.deepl_api_key + Получите ключ через https://www.deepl.com/pro. + no_results: DeepL не дал результатов. 
google_translate: errors: no_api_key: >- - Задайте ключ API Google через переменную окружения GOOGLE_TRANSLATE_API_KEY или translation.api_key + Задайте ключ API Google через переменную окружения GOOGLE_TRANSLATE_API_KEY или translation.google_translate_api_key в config/i18n-tasks.yml. Получите ключ через https://code.google.com/apis/console. no_results: >- Google Translate не дал результатов. Убедитесь в том, что платежная информация добавлена diff --git a/i18n-tasks.gemspec b/i18n-tasks.gemspec index 978dc93f..5451f0c0 100644 --- a/i18n-tasks.gemspec +++ b/i18n-tasks.gemspec @@ -37,6 +37,7 @@ TEXT s.add_dependency 'activesupport', '>= 4.0.2' s.add_dependency 'ast', '>= 2.1.0' + s.add_dependency 'deepl-rb', '>= 2.1.0' s.add_dependency 'easy_translate', '>= 0.5.1' s.add_dependency 'erubi' s.add_dependency 'highline', '>= 1.7.3' diff --git a/lib/i18n/tasks/base_task.rb b/lib/i18n/tasks/base_task.rb index 6d9317c8..5171d9bd 100644 --- a/lib/i18n/tasks/base_task.rb +++ b/lib/i18n/tasks/base_task.rb @@ -11,6 +11,7 @@ require 'i18n/tasks/ignore_keys' require 'i18n/tasks/missing_keys' require 'i18n/tasks/unused_keys' +require 'i18n/tasks/deepl_translation' require 'i18n/tasks/google_translation' require 'i18n/tasks/locale_pathname' require 'i18n/tasks/locale_list' @@ -31,6 +32,7 @@ class BaseTask include IgnoreKeys include MissingKeys include UnusedKeys + include DeeplTranslation include GoogleTranslation include Logging include Configuration diff --git a/lib/i18n/tasks/command/commands/missing.rb b/lib/i18n/tasks/command/commands/missing.rb index fd3721e4..61692a87 100644 --- a/lib/i18n/tasks/command/commands/missing.rb +++ b/lib/i18n/tasks/command/commands/missing.rb @@ -36,11 +36,16 @@ def missing(opt = {}) cmd :translate_missing, pos: '[locale ...]', desc: t('i18n_tasks.cmd.desc.translate_missing'), - args: [:locales, :locale_to_translate_from, arg(:out_format).from(1)] + args: [:locales, :locale_to_translate_from, arg(:out_format).from(1), :translation_backend] def 
translate_missing(opt = {}) missing = i18n.missing_diff_forest opt[:locales], opt[:from] - translated = i18n.google_translate_forest missing, opt[:from] + translated = case opt[:backend] + when 'deepl' + i18n.deepl_translate_forest missing, opt[:from] + when 'google' + i18n.google_translate_forest missing, opt[:from] + end i18n.data.merge! translated log_stderr t('i18n_tasks.translate_missing.translated', count: translated.leaves.count) print_forest translated, opt diff --git a/lib/i18n/tasks/command/options/locales.rb b/lib/i18n/tasks/command/options/locales.rb index 63c7a60e..5a512924 100644 --- a/lib/i18n/tasks/command/options/locales.rb +++ b/lib/i18n/tasks/command/options/locales.rb @@ -30,6 +30,14 @@ module Locales t('i18n_tasks.cmd.args.desc.locale_to_translate_from'), parser: OptionParsers::Locale::Parser, default: 'base' + + TRANSLATION_BACKENDS = %w[google deepl].freeze + arg :translation_backend, + '-b', + '--backend BACKEND', + t('i18n_tasks.cmd.args.desc.translation_backend'), + parser: OptionParsers::Locale::Parser, + default: TRANSLATION_BACKENDS[0] end end end diff --git a/lib/i18n/tasks/configuration.rb b/lib/i18n/tasks/configuration.rb index 20903198..08b7e552 100644 --- a/lib/i18n/tasks/configuration.rb +++ b/lib/i18n/tasks/configuration.rb @@ -57,7 +57,8 @@ def data_config def translation_config @config_sections[:translation] ||= begin conf = (config[:translation] || {}).with_indifferent_access - conf[:api_key] ||= ENV['GOOGLE_TRANSLATE_API_KEY'] if ENV.key?('GOOGLE_TRANSLATE_API_KEY') + conf[:google_translate_api_key] ||= ENV['GOOGLE_TRANSLATE_API_KEY'] if ENV.key?('GOOGLE_TRANSLATE_API_KEY') + conf[:deepl_api_key] ||= ENV['DEEPL_AUTH_KEY'] if ENV.key?('DEEPL_AUTH_KEY') conf end end diff --git a/lib/i18n/tasks/deepl_translation.rb b/lib/i18n/tasks/deepl_translation.rb new file mode 100644 index 00000000..9b6b3ad6 --- /dev/null +++ b/lib/i18n/tasks/deepl_translation.rb @@ -0,0 +1,124 @@ +# frozen_string_literal: true + +require 'deepl' +require 
'i18n/tasks/html_keys' + +module I18n::Tasks + module DeeplTranslation + # @param [I18n::Tasks::Tree::Siblings] forest to translate to the locales of its root nodes + # @param [String] from locale + # @return [I18n::Tasks::Tree::Siblings] translated forest + def deepl_translate_forest(forest, from) + forest.inject empty_forest do |result, root| + translated = translate_list(root.key_values(root: true), to: root.key, from: from) + result.merge! Data::Tree::Siblings.from_flat_pairs(translated) + end + end + + # @param [Array<[String, Object]>] list of key-value pairs + # @return [Array<[String, Object]>] translated list + def translate_list(list, opts) # rubocop:disable Metrics/AbcSize + return [] if list.empty? + opts = opts.dup + opts[:key] ||= translation_config[:deepl_api_key] + validate_translate_api_key! opts[:key] + key_pos = list.each_with_index.inject({}) { |idx, ((k, _v), i)| idx.update(k => i) } + # copy reference keys as is, instead of translating + reference_key_vals = list.select { |_k, v| v.is_a? Symbol } || [] + list -= reference_key_vals + result = list.group_by { |k_v| html_key? k_v[0], opts[:from] }.map do |is_html, list_slice| + fetch_translations list_slice, opts.merge(is_html ? { tag_handling: 'xml' } : { preserve_formatting: true }) + end.reduce(:+) || [] + result.concat(reference_key_vals) + result.sort! { |a, b| key_pos[a[0]] <=> key_pos[b[0]] } + result + end + + # @param [Array<[String, Object]>] list of key-value pairs + # @return [Array<[String, Object]>] translated list + def fetch_translations(list, opts) + options = { + ignore_tags: %w[i18n] + }.merge(opts) + deepl_from_values(list, DeepL.translate(deepl_to_values(list), opts[:from], opts[:to], options)).tap do |result| + fail CommandError, I18n.t('i18n_tasks.deepl_translate.errors.no_results') if result.blank? + end + end + + private + + def validate_translate_api_key!(key) + fail CommandError, I18n.t('i18n_tasks.deepl_translate.errors.no_api_key') if key.blank? 
+ DeepL.configure do |config| + config.auth_key = key + end + end + + # @param [Array<[String, Object]>] list of key-value pairs + # @return [Array] values for translation extracted from list + def deepl_to_values(list) + list.map { |l| deepl_dump_value l[1] }.flatten.compact + end + + # @param [Array<[String, Object]>] list + # @param [Array] translated_values + # @return [Array<[String, Object]>] translated key-value pairs + def deepl_from_values(list, translated_values) + keys = list.map(&:first) + untranslated_values = list.map(&:last) + translated_values = Array(translated_values).map(&:text) + keys.zip deepl_parse_value(untranslated_values, translated_values.to_enum) + end + + # Prepare value for translation. + # @return [String, Array, nil] value for DeepL Translate or nil for non-string values + def deepl_dump_value(value) + case value + when Array + # dump recursively + value.map { |v| deepl_dump_value v } + when String + deepl_replace_interpolations value + end + end + + # Parse translated value from the each_translated enumerator + # @param [Object] untranslated + # @param [Enumerator] each_translated + # @return [Object] final translated value + def deepl_parse_value(untranslated, each_translated) + case untranslated + when Array + # implode array + untranslated.map { |from| deepl_parse_value(from, each_translated) } + when String + deepl_restore_interpolations untranslated, each_translated.next + else + untranslated + end + end + + INTERPOLATION_KEY_RE = /(%\{[^}]+})/ + + # @param [String] value + # @return [String] 'hello, %{name}' => 'hello, <i18n>%{name}</i18n>' + def deepl_replace_interpolations(value) + value.gsub(INTERPOLATION_KEY_RE, '<i18n>\1</i18n>') + end + + # @param [String] untranslated + # @param [String] translated + # @return [String] 'hello, <i18n>%{name}</i18n>' => 'hello, %{name}' + def deepl_restore_interpolations(untranslated, translated) + return translated if untranslated !~ INTERPOLATION_KEY_RE + translated.gsub(%r{<\/?i18n>}, '') + rescue StandardError => e + raise 
CommandError.new(e, <<-TEXT.strip) +Error when restoring interpolations: + original: "#{untranslated}" + response: "#{translated}" + error: #{e.message} (#{e.class.name}) + TEXT + end + end +end diff --git a/lib/i18n/tasks/google_translation.rb b/lib/i18n/tasks/google_translation.rb index 545f739c..0adfaf26 100644 --- a/lib/i18n/tasks/google_translation.rb +++ b/lib/i18n/tasks/google_translation.rb @@ -22,7 +22,12 @@ def google_translate_forest(forest, from) def google_translate_list(list, opts) # rubocop:disable Metrics/AbcSize return [] if list.empty? opts = opts.dup - opts[:key] ||= translation_config[:api_key] + opts[:key] ||= translation_config[:google_translate_api_key] + # fallback with deprecation warning + if translation_config[:api_key] + warn_deprecated 'Please rename Google Translate API Key from `api_key` to `google_translate_api_key`.' + opts[:key] ||= translation_config[:api_key] + end validate_google_translate_api_key! opts[:key] key_pos = list.each_with_index.inject({}) { |idx, ((k, _v), i)| idx.update(k => i) } # copy reference keys as is, instead of translating @@ -51,7 +56,7 @@ def fetch_google_translations(list, opts) opts[key] = language_code.split('-').first end - from_values(list, EasyTranslate.translate(to_values(list), opts)).tap do |result| + google_from_values(list, EasyTranslate.translate(google_to_values(list), opts)).tap do |result| fail CommandError, I18n.t('i18n_tasks.google_translate.errors.no_results') if result.blank? 
end end @@ -64,17 +69,17 @@ def validate_google_translate_api_key!(key) # @param [Array<[String, Object]>] list of key-value pairs # @return [Array] values for translation extracted from list - def to_values(list) + def google_to_values(list) list.map { |l| dump_value l[1] }.flatten.compact end # @param [Array<[String, Object]>] list # @param [Array] translated_values # @return [Array<[String, Object]>] translated key-value pairs - def from_values(list, translated_values) + def google_from_values(list, translated_values) keys = list.map(&:first) untranslated_values = list.map(&:last) - keys.zip parse_value(untranslated_values, translated_values.to_enum) + keys.zip google_parse_value(untranslated_values, translated_values.to_enum) end # Prepare value for translation. @@ -85,7 +90,7 @@ def dump_value(value) # dump recursively value.map { |v| dump_value v } when String - replace_interpolations value + google_replace_interpolations value end end @@ -93,13 +98,13 @@ def dump_value(value) # @param [Object] untranslated # @param [Enumerator] each_translated # @return [Object] final translated value - def parse_value(untranslated, each_translated) + def google_parse_value(untranslated, each_translated) case untranslated when Array # implode array - untranslated.map { |from| parse_value(from, each_translated) } + untranslated.map { |from| google_parse_value(from, each_translated) } when String - restore_interpolations untranslated, each_translated.next + google_restore_interpolations untranslated, each_translated.next else untranslated end @@ -110,7 +115,7 @@ def parse_value(untranslated, each_translated) # @param [String] value # @return [String] 'hello, %{name}' => 'hello, ' - def replace_interpolations(value) + def google_replace_interpolations(value) i = -1 value.gsub INTERPOLATION_KEY_RE do i += 1 @@ -121,7 +126,7 @@ def replace_interpolations(value) # @param [String] untranslated # @param [String] translated # @return [String] 'hello, ' => 'hello, %{name}' - def 
restore_interpolations(untranslated, translated) + def google_restore_interpolations(untranslated, translated) return translated if untranslated !~ INTERPOLATION_KEY_RE values = untranslated.scan(INTERPOLATION_KEY_RE) translated.gsub(/#{Regexp.escape(UNTRANSLATABLE_STRING)}\d+/i) do |m| diff --git a/spec/deepl_translate_spec.rb b/spec/deepl_translate_spec.rb new file mode 100644 index 00000000..c88201ca --- /dev/null +++ b/spec/deepl_translate_spec.rb @@ -0,0 +1,67 @@ +# frozen_string_literal: true + +require 'spec_helper' +require 'i18n/tasks/commands' +require 'deepl' + +RSpec.describe 'DeepL Translation' do + nil_value_test = ['nil-value-key', nil, nil] + text_test = ['key', "Hello, %{user} O'Neill! How are you?", "¡Hola, %{user} O'Neill! Como estas?"] + html_test = ['html-key.html', "Hello, %{user} big O'neill ❤︎", "Hola, %{user} gran O'neill ❤︎"] + html_test_plrl = ['html-key.html.one', 'Hello %{count}', 'Hola %{count}'] + array_test = ['array-key', ['Hello.', nil, '', 'Goodbye.'], ['Hola.', nil, '', 'Adiós.']] + fixnum_test = ['numeric-key', 1, 1] + ref_key_test = ['ref-key', :reference, :reference] + + describe 'real world test' do + delegate :i18n_task, :in_test_app_dir, :run_cmd, to: :TestCodebase + + before do + TestCodebase.setup('config/locales/en.yml' => '', 'config/locales/es.yml' => '') + end + + after do + TestCodebase.teardown + end + + context 'command' do + let(:task) { i18n_task } + + it 'works' do + skip 'temporarily disabled on JRuby due to https://github.com/jruby/jruby/issues/4802' if RUBY_ENGINE == 'jruby' + skip 'DEEPL_AUTH_KEY env var not set' unless ENV['DEEPL_AUTH_KEY'] + in_test_app_dir do + task.data[:en] = build_tree('en' => { + 'common' => { + 'a' => 'λ', + 'hello' => text_test[1], + 'hello_html' => html_test[1], + 'hello_plural_html' => { + 'one' => html_test_plrl[1] + }, + 'array_key' => array_test[1], + 'nil-value-key' => nil_value_test[1], + 'fixnum-key' => fixnum_test[1], + 'ref-key' => ref_key_test[1] + } + }) + task.data[:es] 
= build_tree('es' => { + 'common' => { + 'a' => 'λ' + } + }) + + run_cmd 'translate-missing-deepl' + expect(task.t('common.hello', 'es')).to eq(text_test[2]) + expect(task.t('common.hello_html', 'es')).to eq(html_test[2]) + expect(task.t('common.hello_plural_html.one', 'es')).to eq(html_test_plrl[2]) + expect(task.t('common.array_key', 'es')).to eq(array_test[2]) + expect(task.t('common.nil-value-key', 'es')).to eq(nil_value_test[2]) + expect(task.t('common.fixnum-key', 'es')).to eq(fixnum_test[2]) + expect(task.t('common.ref-key', 'es')).to eq(ref_key_test[2]) + expect(task.t('common.a', 'es')).to eq('λ') + end + end + end + end +end diff --git a/templates/config/i18n-tasks.yml b/templates/config/i18n-tasks.yml index a5248346..2e3aeaac 100644 --- a/templates/config/i18n-tasks.yml +++ b/templates/config/i18n-tasks.yml @@ -82,10 +82,14 @@ search: ## The options specified above are passed down to each scanner. Per-scanner options can be specified as well. ## See this example of a custom scanner: https://github.com/glebm/i18n-tasks/wiki/A-custom-scanner-example -## Google Translate +## Translation Services # translation: +# # Google Translate # # Get an API key and set billing info at https://code.google.com/apis/console to use Google Translate -# api_key: "AbC-dEf5" +# google_translate_api_key: "AbC-dEf5" +# # DeepL Pro Translate +# # Get an API key and subscription at https://www.deepl.com/pro to use DeepL Pro +# deepl_api_key: "48E92789-57A3-466A-9959-1A1A1A1A1A1A" ## Do not consider these keys missing: # ignore_missing: