Unicode MessageFormat 2.0 #1042
Comments
@jyasskin @torgo Checking in on progress. The next version of LDML (which contains MessageFormat) will be released shortly. This version will stabilize MF2 (think of it like REC status). Let me know if there is any way we can help. v47 pending (replaces the 46.1 link above): https://www.unicode.org/reports/tr35/dev/tr35-messageFormat.html
I'll post my review now, in case any of the minor notes can help while finalizing the spec. The soonest we can check whether there's TAG consensus on this is the week of March 17, which might be too late.
I appreciate the clarity in the stability guarantees and the design goals.
I think the programming model is that, for each message in the code, the code author writes one MessageFormat 2.0 string in their language, which seeds a translation database. Then the translator for a given locale writes a whole new MessageFormat 2.0 string representing the translation for that locale. Then at runtime, the code pulls that translated MF2 string out of the database and formats it with the appropriate dynamic values. This model is implied by documentation like https://messageformat.dev/docs/translators/, but AFAICT it's not explicitly stated anywhere, and it probably should be.
The specification overall looks solid and well-considered. I think we're more likely to have feedback for the JavaScript integration in https://github.com/tc39/proposal-intl-messageformat, although I don't see any immediate problems with that either.
Some minor notes on the spec:
https://www.unicode.org/reports/tr35/dev/tr35-messageFormat.html#function uses "function" to refer to what I'd call a "function call" in other specs. This is a little inconsistent with https://www.unicode.org/reports/tr35/dev/tr35-messageFormat.html#default-functions, which uses "function" the way I'd expect. "Function handler" also seems to have the meaning I'd associate with "function", but it seems to only be used with user-defined functions. This isn't a big deal, but maybe a clarifying note would fit somewhere.
"A message with markup that should not be copied:" uses "@can-copy" to mark the text that should not be copied?
I'm concerned about how loose the "resolved value" interface is. Without care, specifications using resolved values will depend on attributes that aren't guaranteed to exist. Perhaps the extra detail in the Intl proposal will insulate the Web from this ambiguity.
In option resolution, "If rv is a fallback value: If supported, emit a Bad Option error." doesn't say what to do if the Bad Option error isn't supported.
"The resolution of markup MUST always succeed.", but it calls option resolution, which can fail?
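The programming model described in this review can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: `TRANSLATIONS`, `toy_format`, and `format_message` are hypothetical names, and the toy formatter handles nothing beyond simple `{$variable}` placeholders. It is not an MF2 implementation and not the API of any real library.

```python
import re

# Hypothetical translation database: message id -> locale -> MF2 source string.
# The "en" entry is written by the code author and seeds the database;
# the other entries are whole new MF2 strings written by translators.
TRANSLATIONS = {
    "greeting": {
        "en": "Hello, {$name}!",
        "fr": "Bonjour, {$name} !",
    },
}

# Matches only the simplest placeholder form, {$name}.
_PLACEHOLDER = re.compile(r"\{\s*\$([A-Za-z_][A-Za-z0-9_]*)\s*\}")


def toy_format(source: str, values: dict) -> str:
    """Substitute {$var} placeholders; a stand-in for a real MF2 formatter."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in values:
            return str(values[name])
        # Assumption: leave an unresolved variable visible as {$name}.
        return "{$" + name + "}"

    return _PLACEHOLDER.sub(replace, source)


def format_message(message_id: str, locale: str, values: dict,
                   source_locale: str = "en") -> str:
    """Look up the translated string at runtime, then format it."""
    per_locale = TRANSLATIONS[message_id]
    # Fall back to the author's source string when no translation exists.
    source = per_locale.get(locale, per_locale[source_locale])
    return toy_format(source, values)
```

With this sketch, `format_message("greeting", "fr", {"name": "Zoé"})` picks the translator's French string and fills in the dynamic value, while an untranslated locale falls back to the author's source string.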
Thank you very much for this early review! A few (personal) comments:
I think that's an example. Good point tho'.
That appears to be an indentation problem. The step "Set res[id] to be rv" should be executed even if rv is a fallback value.
Note that:
So option resolution cannot fail (but it may emit diagnostic errors).
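As a rough sketch of the reading given in this reply (and not the normative spec text), option resolution could look like the following: every option is stored in the result, and a fallback value only adds a diagnostic when the implementation supports reporting it. The names `ResolvedValue`, `resolve_options`, `resolve_from`, and the `supports_bad_option` flag are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ResolvedValue:
    """Hypothetical stand-in for an MF2 resolved value."""
    value: object
    is_fallback: bool = False


def resolve_from(variables: dict):
    """Build a toy resolver: "$name" strings are variable references."""
    def resolve_value(option_value):
        if isinstance(option_value, str) and option_value.startswith("$"):
            name = option_value[1:]
            if name in variables:
                return ResolvedValue(variables[name])
            # Unresolvable variable: produce a fallback value.
            return ResolvedValue(option_value, is_fallback=True)
        return ResolvedValue(option_value)  # literal option value

    return resolve_value


def resolve_options(options, resolve_value, errors, supports_bad_option=True):
    """Resolve (identifier, value) pairs; never raises for a fallback value."""
    res = {}
    for identifier, option_value in options:
        rv = resolve_value(option_value)
        if rv.is_fallback and supports_bad_option:
            # Diagnostic only; resolution itself continues.
            errors.append(("bad-option", identifier))
        # Executed even when rv is a fallback value.
        res[identifier] = rv
    return res
```

Under this reading, an implementation that does not support the Bad Option error simply records nothing, and the caller still receives a complete mapping.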
Hello, TAG!
I'm requesting a TAG review of Unicode's MessageFormat 2.0.
Software needs to construct messages that incorporate various pieces of information. The complexities of the world's languages make this challenging. MessageFormat 2 defines the data model, syntax, processing, and conformance requirements for the next generation of dynamic messages. It is intended for adoption by programming languages, software libraries, and software localization tooling. It enables the integration of internationalization APIs (such as date or number formats) and grammatical matching (such as plurals or genders). It is extensible, allowing software developers to create formatting or message selection logic that adds on to the core capabilities. Its data model provides a means of representing existing syntaxes, thus enabling gradual adoption by users of older formatting systems. The goal is to allow developers and translators to create natural-sounding, grammatically correct user interfaces that can appear in any language and support the needs of diverse cultures.
See also the blog post, which includes links to implementations.
Further details:
You should also know that...
This specification was reviewed somewhat informally at TPAC 2023 (very many significant changes have occurred since).