MessageFormat Working Group

Welcome to the home page for the MessageFormat Working Group, a subgroup of the Unicode CLDR-TC.

Charter

The Message Format Working Group (MFWG) is tasked with developing an industry standard for the representation of localizable message strings to be a successor to ICU MessageFormat. MFWG will recommend how to remove redundancies, make the syntax more usable, and support more complex features, such as gender, inflections, and speech. MFWG will also consider the integration of the new standard with programming environments, including, but not limited to, ICU, DOM, and ECMAScript, and with localization platform interchange. The output of MFWG will be a specification for the new syntax.

MessageFormat 2 Technical Preview

The MessageFormat 2 specification is a new part of the LDML specification. This specification is initially released as a "Tech Preview", which means that the stability policy is not in effect and feedback from users and implementers might result in changes to the syntax, data model, functions, or other normative aspects of MessageFormat 2. Such changes are expected to be minor and, to the extent possible, to be compatible with what is defined in the Tech Preview.

The MFWG welcomes any and all feedback, including bug reports, implementation reports, success stories, feature requests, requests for clarification, or anything that would be helpful in stabilizing the specification and promoting widespread adoption.

The MFWG specifically requests feedback on the following issues:

  • How best to define value resolution #678
  • How to perform non-integer exact number selection #675
  • Whether markup should support additional spaces #650
  • Whether "attribute-like" behavior is needed and what form it should take #642
  • Whether to relax constraints on complex message start #610
  • Whether omitting the * variant key should be permitted #603

What is MessageFormat 2?

Software needs to construct messages that incorporate various pieces of information. The complexities of the world's languages make this challenging. MessageFormat 2 defines the data model, syntax, processing, and conformance requirements for the next generation of dynamic messages. It is intended for adoption by programming languages, software libraries, and software localization tooling. It enables the integration of internationalization APIs (such as date or number formats), and grammatical matching (such as plurals or genders). It is extensible, allowing software developers to create formatting or message selection logic that adds to the core capabilities. Its data model provides a means of representing existing syntaxes, thus enabling gradual adoption by users of older formatting systems.

The goal is to allow developers and translators to create natural-sounding, grammatically correct user interfaces that can appear in any language and support the needs of diverse cultures.

MessageFormat 2 Specification and Syntax

The current specification starts here and may have changed since the publication of the Tech Preview version. The Tech Preview specification is here.

The current draft syntax for defining messages can be found in spec/syntax.md. The syntax is formally described in ABNF.

Messages can be simple strings:

Hello, world!

Messages can interpolate arguments:

Hello {$user}!

Messages can transform those arguments using formatting functions. Functions can optionally take options:

Today is {$date :datetime}
Today is {$date :datetime weekday=long}.
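
As a rough illustration of how the parts of such a placeholder relate (a variable, an optional function, and optional options), the following Python sketch splits the two examples above into their components. It is a simplified assumption, not the ABNF grammar from spec/syntax.md: the regular expression only handles the simple `{$variable :function key=value}` shapes shown here, and `parse_placeholder` is a hypothetical helper name, not part of any implementation listed in this README.

```python
import re

# Hypothetical, simplified pattern: covers only "{$var}", "{$var :func}" and
# "{$var :func key=value ...}". The real grammar (spec/syntax.md) is richer.
PLACEHOLDER = re.compile(
    r"\{\s*\$(?P<var>[\w-]+)"
    r"(?:\s+:(?P<func>[\w:-]+)(?P<opts>(?:\s+[\w-]+=[\w-]+)*))?"
    r"\s*\}"
)


def parse_placeholder(text: str) -> dict:
    """Split a simple placeholder into variable, function name, and options."""
    match = PLACEHOLDER.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported placeholder: {text!r}")
    # Options are whitespace-separated key=value pairs; none yields an empty dict.
    option_pairs = (match.group("opts") or "").split()
    options = dict(pair.split("=", 1) for pair in option_pairs)
    return {
        "variable": match.group("var"),
        "function": match.group("func"),  # None when no function is given
        "options": options,
    }


if __name__ == "__main__":
    print(parse_placeholder("{$date :datetime}"))
    print(parse_placeholder("{$date :datetime weekday=long}"))
```

The sketch only separates the pieces; what `:datetime` does with `weekday=long` is up to the formatting function that an implementation supplies.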

Messages can use a selector to choose between different variants, which correspond to the grammatical (or other) requirements of the language:

.input {$count :integer}
.match $count
0   {{You have no notifications.}}
one {{You have {$count} notification.}}
*   {{You have {$count} notifications.}}
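
One way to picture what the selector does for the message above is the following Python sketch. It rests on several simplifying assumptions: the English cardinal plural rule for integers is hard-coded (a real implementation would take plural categories from CLDR data for the message's locale), an exact numeric key is tried before the plural category, and `*` is the catch-all. The function names are hypothetical and the variants are stored in a plain dictionary rather than parsed from the message syntax.

```python
def english_plural_category(count: int) -> str:
    """Simplified English cardinal rule for integers: 1 is 'one', all else 'other'."""
    return "one" if count == 1 else "other"


def select_variant(count: int, variants: dict[str, str]) -> str:
    """Pick a pattern: exact numeric key, then plural category, then '*'."""
    exact_key = str(count)
    if exact_key in variants:
        return variants[exact_key]
    category = english_plural_category(count)
    if category in variants:
        return variants[category]
    # '*' is the catch-all variant and must be present in this sketch.
    return variants["*"]


def format_notifications(count: int) -> str:
    """Format the notification message from the README for an English locale."""
    variants = {
        "0": "You have no notifications.",
        "one": "You have {count} notification.",
        "*": "You have {count} notifications.",
    }
    return select_variant(count, variants).format(count=count)


if __name__ == "__main__":
    for n in (0, 1, 5):
        print(format_notifications(n))
```

Other languages have more plural categories than English, which is why the catch-all `*` variant matters: it covers every value that no other key matches.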

Messages can annotate arguments with formatting instructions or assign local values for use in the formatted message:

.input {$date :datetime weekday=long month=medium day=short}
.local $numPigs = {$pigs :integer}
{{On {$date} you had this many pigs: {$numPigs}}}

The message syntax supports using multiple selectors and other features to build complex messages. It is designed so that implementations can extend the set of functions or their options using the same syntax. Implementations may even support users creating their own functions.

See more examples and the formal definition of the grammar in spec/syntax.md.

Developer Documentation

Unofficial documentation for developers on MessageFormat 2 syntax and on using it with various programming languages can be found at messageformat.dev, which also includes an interactive playground for experimenting with message syntax.

Normative Changes during Tech Preview

The Working Group continues to address feedback and develop portions of the specification not completed for the LDML45 Tech Preview release. The main branch of this repository contains changes implemented since the technical preview.

Implementers should be aware of the following normative changes during the tech preview period. See the commit history after 2024-04-13 for a list of all commits (including non-normative changes).

  • #885 Address equality of name and literal values, including requiring keys to use NFC
  • #884 Add support for bidirectional isolates and strong marks in syntax and address UAX31/UTS55 requirements
  • #883 Remove forward-compatibility promise and all reserved/private syntax
  • #882 Specify bad-option error for bad digit size options in :number and :integer functions
  • #878 Clarify "rule" selection in :number and :integer functions
  • #877 Match on variables instead of expressions
  • #854 Allow whitespace at complex message start
  • #853 Add a "duplicate-variant" error
  • #845 Define "attributes" feature
  • #834 Modify the stability policy (not currently in effect due to Tech Preview)
  • #816 Refine error handling
  • #815 Remove machine-readable function registry as a deliverable
  • #813 Change default of :date and :datetime date formatting from short to medium
  • #812 Allow trailing whitespace for complex messages
  • #793 Recommend the use of escapes only when necessary
  • #775 Add formal definitions for variable, external variable, and local variable
  • #774 Refactor error types, adding a Message Function Error type (and subtypes)
  • #771 Remove inappropriate normative statement from errors.md
  • #767 Add a test schema and #778 validate tests against it
  • #769 Add :test:function, :test:select and :test:format functions for implementation testing
  • #743 Collapse all escape sequence rules into one (affects the ABNF)

In addition to the above, the test suite has been significantly modified and updated.
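
To illustrate why the NFC requirement for keys (#885) interacts with the "duplicate-variant" error (#853), the following Python sketch compares keys after Unicode normalization. It is an assumption-laden illustration, not the algorithm from the specification: each variant is reduced to a single key string, and `keys_equal` and `find_duplicate_keys` are hypothetical helper names.

```python
import unicodedata


def keys_equal(left: str, right: str) -> bool:
    """Compare two variant keys after normalizing both to NFC."""
    return unicodedata.normalize("NFC", left) == unicodedata.normalize("NFC", right)


def find_duplicate_keys(keys: list[str]) -> list[str]:
    """Return the NFC form of every key that repeats an earlier key."""
    seen: set[str] = set()
    duplicates: list[str] = []
    for key in keys:
        normalized = unicodedata.normalize("NFC", key)
        if normalized in seen:
            duplicates.append(normalized)
        else:
            seen.add(normalized)
    return duplicates


if __name__ == "__main__":
    composed = "\u00e9"      # "é" as one precomposed code point
    decomposed = "e\u0301"   # "e" followed by a combining acute accent
    print(composed == decomposed)            # raw code point comparison
    print(keys_equal(composed, decomposed))  # comparison after NFC
    print(find_duplicate_keys([composed, decomposed, "*"]))
```

Without normalization, the two spellings of the same visible key would be treated as different variants, so a duplicate could go undetected.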

Implementations

  • Java: com.ibm.icu.message2, part of ICU 75, is a tech preview implementation of the MessageFormat 2 syntax, together with a formatting API. See the ICU User Guide for examples and a quickstart guide.
  • C/C++: icu::message2::MessageFormatter, part of ICU 75, is a tech preview implementation of MessageFormat 2.
  • JavaScript: messageformat 4.0 implements the MessageFormat 2 syntax, together with a polyfill of the runtime API proposed for ECMA-402.

The working group is also aware of these implementations in progress or released, but has not evaluated them:

  • i18next i18nFormat plugin to use mf2 format with i18next, version 0.1.1

Note

Tell us about your MessageFormat 2 implementation! Submit a PR on this page, file an issue, or send email to have your implementation appear here.

Sharing Feedback

Technical Preview Feedback: file an issue here

We invite feedback about the current syntax draft, as well as real-life use cases, requirements, tooling, runtime APIs, localization workflows, and other topics.

Participation / Joining the Working Group

We are looking for participation from software developers, localization engineers and others with experience in Internationalization (I18N) and Localization (L10N). If you wish to contribute to this work, please review the information about the Contributor License Agreement below.

To follow this work:

  1. Apply to join our mailing list
  2. Watch this repository (use the "Watch" button in the upper right corner)

To contribute to this work, in addition to the above:

  1. Each individual MUST have a copy of the CLA on file. See below.
  2. Individuals who are employees of Unicode Member organizations SHOULD contact their member representative. Individuals who are not employees of Unicode Member organizations MUST contact the chair to request Invited Expert status. Employees of Unicode Member organizations MAY also apply for Invited Expert status, subject to approval from their member representative.

Copyright & Licenses

Copyright © 2019-2024 Unicode, Inc. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the United States and other countries.

A CLA is required to contribute to this project. Please refer to the CONTRIBUTING.md file (or start a Pull Request) for more information.

The contents of this repository are governed by the Unicode Terms of Use and are released under LICENSE.