This document defines the behaviour of a MessageFormat 2.0 implementation when formatting a message for display in a user interface, or for some later processing.
To start, we presume that a message has either been parsed from its syntax or created from a data model description. If the resulting message is not well-formed, a Syntax Error is emitted. If the resulting message is well-formed but is not valid, a Data Model Error is emitted.
The formatting of a message is defined by the following operations:
- Pattern Selection determines which of a message's patterns is formatted. For a message with no selectors, this is simple as there is only one pattern. With selectors, this will depend on their resolution.
- Formatting takes the resolved values of the text and placeholder parts of the selected pattern, and produces the formatted result for the message. Depending on the implementation, this result could be a single concatenated string, an array of objects, an attributed string, or some other locally appropriate data type.
- Expression and Markup Resolution determines the value of an expression or markup, with reference to the current formatting context. This can include multiple steps, such as looking up the value of a variable and calling formatting functions. The form of the resolved value is implementation defined and the value might not be evaluated or formatted yet. However, it needs to be "formattable", i.e. it contains everything required by the eventual formatting.
  The resolution of text is rather straightforward, and is detailed under literal resolution.
Implementations are not required to expose the expression resolution and pattern selection operations to their users, or even use them in their internal processing, as long as the final formatting result is made available to users and the observable behavior of the formatting matches that described here.
Attributes MUST NOT have any effect on the formatted output of a message, nor be made available to function handlers.
Important
This specification does not require either eager or lazy expression resolution of message parts; do not construe any requirement in this document as requiring either.
Implementations are not required to evaluate all parts of a message when parsing, processing, or formatting. In particular, an implementation MAY choose not to evaluate or resolve the value of a given expression until it is actually used by a selection or formatting process. However, when an expression is resolved, it MUST behave as if all preceding declarations affecting variables referenced by that expression have already been evaluated in the order in which the relevant declarations appear in the message. An implementation MUST ensure that every expression in a message is evaluated at most once.
Note
Implementations with lazy evaluation MUST NOT use a call-by-name evaluation strategy. Instead, they must evaluate expressions at most once ("call-by-need"). This is to prevent expressions from having different values when used in different parts of a given message. Function handlers are not necessarily pure: they can access external mutable state such as the current system clock time. Thus, evaluating the same expression more than once could yield different results. That behavior violates this specification.
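As an illustration of the call-by-need requirement described above, the following minimal sketch wraps a resolver so that it runs at most once. The helper name `lazy` and the counter standing in for an impure function handler are assumptions made for this example only; they are not part of this specification.

```typescript
// Hypothetical helper: wraps an expression's resolver so it runs at most once.
function lazy<T>(resolve: () => T): () => T {
  let done = false;
  let cached: T | undefined;
  return () => {
    if (!done) {
      cached = resolve(); // first use: evaluate and remember the result
      done = true;
    }
    return cached as T; // later uses: reuse the remembered result
  };
}

// Stand-in for an impure function handler: every real call yields a new value.
let calls = 0;
const resolveOnce = lazy(() => ++calls);

const first = resolveOnce();  // triggers the single evaluation
const second = resolveOnce(); // reuses the cached value
console.log(first === second, calls); // prints: true 1
```

A call-by-name strategy would instead invoke the resolver on every use, so `first` and `second` could differ.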
Important
Implementations and users SHOULD NOT create function handlers that mutate external program state, particularly since such a function handler can present a remote execution hazard.
A message's formatting context represents the data and procedures that are required for the message's expression resolution, pattern selection and formatting.
At a minimum, it includes:
- Information on the current locale, potentially including a fallback chain of locales. This will be passed on to formatting functions.
- Information on the base directionality of the message and its text tokens. This will be used by strategies for bidirectional isolation, and can be used to set the base direction of the message upon display.
- An input mapping of string identifiers to values, defining variable values that are available during variable resolution. This is often determined by a user-provided argument of a formatting function call.
- The function registry, providing the function handlers of the functions referred to by message functions.
- Optionally, a fallback string to use for the message if it is not valid.
Implementations MAY include additional fields in their formatting context.
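One possible shape for such a formatting context is sketched below. Every name in it (`FormattingContext`, `FunctionHandler`, `locales`, `dir`, `input`, `functions`, `fallbackString`) is an assumption chosen for illustration; this specification does not prescribe any particular representation.

```typescript
// All names here are illustrative; the specification does not define this shape.
type Direction = "LTR" | "RTL" | "unknown";

// Hypothetical handler signature: (function context, options, operand) => resolved value.
type FunctionHandler = (
  ctx: { locales: string[]; dir: Direction },
  options: Record<string, unknown>,
  operand?: unknown
) => unknown;

interface FormattingContext {
  locales: string[];                          // current locale first, then any fallbacks
  dir: Direction;                             // base directionality of the message
  input: Record<string, unknown>;             // input mapping used by variable resolution
  functions: Record<string, FunctionHandler>; // function registry
  fallbackString?: string;                    // optional; used if the message is not valid
}

const exampleContext: FormattingContext = {
  locales: ["pl", "en"],
  dir: "LTR",
  input: { count: 3 },
  functions: { string: (_ctx, _options, operand) => String(operand) },
};
```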
A resolved value is the result of resolving a text, literal, variable, expression, or markup. The resolved value is determined using the formatting context. The form of the resolved value is implementation-defined.
In a declaration, the resolved value of an expression is bound to a variable, which makes it available for use in later expressions and markup options.
For example, in
.input {$a :number minimumFractionDigits=3}
.local $b = {$a :integer notation=compact}
.match $a
0 {{The value is zero.}}
* {{In compact form, the value {$a} is rendered as {$b}.}}
the resolved value bound to $a is used as the operand of the :integer function when resolving the value of the variable $b, as a selector in the .match statement, as well as for formatting the placeholder {$a}.
In an input-declaration, the variable operand of the variable-expression identifies not only the name of the external input value, but also the variable to which the resolved value of the variable-expression is bound.
In a pattern, the resolved value of an expression or markup is used in its formatting.
The form that resolved values take is implementation-dependent, and different implementations MAY choose to perform different levels of resolution.
While this specification does not require it, a resolved value could be implemented by requiring each function handler to return a value matching the following interface:
interface MessageValue {
  formatToString(): string
  formatToX(): X // where X is an implementation-defined type
  getValue(): unknown
  resolvedOptions(): { [key: string]: MessageValue }
  selectKeys(keys: string[]): string[]
}
With this approach:
- An expression could be used as a placeholder if calling the formatToString() or formatToX() method of its resolved value did not emit an error.
- A variable could be used as a selector if calling the selectKeys(keys) method of its resolved value did not emit an error.
- Using a variable, the resolved value of an expression could be used as an operand or option value if calling the getValue() method of its resolved value did not emit an error. In this use case, the resolvedOptions() method could also provide a set of option values that could be taken into account by the called function.
Extensions of the base MessageValue interface could be provided for different data types, such as numbers or strings, for which the unknown return type of getValue() and the generic MessageValue type used in resolvedOptions() could be narrowed appropriately. An implementation could also allow MessageValue values to be passed in as input variables, or automatically wrap each variable as a MessageValue to provide a uniform interface for custom functions.
Expressions are used in declarations and patterns. Markup is only used in patterns.
Depending on the presence or absence of a variable or literal operand and a function, the resolved value of the expression is determined as follows:
If the expression contains a function, its resolved value is defined by function resolution.
Else, if the expression consists of a variable, its resolved value is defined by variable resolution. An implementation MAY perform additional processing when resolving the value of an expression that consists only of a variable.
For example, it could apply function resolution using a function and a set of options chosen based on the value or type of the variable. So, given a message like this:
Today is {$date}
If the value passed in the variable were a date object, such as a JavaScript Date or a Java java.util.Date or java.time.Temporal, the implementation could interpret the placeholder {$date} as if the pattern included the function :datetime with some set of default options.
Else, the expression consists of a literal. Its resolved value is defined by literal resolution.
Note
This means that a literal value with no function is always treated as a string. To represent values that are not strings as a literal, a function needs to be provided:
.local $aNumber = {1234 :number}
.local $aDate = {|2023-08-30| :datetime}
.local $aFoo = {|some foo| :foo}
{{You have {42 :number}}}
The resolved value of a text or a literal contains the character sequence of the text or literal after any character escape has been converted to the escaped character.
When a literal is used as an operand or on the right-hand side of an option, the formatting function MUST treat its resolved value the same whether its value was originally a quoted literal or an unquoted literal.
For example, the option foo=42 and the option foo=|42| are treated as identical.
For example, in a JavaScript formatter the resolved value of a text or a literal could have the following implementation:
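class MessageLiteral implements MessageValue {
  // Fields assigned in the constructor are declared here.
  formatToString: () => string;
  getValue: () => string;
  constructor(value: string) {
    this.formatToString = () => value;
    this.getValue = () => value;
  }
  // An initialiser (=) is needed; a colon would only declare the property's type.
  resolvedOptions = () => ({});
  selectKeys(_keys: string[]): string[] {
    throw Error("Selection on unannotated literals is not supported");
  }
}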
To resolve the value of a variable, its name is used to identify either a local variable or an input variable. If a declaration exists for the variable, its resolved value is used. Otherwise, the variable is an implicit reference to an input value, and its value is looked up from the formatting context input mapping.
The resolution of a variable fails if no value is identified for its name. If this happens, an Unresolved Variable error is emitted. If a variable would resolve to a fallback value, this MUST also be considered a failure.
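The lookup order described above could be sketched as follows. The `Resolved` union, the `resolveVariable` helper, and the use of a plain string array to collect emitted errors are all assumptions for this example; the specification leaves these representations to the implementation.

```typescript
// Hypothetical shapes; the specification leaves resolved values implementation-defined.
type Resolved =
  | { kind: "value"; value: unknown }
  | { kind: "fallback"; source: string };

function resolveVariable(
  varName: string,
  locals: Record<string, Resolved>, // values bound by earlier declarations
  input: Record<string, unknown>,   // input mapping from the formatting context
  errors: string[]                  // collects the names of emitted errors
): Resolved {
  // A declaration takes precedence over the input mapping.
  if (Object.prototype.hasOwnProperty.call(locals, varName)) {
    // If this is a fallback value, the caller treats the resolution as failed.
    return locals[varName] as Resolved;
  }
  // Otherwise the variable is an implicit reference to an input value.
  if (Object.prototype.hasOwnProperty.call(input, varName)) {
    return { kind: "value", value: input[varName] };
  }
  // No value identified for the name: emit the error and fail.
  errors.push("Unresolved Variable");
  return { kind: "fallback", source: "$" + varName };
}
```

In this sketch a caller checks `kind === "fallback"` to detect failure, which covers both the missing-name case and the case where a declared variable already holds a fallback value.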
To resolve an expression with a function, the following steps are taken:
- If the expression includes an operand, resolve its value. If this fails, use a fallback value for the expression.
- Resolve the identifier of the function and, based on the starting sigil, find the appropriate function handler to call. If the implementation cannot find the function handler, or if the identifier includes a namespace that the implementation does not support, emit an Unknown Function error and use a fallback value for the expression.
  Implementations are not required to implement namespaces or installable function registries.
- Perform option resolution.
- Determine the function context for calling the function handler.
  The function context contains the context necessary for the function handler to resolve the expression. This includes:
  - The current locale, potentially including a fallback chain of locales.
  - The base directionality of the expression. By default, this is undefined or empty.
  If the resolved mapping of options includes any u: options supported by the implementation, process them as specified. Such u: options MAY be removed from the resolved mapping of options.
- Call the function handler with the following arguments:
  - The function context.
  - The resolved mapping of options.
  - If the expression includes an operand, its resolved value.
  The form that resolved operand and option values take is implementation-defined.
  An implementation MAY pass additional arguments to the function handler, as long as reasonable precautions are taken to keep the function interface simple and minimal, and avoid introducing potential security vulnerabilities.
- If the call succeeds, resolve the value of the expression as the result of that function call.
  If the call fails or does not return a valid value, emit the appropriate Message Function Error for the failure.
  Implementations MAY provide a mechanism for the function handler to provide additional detail about internal failures. Specifically, if the cause of the failure was that the datatype, value, or format of the operand did not match that expected by the function, the function SHOULD cause a Bad Operand error to be emitted.
  In all failure cases, use the fallback value for the expression as its resolved value.
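The steps above could be sketched roughly as follows, assuming the operand and options have already been resolved. The types `FnContext`, `Handler`, and `Outcome`, the `resolveFunction` helper, the string-array error collector, and the choice of `u:dir` as the one supported u: option are assumptions for this example, not definitions from this specification.

```typescript
// Hypothetical types for illustration; none of these names come from the specification.
type FnContext = { locales: string[]; dir?: unknown };
type Handler = (ctx: FnContext, options: Record<string, unknown>, operand?: unknown) => unknown;
type Outcome = { ok: true; value: unknown } | { ok: false; fallback: string };

function resolveFunction(
  fnName: string,
  registry: Record<string, Handler>,
  options: Record<string, unknown>, // result of option resolution
  locales: string[],
  fallback: string,                 // precomputed fallback value for this expression
  errors: string[],                 // collects the names of emitted errors
  operand?: unknown                 // resolved operand, if the expression has one
): Outcome {
  // Find the function handler.
  const handler = Object.prototype.hasOwnProperty.call(registry, fnName)
    ? registry[fnName]
    : undefined;
  if (handler === undefined) {
    errors.push("Unknown Function");
    return { ok: false, fallback };
  }
  // Build the function context; a supported u: option is processed and removed.
  const ctx: FnContext = { locales };
  const passed: Record<string, unknown> = {};
  for (const key of Object.keys(options)) {
    if (key === "u:dir") ctx.dir = options[key];
    else passed[key] = options[key];
  }
  // Call the handler; any failure yields the fallback value.
  try {
    const value = handler(ctx, passed, operand);
    if (value === undefined) throw new Error("handler returned no valid value");
    return { ok: true, value };
  } catch {
    errors.push("Message Function Error");
    return { ok: false, fallback };
  }
}
```

A fuller implementation would distinguish the specific Message Function Error, such as Bad Operand, instead of recording a single generic name.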
A function handler is an implementation-defined process such as a function or method which accepts a set of arguments and returns a resolved value. A function handler is required to resolve a function.
An implementation MAY define its own functions and their handlers. An implementation MAY allow custom functions to be defined by users.
Implementations that provide a means for defining custom functions MUST provide a means for function handlers to return resolved values that contain enough information to be used as operands or option values in subsequent expressions.
The resolved value returned by a function handler MAY be different from the value of the operand of the function. It MAY be an implementation specified type. It is not required to be the same type as the operand.
A function handler MAY include resolved options in its resolved value. The resolved options MAY be different from the options of the function.
A function handler SHOULD emit a Bad Operand error for operands whose resolved value or type is not supported.
Function handler access to the formatting context MUST be minimal and read-only, and execution time SHOULD be limited.
Implementation-defined functions SHOULD use an implementation-defined namespace.
Option resolution is the process of computing the options for a given expression. Option resolution results in a mapping of string identifiers to values. The order of options MUST NOT be significant.
For example, the following message treats both placeholders identically:
{$x :function option1=foo option2=bar} {$x :function option2=bar option1=foo}
For each option:
- Resolve the identifier of the option.
- If the option's right-hand side successfully resolves to a value, bind the identifier of the option to the resolved value in the mapping.
- Otherwise, bind the identifier of the option to an unresolved value in the mapping. Implementations MAY later remove this value before calling the function. (Note that an Unresolved Variable error will have been emitted.)
Errors MAY be emitted during option resolution, but it always resolves to some mapping of string identifiers to values. This mapping can be empty.
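The option resolution steps above could be sketched as follows. The `OptionValue` shape, the `UNRESOLVED` marker object, and the `resolveOptions` helper are assumptions for this example; in particular, the specification does not say how an unresolved value is represented.

```typescript
// Hypothetical option syntax: the right-hand side is a literal or a variable reference.
type OptionValue =
  | { type: "literal"; value: string }
  | { type: "variable"; name: string };

// Hypothetical marker object standing in for "an unresolved value".
const UNRESOLVED = { unresolved: true };

function resolveOptions(
  options: { id: string; value: OptionValue }[],
  variables: Record<string, unknown>, // values available to variable resolution
  errors: string[]                    // collects the names of emitted errors
): Record<string, unknown> {
  const resolved: Record<string, unknown> = {};
  for (const opt of options) {
    const rhs = opt.value;
    if (rhs.type === "literal") {
      resolved[opt.id] = rhs.value;
    } else if (Object.prototype.hasOwnProperty.call(variables, rhs.name)) {
      resolved[opt.id] = variables[rhs.name];
    } else {
      // The variable failed to resolve: record the error, bind the marker.
      errors.push("Unresolved Variable");
      resolved[opt.id] = UNRESOLVED;
    }
  }
  return resolved; // always a mapping, possibly empty
}
```

Because the result is a mapping keyed by option identifier, the order in which the options were written does not affect what the function handler receives.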
Unlike functions, the resolution of markup is not customizable.
The resolved value of markup includes the following fields:
- The type of the markup: open, standalone, or close
- The identifier of the markup
- The resolved options values after option resolution.
If the resolved mapping of options includes any u: options supported by the implementation, process them as specified. Such u: options MAY be removed from the resolved mapping of options.
The resolution of markup MUST always succeed.
A fallback value is the resolved value for an expression that fails to resolve.
An expression fails to resolve when:
- A variable used as an operand (with or without a function) fails to resolve.
- Note that this does not include a variable used as an option value.
- A function fails to resolve.
The fallback value depends on the contents of the expression:
- expression with a literal operand (either quoted or unquoted): U+007C VERTICAL LINE | followed by the value of the literal with escaping applied to U+005C REVERSE SOLIDUS \ and U+007C VERTICAL LINE |, and then by U+007C VERTICAL LINE |.
  Examples: In a context where :func fails to resolve, {42 :func} resolves to the fallback value |42| and {|C:\\| :func} resolves to the fallback value |C:\\|.
- expression with variable operand referring to a local declaration (with or without a function): the value to which it resolves (which may already be a fallback value)
  Examples: In a context where :func fails to resolve, the pattern's expression in
  .local $var={|val|}
  {{{$var :func}}}
  resolves to the fallback value |val| and the message formats to {|val|}. In a context where :now fails to resolve but :datetime does not, the pattern's expression in
  .local $t = {:now format=iso8601}
  .local $pretty_t = {$t :datetime}
  {{{$pretty_t}}}
  (transitively) resolves to the fallback value :now and the message formats to {:now}.
- expression with variable operand not referring to a local declaration (with or without a function): U+0024 DOLLAR SIGN $ followed by the name of the variable
  Examples: In a context where $var fails to resolve, {$var} and {$var :number} both resolve to the fallback value $var. In a context where :func fails to resolve, the pattern's expression in
  .input $arg
  {{{$arg :func}}}
  resolves to the fallback value $arg and the message formats to {$arg}.
- function expression with no operand: U+003A COLON : followed by the function identifier
  Examples: In a context where :func fails to resolve, {:func} resolves to the fallback value :func. In a context where :ns:func fails to resolve, {:ns:func} resolves to the fallback value :ns:func.
- Otherwise: the U+FFFD REPLACEMENT CHARACTER �
  This is not currently used by any expression, but may apply in future revisions.
Option identifiers and values are not included in the fallback value.
Pattern selection is not supported for fallback values.
For example, in a JavaScript formatter the fallback value could have the following implementation, where source is one of the above-defined strings:
class MessageFallback implements MessageValue {
  // Fields assigned in the constructor are declared here.
  formatToString: () => string;
  getValue: () => undefined;
  constructor(source: string) {
    this.formatToString = () => `{${source}}`;
    this.getValue = () => undefined;
  }
  // An initialiser (=) is needed; a colon would only declare the property's type.
  resolvedOptions = () => ({});
  selectKeys(_keys: string[]): string[] {
    throw Error("Selection on fallback values is not supported");
  }
}
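The source strings listed above could be computed along these lines. The `Expr` and `Operand` shapes and the `fallbackSource` helper are hypothetical, and this sketch deliberately leaves out the case of a variable that refers to a local declaration, which needs the value that variable resolves to.

```typescript
// Hypothetical, simplified expression shape used only for this sketch.
type Operand =
  | { type: "literal"; value: string } // value after unescaping
  | { type: "variable"; name: string }; // assumed NOT to refer to a local declaration
interface Expr {
  operand?: Operand;
  functionId?: string; // e.g. "func" or "ns:func", without the leading colon
}

function fallbackSource(expr: Expr): string {
  const operand = expr.operand;
  if (operand !== undefined && operand.type === "literal") {
    // Escape REVERSE SOLIDUS and VERTICAL LINE, then wrap in VERTICAL LINEs.
    const escaped = operand.value.replace(/[\\|]/g, (ch) => "\\" + ch);
    return "|" + escaped + "|";
  }
  if (operand !== undefined) {
    return "$" + operand.name; // DOLLAR SIGN followed by the variable name
  }
  if (expr.functionId !== undefined) {
    return ":" + expr.functionId; // COLON followed by the function identifier
  }
  return "\uFFFD"; // REPLACEMENT CHARACTER
}

console.log(fallbackSource({ operand: { type: "literal", value: "42" }, functionId: "func" })); // |42|
console.log(fallbackSource({ functionId: "ns:func" })); // :ns:func
```

Option identifiers and values never appear in the result, since the function only inspects the operand and the function identifier.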
If the message being formatted is not well-formed and valid, the result of pattern selection is a pattern consisting of a single fallback value using the message's fallback string defined in the formatting context or, if this is not available or empty, the U+FFFD REPLACEMENT CHARACTER �.
If the message being formatted does not contain a matcher, the result of pattern selection is its pattern value.
When a message contains a matcher with one or more selectors, the implementation needs to determine which variant will be used to provide the pattern for the formatting operation. This is done by ordering and filtering the available variant statements according to their key values and selecting the first one.
Note
At least one variant is required to have all of its keys consist of the fallback value *. Some selectors might be implemented in a way that the key value * cannot be selected in a valid message. In other cases, this key value might be unreachable only in certain locales. This could result in the need in some locales to create one or more variants that do not make sense grammatically for that language.
For example, in the pl (Polish) locale, this message cannot reach the * variant:
.input {$num :integer}
.match $num
0 {{ }}
one {{ }}
few {{ }}
many {{ }}
* {{Only used by fractions in Polish.}}
In the Tech Preview, feedback from users and implementers is desired about whether to relax the requirement that such a "fallback variant" appear in every message, versus the potential for a message to fail at runtime because no matching variant is available.
The number of keys in each variant MUST equal the number of selectors.
Each key corresponds to a selector by its position in the variant.
For example, in this message:
.input {$one :number}
.input {$two :number}
.input {$three :number}
.match $one $two $three
1 2 3 {{ ... }}
The first key 1 corresponds to the first selector ($one), the second key 2 to the second selector ($two), and the third key 3 to the third selector ($three).
To determine which variant best matches a given set of inputs, each selector is used in turn to order and filter the list of variants.
Each variant with a key that does not match its corresponding selector is omitted from the list of variants. The remaining variants are sorted according to the selector's key-ordering preference. Earlier selectors in the matcher's list of selectors have a higher priority than later ones.
When all of the selectors have been processed, the earliest-sorted variant in the remaining list of variants is selected.
This selection method is defined in more detail below. An implementation MAY use any pattern selection method, as long as its observable behavior matches the results of the method defined here.
First, resolve the values of each selector:
- Let res be a new empty list of resolved values that support selection.
- For each selector sel, in source order,
  - Let rv be the resolved value of sel.
  - If selection is supported for rv:
    - Append rv as the last element of the list res.
  - Else:
    - Let nomatch be a resolved value for which selection always fails.
    - Append nomatch as the last element of the list res.
    - Emit a Bad Selector error.
The form of the resolved values is determined by each implementation, along with the manner of determining their support for selection.
Next, using res, resolve the preferential order for all message keys:
- Let pref be a new empty list of lists of strings.
- For each index i in res:
  - Let keys be a new empty list of strings.
  - For each variant var of the message:
    - Let key be the var key at position i.
    - If key is not the catch-all key '*':
      - Assert that key is a literal.
      - Let ks be the resolved value of key in Unicode Normalization Form C.
      - Append ks as the last element of the list keys.
  - Let rv be the resolved value at index i of res.
  - Let matches be the result of calling the method MatchSelectorKeys(rv, keys)
  - Append matches as the last element of the list pref.
The method MatchSelectorKeys is determined by the implementation. It takes as arguments a resolved selector value rv and a list of string keys keys, and returns a list of string keys in preferential order. The returned list MUST contain only unique elements of the input list keys. The returned list MAY be empty. The most-preferred key is first, with each successive key appearing in order by decreasing preference.
The resolved value of each key MUST be in Unicode Normalization Form C ("NFC"), even if the literal for the key is not.
If calling MatchSelectorKeys encounters any error, a Bad Selector error is emitted and an empty list is returned.
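Two possible MatchSelectorKeys implementations are sketched below: one for a string selector using exact comparison, and one for a number selector that prefers an exact numeric key over a plural category. Both function names are hypothetical, the keys are assumed to be in NFC already, and the one-line plural rule is only a stand-in for real locale data such as the CLDR plural rules.

```typescript
// String selector: a key matches only if it equals the selector value.
function matchStringKeys(value: string, keys: string[]): string[] {
  return keys.indexOf(value) >= 0 ? [value] : [];
}

// Number selector: exact numeric keys are preferred over the plural category.
function matchNumberKeys(value: number, keys: string[]): string[] {
  // Stand-in for locale plural rules; covers only the English "one"/"other" split
  // for integers, and is NOT a substitute for CLDR data.
  const category = value === 1 ? "one" : "other";
  const exact: string[] = [];
  const byCategory: string[] = [];
  for (const key of keys) {
    // Keep the returned list free of duplicates.
    if (exact.indexOf(key) >= 0 || byCategory.indexOf(key) >= 0) continue;
    if (key === String(value)) exact.push(key);
    else if (key === category) byCategory.push(key);
  }
  return exact.concat(byCategory); // most-preferred key first
}

console.log(matchStringKeys("foo", ["bar", "foo"])); // [ 'foo' ]
console.log(matchNumberKeys(1, ["one", "1"]));       // [ '1', 'one' ]
```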
Then, using the preferential key orders pref, filter the list of variants to the ones that match with some preference:
- Let vars be a new empty list of variants.
- For each variant var of the message:
  - For each index i in pref:
    - Let key be the var key at position i.
    - If key is the catch-all key '*':
      - Continue the inner loop on pref.
    - Assert that key is a literal.
    - Let ks be the resolved value of key.
    - Let matches be the list of strings at index i of pref.
    - If matches includes ks:
      - Continue the inner loop on pref.
    - Else:
      - Continue the outer loop on message variants.
  - Append var as the last element of the list vars.
Finally, sort the list of variants vars and select the pattern:
- Let sortable be a new empty list of (integer, variant) tuples.
- For each variant var of vars:
  - Let tuple be a new tuple (-1, var).
  - Append tuple as the last element of the list sortable.
- Let len be the integer count of items in pref.
- Let i be len - 1.
- While i >= 0:
  - Let matches be the list of strings at index i of pref.
  - Let minpref be the integer count of items in matches.
  - For each tuple tuple of sortable:
    - Let matchpref be an integer with the value minpref.
    - Let key be the tuple variant key at position i.
    - If key is not the catch-all key '*':
      - Assert that key is a literal.
      - Let ks be the resolved value of key.
      - Let matchpref be the integer position of ks in matches.
    - Set the tuple integer value as matchpref.
  - Set sortable to be the result of calling the method SortVariants(sortable).
  - Set i to be i - 1.
- Let var be the variant element of the first element of sortable.
- Select the pattern of var.
SortVariants is a method whose single argument is a list of (integer, variant) tuples. It returns a list of (integer, variant) tuples. Any implementation of SortVariants is acceptable as long as it satisfies the following requirements:
- Let sortable be an arbitrary list of (integer, variant) tuples.
- Let sorted be SortVariants(sortable).
- sorted is the result of sorting sortable using the following comparator:
  - (i1, v1) <= (i2, v2) if and only if i1 <= i2.
- The sort is stable (pairs of tuples from sortable that are equal in their first element have the same relative order in sorted).
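The preference, filtering, and sorting steps above can be combined into one routine, sketched here under some simplifying assumptions. The selectors are taken as already resolved and are represented by a MatchSelectorKeys function bound to each selector value, so the Bad Selector handling of the first step is omitted. The names `Variant`, `MatchKeys`, `Scored`, `sortVariants`, and `selectPattern` are hypothetical, patterns are plain strings, and the string `"*"` stands for the catch-all key (a real implementation would need to tell it apart from a quoted literal with the same content).

```typescript
// Hypothetical shapes for this sketch; "*" stands for the catch-all key.
interface Variant { keys: string[]; pattern: string }
// MatchSelectorKeys, already bound to one resolved selector value.
type MatchKeys = (keys: string[]) => string[];
interface Scored { score: number; variant: Variant }

// Stable ascending sort on the integer element; ties keep their input order.
function sortVariants(sortable: Scored[]): Scored[] {
  return sortable
    .map((item, index) => ({ item, index }))
    .sort((a, b) => a.item.score - b.item.score || a.index - b.index)
    .map((entry) => entry.item);
}

function selectPattern(selectors: MatchKeys[], variants: Variant[]): string | undefined {
  // Preferential key order for each selector.
  const pref: string[][] = selectors.map((match, i) => {
    const keys: string[] = [];
    for (const v of variants) {
      const key = v.keys[i];
      if (key !== undefined && key !== "*") keys.push(key);
    }
    return match(keys);
  });

  // Keep only variants whose every key is "*" or appears in pref.
  const vars = variants.filter((v) =>
    pref.every((matches, i) => {
      const key = v.keys[i];
      return key === "*" || (key !== undefined && matches.indexOf(key) >= 0);
    })
  );

  // Sort by each selector, last to first, so that earlier selectors take priority.
  let sortable: Scored[] = vars.map((variant) => ({ score: -1, variant }));
  for (let i = pref.length - 1; i >= 0; i--) {
    const matches = pref[i] as string[];
    for (const t of sortable) {
      const key = t.variant.keys[i];
      t.score = key === undefined || key === "*" ? matches.length : matches.indexOf(key);
    }
    sortable = sortVariants(sortable);
  }
  const best = sortable[0];
  return best === undefined ? undefined : best.variant.pattern;
}

// Usage: two string selectors whose values are 'foo' and 'bar'.
const eq = (value: string): MatchKeys => (keys) => (keys.indexOf(value) >= 0 ? [value] : []);
const chosen = selectPattern([eq("foo"), eq("bar")], [
  { keys: ["*", "bar"], pattern: "Any and bar" },
  { keys: ["foo", "*"], pattern: "Foo and any" },
  { keys: ["foo", "bar"], pattern: "Foo and bar" },
  { keys: ["*", "*"], pattern: "Otherwise" },
]);
console.log(chosen); // Foo and bar
```

The explicit index tiebreak in `sortVariants` makes stability independent of the engine's sort, which is what the SortVariants requirements ask for.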
This section is non-normative.
Presuming a minimal implementation which only supports the :string function, which matches keys by using string comparison, and a formatting context in which the variable reference $foo resolves to the string 'foo' and the variable reference $bar resolves to the string 'bar', pattern selection proceeds as follows for this message:
.input {$foo :string}
.input {$bar :string}
.match $foo $bar
bar bar {{All bar}}
foo foo {{All foo}}
* * {{Otherwise}}
- For the first selector:
  The value of the selector is resolved to be 'foo'. The available keys « 'bar', 'foo' » are compared to 'foo', resulting in a list « 'foo' » of matching keys.
- For the second selector:
  The value of the selector is resolved to be 'bar'. The available keys « 'bar', 'foo' » are compared to 'bar', resulting in a list « 'bar' » of matching keys.
- Creating the list vars of variants matching all keys:
  The first variant bar bar is discarded as its first key does not match the first selector. The second variant foo foo is discarded as its second key does not match the second selector. The catch-all keys of the third variant * * always match, and this is added to vars, resulting in a list « * * » of variants.
- As the list vars only has one entry, it does not need to be sorted. The pattern Otherwise of the third variant is selected.
Alternatively, with the same implementation and formatting context as in Example 1, pattern selection would proceed as follows for this message:
.input {$foo :string}
.input {$bar :string}
.match $foo $bar
* bar {{Any and bar}}
foo * {{Foo and any}}
foo bar {{Foo and bar}}
* * {{Otherwise}}
- For the first selector:
  The value of the selector is resolved to be 'foo'. The available keys « 'foo' » are compared to 'foo', resulting in a list « 'foo' » of matching keys.
- For the second selector:
  The value of the selector is resolved to be 'bar'. The available keys « 'bar' » are compared to 'bar', resulting in a list « 'bar' » of matching keys.
- Creating the list vars of variants matching all keys:
  The keys of all variants either match each selector exactly, or via the catch-all key, resulting in a list « * bar, foo *, foo bar, * * » of variants.
- Sorting the variants:
  The list sortable is first set with the variants in their source order and scores determined by the second selector:
  « ( 0, * bar ), ( 1, foo * ), ( 0, foo bar ), ( 1, * * ) »
  This is then sorted as:
  « ( 0, * bar ), ( 0, foo bar ), ( 1, foo * ), ( 1, * * ) ».
  To sort according to the first selector, the scores are updated to:
  « ( 1, * bar ), ( 0, foo bar ), ( 0, foo * ), ( 1, * * ) ».
  This is then sorted as:
  « ( 0, foo bar ), ( 0, foo * ), ( 1, * bar ), ( 1, * * ) ».
- The pattern Foo and bar of the most preferred foo bar variant is selected.
A more-complex example is the matching found in selection APIs such as ICU's PluralFormat. Suppose that this API is represented here by the function :number. This :number function can match a given numeric value to a specific number literal and also to a plural category (zero, one, two, few, many, other) according to locale rules defined in CLDR.
Given a variable reference $count whose value resolves to the number 1 and an en (English) locale, the pattern selection proceeds as follows for this message:
.input {$count :number}
.match $count
one {{Category match for {$count}}}
1 {{Exact match for {$count}}}
* {{Other match for {$count}}}
- For the selector:
  The value of the selector is resolved to an implementation-defined value that is capable of performing English plural category selection on the value 1. The available keys « 'one', '1' » are passed to the implementation's MatchSelectorKeys method, resulting in a list « '1', 'one' » of matching keys.
- Creating the list vars of variants matching all keys:
  The keys of all variants are included in the list of matching keys, or use the catch-all key, resulting in a list « one, 1, * » of variants.
- Sorting the variants:
  The list sortable is first set with the variants in their source order and scores determined by the selector key order:
  « ( 1, one ), ( 0, 1 ), ( 2, * ) »
  This is then sorted as:
  « ( 0, 1 ), ( 1, one ), ( 2, * ) »
- The pattern Exact match for {$count} of the most preferred 1 variant is selected.
After pattern selection, each text and placeholder part of the selected pattern is resolved and formatted.
Resolved values cannot always be formatted by a given implementation. When such an error occurs during formatting, an appropriate Message Function Error is emitted and a fallback value is used for the placeholder with the error.
Implementations MAY represent the result of formatting using the most appropriate data type or structure. Some examples of these include:
- A single string concatenated from the parts of the resolved pattern.
- A string with associated attributes for portions of its text.
- A flat sequence of objects corresponding to each resolved value.
- A hierarchical structure of objects that group spans of resolved values, such as sequences delimited by markup-open and markup-close placeholders.
Implementations SHOULD provide formatting result types that match user needs, including situations that require further processing of formatted messages. Implementations SHOULD encourage users to consider a formatted localised string as an opaque data structure, suitable only for presentation.
When formatting to a string, the default representation of all markup MUST be an empty string. Implementations MAY offer functionality for customizing this, such as by emitting XML-ish tags for each markup.
This section is non-normative.
- An implementation might choose to return an interstitial object so that the caller can "decorate" portions of the formatted value. In ICU4J, the NumberFormatter class returns a FormattedNumber object, so a pattern such as This is my number {42 :number} might return the character sequence This is my number followed by a FormattedNumber object representing the value 42 in the current locale.
- A formatter in a web browser could format a message as a DOM fragment rather than as a representation of its HTML source.
If the resolved pattern includes any fallback values and the formatting result is a concatenated string or a sequence of strings, the string representation of each fallback value MUST be the concatenation of a U+007B LEFT CURLY BRACKET {, the fallback value as a string, and a U+007D RIGHT CURLY BRACKET }.
For example, a message that is not well-formed would format to a string as {�}, unless a fallback string is defined in the formatting context, in which case that string would be used instead.
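Formatting a resolved pattern to a single string, with the fallback and markup rules described above, might look like the following sketch. The `Part` union and the `formatPartsToString` helper are assumptions for this example; bidirectional isolation is not applied here.

```typescript
// Hypothetical representation of the parts of a resolved pattern.
type Part =
  | { type: "text"; value: string }
  | { type: "value"; formatToString: () => string }
  | { type: "fallback"; source: string }
  | { type: "markup" };

function formatPartsToString(parts: Part[]): string {
  let out = "";
  for (const part of parts) {
    switch (part.type) {
      case "text":
        out += part.value;
        break;
      case "value":
        out += part.formatToString();
        break;
      case "fallback":
        // LEFT CURLY BRACKET, the fallback value as a string, RIGHT CURLY BRACKET.
        out += "{" + part.source + "}";
        break;
      case "markup":
        // The default string representation of markup is an empty string.
        break;
    }
  }
  return out;
}

console.log(
  formatPartsToString([
    { type: "text", value: "Hello " },
    { type: "markup" },
    { type: "fallback", source: "$name" },
    { type: "markup" },
    { type: "value", formatToString: () => "!" },
  ])
); // Hello {$name}!
```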
Messages contain text. Any text can be bidirectional text. That is, the text can consist of a mixture of left-to-right and right-to-left spans of text. The display of bidirectional text is defined by the Unicode Bidirectional Algorithm [UAX9].
The directionality of the formatted message as a whole is provided by the formatting context.
Note
Keep in mind the difference between the formatted output of a message, which is the topic of this section, and the syntax of a message prior to formatting. The processing of a message depends on the logical sequence of Unicode code points, not on the presentation of the message. Affordances to allow users appropriate control over the appearance of the message's syntax have been provided.
When a message is formatted, placeholders are replaced with their formatted representation. Applying the Unicode Bidirectional Algorithm to the text of a formatted message (including its formatted parts) can result in unexpected or undesirable spillover effects. Applying bidi isolation to each affected formatted value helps avoid this spillover in a formatted message.
Note that both the message and, separately, each placeholder need to have direction metadata for this to work. If an implementation supports formatting to something other than a string (such as a sequence of parts), the directionality of each formatted placeholder needs to be available to the caller.
If a formatted expression itself contains spans with differing directionality, its formatter SHOULD perform any necessary processing, such as inserting controls or isolating such parts to ensure that the formatted value displays correctly in a plain text context.
For example, an implementation could provide a :currency formatting function which inserts strongly directional characters, such as U+200F RIGHT-TO-LEFT MARK (RLM), U+200E LEFT-TO-RIGHT MARK (LRM), or U+061C ARABIC LETTER MARK (ALM), to coerce proper display of the sign and currency symbol next to a formatted number. An example of this is formatting the value -1234.56 as the currency AED in the ar-AE locale. The formatted value appears like this:
-1,234.56 د.إ.
The code point sequence for this string, as produced by the ICU4J NumberFormat function, includes U+200F U+200E at the start and U+200F at the end of the string. If it did not do this, the same string would appear like this instead:
A bidirectional isolation strategy is functionality in the formatter's processing of a message that produces bidirectional output text that is ready for display.
The Default Bidi Strategy is a bidirectional isolation strategy that uses isolating Unicode control characters around placeholders' formatted values. It is primarily intended for use in plain-text strings, where markup or other mechanisms are not available. Implementations MUST provide the Default Bidi Strategy as one of the bidirectional isolation strategies.
Implementations MAY provide other bidirectional isolation strategies.
Implementations MAY supply a bidirectional isolation strategy that performs no processing.
The Default Bidi Strategy is defined as follows:
- Let msgdir be the directionality of the whole message, one of « 'LTR', 'RTL', 'unknown' ». These correspond to the message having left-to-right directionality, right-to-left directionality, and to the message's directionality not being known.
- For each expression exp in pattern:
  - Let fmt be the formatted string representation of the resolved value of exp.
  - Let dir be the directionality of fmt, one of « 'LTR', 'RTL', 'unknown' », with the same meanings as for msgdir.
  - If dir is 'LTR':
    - If msgdir is 'LTR', in the formatted output, let fmt be itself.
    - Else, in the formatted output, prefix fmt with U+2066 LEFT-TO-RIGHT ISOLATE and postfix it with U+2069 POP DIRECTIONAL ISOLATE.
  - Else, if dir is 'RTL':
    - In the formatted output, prefix fmt with U+2067 RIGHT-TO-LEFT ISOLATE and postfix it with U+2069 POP DIRECTIONAL ISOLATE.
  - Else:
    - In the formatted output, prefix fmt with U+2068 FIRST STRONG ISOLATE and postfix it with U+2069 POP DIRECTIONAL ISOLATE.
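The per-expression part of the Default Bidi Strategy can be sketched as a small function. The name `isolate` and the `Dir` type are assumptions for this example, and how `dir` is determined for a given formatted value is left to the caller.

```typescript
type Dir = "LTR" | "RTL" | "unknown";

const LRI = "\u2066"; // LEFT-TO-RIGHT ISOLATE
const RLI = "\u2067"; // RIGHT-TO-LEFT ISOLATE
const FSI = "\u2068"; // FIRST STRONG ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

// Returns the text to place in the formatted output for one expression.
function isolate(fmt: string, dir: Dir, msgdir: Dir): string {
  if (dir === "LTR") {
    // An LTR value in an LTR message needs no isolation.
    return msgdir === "LTR" ? fmt : LRI + fmt + PDI;
  }
  if (dir === "RTL") {
    return RLI + fmt + PDI;
  }
  // Direction not known: let the first strong character decide.
  return FSI + fmt + PDI;
}

console.log(isolate("abc", "LTR", "LTR") === "abc");             // true
console.log(isolate("abc", "LTR", "RTL") === "\u2066abc\u2069"); // true
```

Note the asymmetry in the steps above: an RTL value is always wrapped, even in an RTL message, while an LTR value is left bare only when the message itself is LTR.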