Add design doc for expression attributes #458

Merged 1 commit on Sep 4, 2023
137 changes: 137 additions & 0 deletions exploration/0002-expression-attributes.md
@@ -0,0 +1,137 @@
# Expression Attributes

<details>
<summary>Metadata</summary>
<dl>
<dt>Contributors</dt>
<dd>@eemeli</dd>
<dt>First proposed</dt>
<dd>2023-08-27</dd>
<dt>Pull Request</dt>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/458">#458</a></dd>
</dl>
</details>

## Objective

Define how attributes may be attached to expressions.

## Background

Function options may influence the resolution, selection, and formatting of annotated expressions.
These provide a great solution for options like `minFractionDigits`, `dateStyle`,
or other similar factors that influence the formatted result.

However, this single bag of options is not appropriate in all cases,
in particular for attributes that pertain to the expression as a selector or a placeholder.
For example, many of the [XLIFF 2 inline element] attributes don't really make sense as function options.

[XLIFF 2 inline element]: http://docs.oasis-open.org/xliff/xliff-core/v2.1/os/xliff-core-v2.1-os.html#inlineelements

## Use-Cases

At least the following expression attributes should be considered:

- Attributes with a formatting runtime impact:

- `fallback` — A value to use instead of the default fallback,
should the expression's primary formatting fail in some way.
- `locale` — An override for the locale used to format the expression.
Should be expressed as a non-empty sequence of BCP 47 language tags.
- `dir` — An override for the LTR/RTL/auto directionality of the expression.

- Attributes relevant for translators, tools, and other message operations,
but with no runtime impact:

- `example` — A literal value representing
what the expression's formatted value will look like.
- `note` — A comment on the expression for translators.
- `translate` — A boolean `yes`/`no` indicator communicating to translators
whether the expression should or should not be localised.
- `canCopy`, `canDelete`, `canOverlap`, `canReorder`, etc. — Flags supported by
XLIFF 2 inline elements.
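
As a rough illustration of the `fallback` use case, the following Python sketch shows one way a formatter might choose between its default fallback and one supplied by an attribute. The function name `format_with_fallback`, its parameters, and the brace-wrapped default fallback are all assumptions made for this example; none of them is defined by the proposal.

```python
from typing import Callable, Mapping


def format_with_fallback(
    format_value: Callable[[object], str],
    operand: object,
    source: str,
    attributes: Mapping[str, str],
) -> str:
    """Format `operand`, falling back if the formatting function raises.

    Assumption: the default fallback is the expression source wrapped in
    braces; an implementation may choose a different default.
    """
    try:
        return format_value(operand)
    except Exception:
        # Prefer the `fallback` attribute when the message author set one.
        return attributes.get("fallback", "{" + source + "}")
```

In this sketch the attribute only changes what is shown on failure; a successful format call ignores it entirely.
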

## Requirements

Attributes can be assigned to any expression,
including expressions without an annotation.

Attributes are distinct from function options.

Common attributes are defined by the MF2 specification
and must be supported by all implementations.

Users may define their own attributes.

Implementations may define their own attributes.

Some attributes may have an effect on the formatting of an expression.
Such attributes cannot be defined within comments, whether inside or outside a message.

Each attribute relates to a specific expression.

An attribute's scope is limited to the expression to which it relates.

Multiple attributes should be assignable to a single expression.

Attributes should be assignable to all expressions, not just placeholders.

## Constraints

If supported by new syntax,
the syntax should be easy to parse by both humans and machines.

If supported by new syntax at the end of an expression,
the reserved/private-use rules will need to be adjusted to support attributes.

## Proposed Design

Add support for option-like `@key=value` attribute pairs at the end of any expression.

If the syntax for function options is extended to support flag-like options
(see <a href="https://github.com/unicode-org/message-format-wg/issues/386">#386</a>),
also extend expression attribute syntax to match.

To distinguish expression attributes from options,
require `@` as a prefix for each attribute assignment.
Examples: `@translate=yes` and `@locale=$exprLocale`.
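
To make the distinction between options and `@`-prefixed attributes concrete, here is a minimal Python sketch that separates the two inside an expression body. The regular expression is a deliberately simplified stand-in for the real MF2 grammar, and the names `PAIR`, `ParsedPairs`, and `split_options_and_attributes` are hypothetical.

```python
import re
from typing import NamedTuple

# Simplified, hypothetical token pattern: an optional "@" prefix, a name,
# "=", and a value that is either a |quoted literal| or a run of
# non-space characters. The real MF2 grammar is richer than this.
PAIR = re.compile(r"(@?)([A-Za-z][\w-]*)=(\|[^|]*\||\S+)")


class ParsedPairs(NamedTuple):
    options: dict[str, str]
    attributes: dict[str, str]


def split_options_and_attributes(expression_body: str) -> ParsedPairs:
    """Separate `key=value` options from `@key=value` attributes."""
    options: dict[str, str] = {}
    attributes: dict[str, str] = {}
    seen_attribute = False
    for match in PAIR.finditer(expression_body):
        prefix, name, value = match.groups()
        if prefix == "@":
            seen_attribute = True
            attributes[name] = value
        elif seen_attribute:
            # The proposal places attributes at the end of the expression.
            raise ValueError(f"option {name!r} appears after an attribute")
        else:
            options[name] = value
    return ParsedPairs(options, attributes)
```

For an input such as `$count :number minFractionDigits=2 @example=2 @translate=no`, the sketch puts `minFractionDigits` in the options and `example` and `translate` in the attributes, so the two sets never share a namespace.
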

Define the meaning and supported values of some expression attributes in the specification,
including at least `@dir` and `@locale`.

To support later extension of the specified set of attributes while allowing user extensibility,
require custom attribute names to include a U+002D Hyphen-Minus `-`.
Member

As mentioned elsewhere, I would prefer a prefix to name content.

Thought: will there be classes or packages of attributes that can be plugged in, e.g.:

{$foo @its:translate=no @Xliff:canCopy=yes}

Examples: `@can-copy=no`, `@note-link=|https://...|`.
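
The hyphen rule can be sketched as a small name check. This is an illustrative Python example only: the set `SPEC_DEFINED` contains just the two names the proposal commits to, and the function name `classify_attribute_name` and its three result labels are assumptions.

```python
# Assumption: only `dir` and `locale` are named by the proposal as
# spec-defined; the final set is not settled in this document.
SPEC_DEFINED = frozenset({"dir", "locale"})


def classify_attribute_name(name: str) -> str:
    """Classify an attribute name (given without its `@` prefix)."""
    if "-" in name:
        # Custom names must contain U+002D, so they cannot collide with
        # hyphen-free names that a later spec version may define.
        return "custom"
    if name in SPEC_DEFINED:
        return "spec-defined"
    # Hyphen-free but unknown: kept free for future spec versions.
    return "reserved"
```

Under this rule `can-copy` and `note-link` are custom, `locale` is spec-defined, and a hyphen-free name that the spec does not yet define stays available for later versions.
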

Allow expression attributes to influence the formatting context,
but do not directly pass them to user-defined functions.
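
One possible reading of this rule, sketched in Python: attributes may override fields of the formatting context for a single expression, while the function itself receives only the context, the operand, and its options. The `FormattingContext` shape, the `CustomFunction` signature, and `format_expression` are hypothetical; the spec leaves the shape of the formatting context to implementations.

```python
from dataclasses import dataclass, replace
from typing import Callable, Mapping


@dataclass(frozen=True)
class FormattingContext:
    """Hypothetical context shape; the spec leaves it implementation-defined."""

    locales: tuple[str, ...]
    dir: str = "auto"


# A custom function sees the context, the operand, and its options only.
CustomFunction = Callable[[FormattingContext, object, Mapping[str, str]], str]


def format_expression(
    context: FormattingContext,
    operand: object,
    function: CustomFunction,
    options: Mapping[str, str],
    attributes: Mapping[str, str],
) -> str:
    overrides: dict[str, object] = {}
    if "locale" in attributes:
        # Simplification: treat the attribute value as a single language tag.
        overrides["locales"] = (attributes["locale"],)
    if "dir" in attributes:
        overrides["dir"] = attributes["dir"]
    # The override applies to this expression only; `context` is unchanged.
    local_context = replace(context, **overrides)
    # Attributes are deliberately not forwarded to the function.
    return function(local_context, operand, options)
```

A custom attribute such as `can-copy` has no effect here: it is neither a known context field nor visible to the function, which matches the position that function-specific behaviour belongs in function options.
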
Member

I would tighten this up. I would imagine that we'll add to the spec text similar to:

> An implementation MUST provide functions the ability to query the set of attributes that apply to the calling expression.

(I didn't say "an API" on purpose)

Collaborator Author

What you're proposing is actually the opposite of what I'm proposing. The formatting context as currently specified is an explicit set of fields, extensible by an implementation but not custom functions:

> Implementations MAY include additional fields in their formatting context.

So the proposal here would allow an implementation to modify the formatting context as it's applied to an expression, but it would not allow for a custom function definition "to query the set of attributes".

If it makes sense for an attribute to influence the behaviour of a custom function, and that attribute isn't an explicit part of the implementation's formatting context, why should its value be communicated via an expression attribute rather than a function option?

Member

A formatting context would be "a way to query the set of attributes"? I like your wording:

> Implementations MAY include additional fields in their formatting context.

Although my question would be: how do I ensure that my custom annotation reaches my function's runtime via the context?

Collaborator Author

That wording is from our current formatting spec, actually.

> how do I ensure that my custom annotation reaches my function's runtime via the context?

By using a function option to convey that information, rather than an expression attribute.

Member

My thinking here goes:

  • We are defining a feature "expression attributes"
  • We allow implementation or user defined attributes
  • How do these attributes reach the runtime?

The answer cannot be function options or I would have used a function option and not an expression attribute.

Apparently the answer is: the attributes appear in the formatting context and we can just say that?

Collaborator Author

In that situation, why not use a function option? If it's something affecting the behaviour of a single function and its value can be defined in the expression, why not use a function option for it?

> Apparently the answer is: the attributes appear in the formatting context and we can just say that?

We have not defined the shape of the formatting context with any specificity, and I don't think we should -- it's really an implementation detail. If an implementation wants to make its formatting context extensible in some way and for a custom expression attribute to affect it, I think that's very much an implementation detail and outside the scope of this spec.

Member

We don't have to define the shape of the formatting context with any specificity in order to say that implementations must provide access to any attribute values in a message's context. We already have at least one contextual value (the locale) that we require the implementation to ensure.

So I agree that we don't want to define exactly how a function gets access to the context. We might not even require that there be a way for it to be obtained. But we can still require that any contextual mechanism (using whatever shape) provides access to the attribute.

If we don't follow my thinking through, wouldn't we be introducing a feature that doesn't work in at least some implementations?

Collaborator Author

To rephrase and reiterate my own position on this and the currently proposed phrase, I do not think that functions should have access to the expression attributes or their values.

I do not think it would be a good idea for implementations to be required to provide access to any attribute values in a message's context. As we already have function options, I do not think that expression attributes should be duplicating their functionality. Allowing custom functions to access expression attributes would do exactly that.

Message formatting happens in some context that has properties such as the current locale, timezone, etc. that have an effect on the results. In the spec, we refer to this as the "formatting context", but do not exhaustively define its shape or properties. This is implementation-defined, and implementations may internally allow for its extensibility.

The general use case I have in mind for expression attributes that have a formatting impact is to provide a way to locally override formatting context values or error handling behaviour.

What is the use case for allowing custom functions to access custom attributes?

Member

> What is the use case for allowing custom functions to access custom attributes?

What is the use case for allowing custom attributes? If we allow them, don't we need to provision them? If we don't allow them, won't implementations re-invent them to suit their own needs?

On some level, all functions are custom functions. They are all described through the function registry and made available to message writers.


## Alternatives Considered

### Do not support expression attributes

If attributes are not explicitly defined, less information will be provided to translators.

Function options may be used as a workaround,
but each implementation and user will end up with different practices.

### Use function options, but with some suggested prefix like `_`

Slightly better than the previous alternative, but this still mixes attributes and options into the same namespace.

Otherwise unannotated expressions would need at least a no-op function to carry these options.

### Rely on semantic comments

These will be defined within the message resource spec,
so we introduce a dependence on that.

Referring to specific expressions from outside the message is hard,
especially if a similar expression is used in multiple variants.
Comment on lines +129 to +130
Member

Actually, it's easy--but it isn't machine readable. In addition, there is no guarantee that resource-level commentary will be preserved during the translation process or be operable to MT or CAT tools. This is particularly true if the translation process segments inside the MF pattern.

Collaborator Author

It's hard if there is more than one expression with the exact same syntax within a message.

For example, consider this message:

let $count = {$count :number}
match {$count}
when one {{$count} thing}
when * {{$count} things}

With that, how can I assign the equivalent of @example=2 to the {$count} in the * variant, but not the match selector or the placeholder in the one variant?

Member

> With that, how can I assign the equivalent of @example=2 to the {$count} in the * variant, but not the match selector or the placeholder in the one variant?

Actually you just did an admirable--but not machine readable--job of it:

@comment="@example=2 to the {$count} in the * variant, but not the match selector or the placeholder in the one variant"

My first sentence is more of a bit of humor, really. We're in violent agreement here. The problem I wanted to highlight is that the separation represents a probable loss of functionality as soon as you parse the message.


Comments should not influence the runtime behaviour of a formatter.

### Define `@attributes` as above, but do not namespace custom attributes

Later spec versions will not be able to define _any_ new attributes
without a danger of breaking implementations or users already using those names.