Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify pattern resolution behavior for patterns with syntax errors #462

Closed
wants to merge 1 commit into from

Conversation

catamorphism
Copy link
Collaborator

@catamorphism catamorphism commented Aug 29, 2023

While testing my ICU4C implementation on the JSON tests, I noticed a test case that reflects unspecified behavior:

    "src": "no braces",
    "exp": "{no braces}",

(See line 235 of test-messages.json).

This pull request adds spec text stating that patterns that do not begin with a '{' format to their contents enclosed in '{'/'}'.

Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a bug in the test, and the spec should not need to be changed to accommodate it. This is a syntax error, and its expected behaviour is already covered:

If the message being formatted has any Syntax or Data Model errors, the result of pattern selection MUST be a pattern resolving to a single fallback value using the message's fallback string defined in the formatting context or if this is not available or empty, the U+FFFD REPLACEMENT CHARACTER .

Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the right idea, but in the wrong place. See my longer comment below.

Comment on lines +240 to +251
### Messages without Selectors

When a _message_ does not contain a _match_ construct,
and does not begin with a U+007B LEFT CURLY BRACKET `{`,
the result of pattern selection MUST be the concatenation of
a U+007B LEFT CURLY BRACKET `{`,
the contents of the message as a string,
and a U+007D RIGHT CURLY BRACKET `}`.

> For example, the following message:
> `no braces`
> would format to a string as `{no braces}`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is "pattern selection". This is a parse error, since the message is not, well, a message. What's more, the message "no braces {some braces}" is not a message.

I think the thing that is missing in formatting.md is the first step, which is: parsing. This should result in an syntax error. We could then go on to define what happens to the string that wants to be a message.

To be honest, I would not add the curly brackets. If you give me a string and it produces a syntax error, I should probably give you back the string as you gave it to me. If you thought it should be a valid message, you can debug it. If you are a runtime process, you can just display it (as we didn't damage it).


Note well: I argued long and hard that a plain old string with no placeholders should be a valid message. This is the old argument code-mode vs. string-mode. I am not reopening this, but we're going to tangle with the fact that you have to grope strings before feeding them to MF2 as a result.

@aphillips
Copy link
Member

@eemeli noted:

I think that's a bug in the test, and the spec should not need to be changed to accommodate it. This is a syntax error, and its expected behaviour is already covered:

If the message being formatted has any Syntax or Data Model errors, the result of pattern selection MUST be a pattern resolving to a single fallback value using the message's fallback string defined in the formatting context or if this is not available or empty, the U+FFFD REPLACEMENT CHARACTER .

I agree with your comment, but the structure of formatting.md is:

  • resolve expressions
  • select pattern
  • format results

So the quoted text is buried inside "select pattern", whereas, as I noted, probably we want to parse the message before resolving declarations or trying to select the pattern.

Does that make sense?

@eemeli
Copy link
Collaborator

eemeli commented Aug 30, 2023

I agree with your comment, but the structure of formatting.md is:

  • resolve expressions
  • select pattern
  • format results

So the quoted text is buried inside "select pattern", whereas, as I noted, probably we want to parse the message before resolving declarations or trying to select the pattern.

The text I quoted is from the fourth top-level part of formatting.md, "Error Handling".

@aphillips
Copy link
Member

@eemeli

The text I quoted is from the fourth top-level part of formatting.md, "Error Handling".

Okay. But do you agree that we're missing the parsing?

@eemeli
Copy link
Collaborator

eemeli commented Aug 31, 2023

I agree that we do not have an explicit definition of parsing besides the description of the syntax, but I'm not really sure what more we need to say about it. If we do need to say more about parsing, that probably ought to go under Syntax rather than Formatting, as parsing is required for other message actions as well.

The introduction of our formatting section currently starts with this, which I think appropriately defines its scope:

This document defines the behaviour of a MessageFormat 2.0 implementation when formatting a message for display in a user interface, or for some later processing.

To start, we presume that a message has either been parsed from its syntax or created from a data model description. If this construction has encountered any Syntax or Data Model errors, their handling during formatting is specified here as well.

As stated there, the handling of syntax errors during formatting is defined, and that may indeed be found under Error Handling.

@eemeli
Copy link
Collaborator

eemeli commented Oct 8, 2023

@catamorphism Are you still looking to advance this PR, or should it be considered abandoned and closed?

@catamorphism
Copy link
Collaborator Author

catamorphism commented Oct 31, 2023

I think that's a bug in the test, and the spec should not need to be changed to accommodate it. This is a syntax error, and its expected behaviour is already covered:

If the message being formatted has any Syntax or Data Model errors, the result of pattern selection MUST be a pattern resolving to a single fallback value using the message's fallback string defined in the formatting context or if this is not available or empty, the U+FFFD REPLACEMENT CHARACTER .

What should the expected result be in the test? Currently the expected result is {no braces}.

@catamorphism
Copy link
Collaborator Author

@catamorphism Are you still looking to advance this PR, or should it be considered abandoned and closed?

I just returned from leave today. Yes, I think this question still requires clarification, if only by changing the test (see previous comments).

@eemeli
Copy link
Collaborator

eemeli commented Oct 31, 2023

What should the expected result be in the test? Currently the expected result is {no braces}.

While you were out, we kinda went and started to do a number on the syntax, so for this test the expected result is now just no braces, as the braces are no longer required at least for simple messages like this.

In general, for syntax and data model errors the expected result is {�} -- which is also the format's logo now.

@catamorphism
Copy link
Collaborator Author

I think this PR is obsolete now due to syntax changes. If I notice related issues I'll start a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants