-
-
Notifications
You must be signed in to change notification settings - Fork 294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Behavior around failed $ref resolution is not defined #1276
Comments
@gregsdennis I see what you're doing with the {
"$ref": "https://uresolvable.uri/schema"
} is under-specified. My vote is for runtime error (as you know but folks who weren't on Slack wouldn't). |
Right, my example would either pass validation (because of the Given the two optional behaviors, yours would either fail validation or halt. |
The way I see it, if the referenced schema doesn't exist, then the result of validation of an instance against the schema isn't true or false, it's indeterminate. We don't have the referenced schemas, so we don't know if it would have resulted in true or false. Therefore, the result of validating a schema against an instances shouldn't be true or false because it contains an indeterminate result. It doesn't necessarily need to be a runtime error, but it needs to be a result other than true or false. Generally that means throwing an error, but I don't think it must be. If the implementation has another way of representing an indeterminate result, that would be fine too. |
I think the issue is that the output in the specification assumes that evaluation completed successfully, so it doesn't provision a way to represent an indeterminate result. Maybe that's something that needs consideration. Regardless, it seems we agree that halting evaluation is the right action; just how to report it to the consumer is the remaining question. Perhaps this is fine to leave as implementation-defined (as much as that's bitten us before). This leaves the door open for implementations to decide in their context whether an exception is necessary vs some other mechanism. Either way, they should document it. |
@gregsdennis I think this gets into some really important design questions that we've touched on in our IM discussions. my mental model for JSON Schema evaluation is something like: evaluator = JSONSchemaImplementationThing()
evaluator.load(schema1) # might raise an exception if schema cannot be scanned for identifiers, etc.
evaluator.load(schema2, retrieval_uri=uri2, media_type_params={schema: meta_schema2)
result = evaluator.evaluate(instance, schema_uri_1) # might raise an exception for unresolvable reference or whatever
if result.valid:
annotations = result.output('basic')
do_stuff(annotations)
else:
errors = result.output('hierarchical')
report_errors(errors) So I would not expect the output format to handle the indeterminate result. I've mentioned before that I'm wary of making the output structure the One True Way to interact with schema evaluation. It is an interoperable report of the evaluation, and that's important, but we shouldn't lose the flexibility of native programming language idioms either. As noted in the last community call, generating a complete output structure can be very expensive in both performance and memory consumption. In many cases, a more interactive, idiomatic approach would be preferred. @Julian has a really awesome exception interface for working with errors, including a programmatically queryable tree interface. (I have not checked to see if the way you invoke his implementation is similar to what I wrote above, but it fits in the sense that there's a thing you can interact with after evaluation). There are places where we need the interoperability: For example, a microservice architecture where one service runs JSON Schema in Python and another in .NET, but it's the front-end code in JavaScript that needs to interpret the results. But we don't want to get so focused on "output" that we preclude other modes of interaction. Even in the microservices environment, a runtime problem such as an unresolved reference might be better handled by returning an HTTP error in |
(let me just note that I am aware that @gregsdennis was not suggesting that output was the only way to handle this, and I should have made that more clear before talking about possible alternatives that I wanted to get into this conversation) |
As discussed elsewhere, the output as defined by the specification isn't necessarily what an application (e.g. a microservice API) need return; it's what is returned from the implementation to the application. It's perfectly fine for the application to translate that into |
I was making a distinction between using the full error output (in which case I would probably embed it in an But I can see how runtime errors could fit into the output format given that the location information is helpful either way, in which case we'd just need a 3rd value ( I am absolutely not against using output for this, I'm just trying to figure out where we want interoperability (which comes at an efficiency cost) vs where we want efficiency. |
I agree that output (at least as it currently exists) isn't right for this kind of error. I think an implementation should use whatever native language support exists. I've made an attempt at covering this in my data v2 PR (section 4.4). This is what spawned the question in the first place. |
So we've agreed that unresolvable reference failures should be handled in-language. @Julian do you think there's a way we can test this with the suite? |
We've had this come up a few times on the test suite, most recently here and here. (n.b. both of those cover slightly different variations on the theme. I'm not sure it's worth your time to read through them, though obviously feel free to if you'd like). I'm OK with us adding another folder and sticking some of these in them. The "issue" if there is one is that there isn't really anything interesting we can say about the result (i.e. having an equivalent of the So I think the best first step is simply to have a file with an array of the schemas and descriptions themselves, and no assertion. But yeah feel free to send over a small PR (or otherwise stick it in an issue). |
@Julian I'm not sure I understand the "no assertion" part. Why wouldn't we have a value (e.g. |
|
So people writing test executors would write something like: if 'valid' in test_case:
# run validation and check against the valid field
else:
# run valid and fail the test if there's no error |
I think it'll be even easier to match based on the directory and we not mix test types between the two -- so errors/ contains tests with different structure. |
(and then no existing runners break, since they're not looking in the new directory anyhow) |
Oh I see what you mean, that wasn't clear. Could you file the issue about the proposed new directory structures soon, please? Directory-based behavior keeps coming up and there's nothing to point to for context about where you're going with it right now. |
I agree, an "unreconcilable" error should be produced. This is what I do:
Essentially, a normal error is produced for the appropriate keyword+dataLocation, and then an exception is thrown to be caught by the top level of evaluation (before the recursive processing of the schema begins), so it cannot be paired with an The result object, which is returned from the evaluation call and contains all the error objects, also has an FWIW, the other ways in which such exceptional errors can be generated _at runtime are:
|
@karenetheridge Does your implementation halt execution when that's encountered, or do you continue and just report it similar to a validation error? Unknown vocabs should only be an error if the value for that vocab is Unknown media types (and the |
This is no longer a concern in 2020-12, as these keywords are just annotations.
This, and certain situations involving |
Evaluation is halted, in the sense that an exception is thrown to the top of the evaluation scope, but an exception is not thrown right back to the caller of the evaluator. The evaluation returns a result object as normal, with the exception represented as an error. (Any errors encountered earlier in the evaluation, for example from a sibling
I think you mean
|
Yes, thanks. |
(By "is not defined" I don't mean that it's considered and left as "undefined" behavior; I mean it's not addressed at all.)
In Slack, we had a short discussion around the behavior of this schema:
An implementation could handle this in one of two ways:
The discussion landed on the latter: produce a runtime error, halting evaluation.
We need to add language to spec to properly prescribe this behavior.
For reference:
$ref
-ing to non-schemasThe text was updated successfully, but these errors were encountered: