Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filters with uri-prefixed attribute names don't work #116

Closed
gsar opened this issue Apr 6, 2024 · 7 comments
Closed

filters with uri-prefixed attribute names don't work #116

gsar opened this issue Apr 6, 2024 · 7 comments

Comments

@gsar
Copy link
Contributor

gsar commented Apr 6, 2024

The ABNF in the SCIM RFC requires filters to allow attributes to be fully-qualified (i.e. spelled with their URI prefixes), but that doesn't seem to work in filter expressions like it does within patch ops. Only the unqualified form seems to work in filters, but SCIM implementations like Microsoft Entra appear to only support filters that have the fully qualified name. Potentially an easy fix.

curl -H 'Authorization: Bearer deadbeef' -X GET 'https://example.com/scim/Users?filter=urn:ietf:params:scim:schemas:extension:enterprise:2.0:User:employeeId+eq+%22gsar%22' | jq .
{
  "schemas": [
    "urn:ietf:params:scim:api:messages:2.0:Error"
  ],
  "detail": "The specified filter syntax was invalid, or the specified attribute and filter comparison combination is not supported:\nCan't lex input here ':ietf:params:scim:schemas:extension:enterprise:2.0:User:employeeId eq \"gsar\"'",
  "status": "400",
  "scimType": "invalidFilter"
}
@pond
Copy link
Member

pond commented Jun 13, 2024

Looking into it...

It's wild how every person describing an issue with Entra has totally different payloads coming from it. There seem to be no two the same. How on earth Microsoft have managed to apparently independently reimplement SCIM over and over in different ways, and what it is that makes them "choose" which one of those to invoke, is anyone's guess. It's not like Microsoft don't have prior examples of doing weird stuff but this is definitely towards the top of the "most strange" list!

@pond
Copy link
Member

pond commented Jun 13, 2024

The square bracket filter form of this is a total pig. Unless you know what schema IDs to expect, you can't figure out what's what. Consider:

'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User:employeeId eq "gsar"
'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User[employeeId eq "gsar"]

Now, a bit of background info first - Scimitar has a widespread internal limitation that 'flattens' (i.e. removes) schema IDs everywhere, which is fine - so long as no two extensions have the same attribute name, of course, since by removing the schema ID during e.g. PATCH path traversal we lose what amounted to the namespace that the schema ID provided. For the standardised SCIM extensions that's certainly OK as far as I'm aware. Fixing this limitation would be a tonne of work. to overhaul large amounts of how SCIM attribute maps are defined and how things like the PATCH parser works (but would be a good idea to do, one day!). If that happened, I'd consider it enough to bump the gem's major version number and expect client code to require significant updates to match Scimitar's new arising API requirements.

In v1/v2, though, the way we can handle that query filter above is to just remove the schema IDs from it. But how do we know which part is schema ID, and which part is attribute name? If we didn't know what schema IDs are "valid" in the query, all we can do is parse out the URN and assume - in the above case as an example - that if we hit a "[", then the thing after is an attribute and everything before is the schema ID; or if we hit a space before seeing a [, then the last colon in the URN separates the schema ID from attribute.

Given that, we can remove what we now recognise as the schema ID, yielding:

'employeeId eq "gsar"
'[employeeId eq "gsar"]

...and post-process that last example to replace the square brackets with parentheses (so that we don't need to worry about any subsequent operators that might be present in the filter - notably, things like or).

So far so good... But SCIM being SCIM, it's super complex with 50 ways to do everything instead of keeping it a lean spec with just one way! This means we can also split an attribute path with a "[". We see that done routinely for things like this:

emails.type eq "work"
emails[type eq "work"]

For the schema ID equivalent of that, let's say we wanted to match a manager's displayName. There are at least three ways to do that.

'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User:manager.displayName eq "gsar"
'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User:manager[displayName eq "gsar"]
'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User[manager.displayName eq "gsar]"

Without knowing how much of the colon-separated part is schema ID and which bit starts the attribute path, then the first and third examples above work with the logic described for the employeeId case. But of course, that middle one is broken. Our simple parser could not possibly know that the schema ID was not supposed to be urn:ietf:params:scim:schemas:extension:enterprise:2.0:User:manager, and would thus normalise the form to an incorrect displayName eq "gsar" instead of the correct manager.displayName eq "gsar".

Currently the only thing I can thing of - bearing in mind that without breaking API changes, Scimitar's GET request filter parser doesn't know what SCIM resource class is being queried - is to enumerate known schema extensions, grab the IDs, and use those. If a query came in with an unrecognised schema ID, it'd be left in place and the filter wouldn't find anything (or could even raise an error) but we can assume that if it was trying to query attributes in schemas identified by IDs which the Scimitar application knows nothing about, it was never going to successfully find anything anyway.

@pond
Copy link
Member

pond commented Jun 13, 2024

...and I've added #130 to track that, along with including a README.md  note about the limitation in the docs that talk about extension schemas, within the PR I'm building now.

@pond
Copy link
Member

pond commented Jun 13, 2024

Right, with a bit of luck, #131 should do the trick, though I don't have a real-world Entra instance to test it with.

@pond
Copy link
Member

pond commented Jun 13, 2024

Fix released in https://rubygems.org/gems/scimitar/versions/2.8.0. I'll backport to V1 within a week or two.

@pond pond closed this as completed Jun 13, 2024
@gsar
Copy link
Contributor Author

gsar commented Jun 13, 2024

@pond a couple of notes / observations:

  • microsoft appears to have a bunch of fix featureflags they will enable selectively for customers who run into their SCIM compliance issues. it beats me why they don't release these things for general availability to keep everyone on the same page (perhaps it has something to do with the fixes "breaking" people who may be relying on the prior behaviors). just noting that this could be contributing to the variety of behaviors people report.
  • for handling the bracketed filters (schema[arbitrary filter] form), it can be simpler to normalize that to the schemaless non-bracketed form. so schema[employeeId eq "stuff"] just becomes schema:employeeId eq "stuff". i haven't run into this usage by microsoft, so it would be fine to not bother with supporting this at this stage.

thanks for working on these fixes, i'll try them out soon and report back in a week or two.

@pond
Copy link
Member

pond commented Jun 27, 2024

Now available for Rails 6 via v1.10.0 too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants