Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: optional elements are not appearing as optional when children of another element #410

Open
ethindp opened this issue Jul 29, 2024 · 13 comments
Labels
enhancement New feature or request

Comments

@ethindp
Copy link

ethindp commented Jul 29, 2024

I have this (RelaxNG compact):

muclient = element muclient {
    (include_* & 
     (plugin? & world? & triggers? & aliases? & timers? & macros? & variables? & colours? & keypad? & printing?)
    )
}

As an XSD it renders as (via trang):

  <xs:element name="muclient">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="include"/>
        <xs:choice>
          <xs:element ref="plugin"/>
          <xs:element ref="world"/>
          <xs:element ref="triggers"/>
          <xs:element ref="aliases"/>
          <xs:element ref="timers"/>
          <xs:element ref="macros"/>
          <xs:element ref="variables"/>
          <xs:element ref="colours"/>
          <xs:element ref="keypad"/>
          <xs:element ref="printing"/>
        </xs:choice>
      </xs:choice>
    </xs:complexType>
  </xs:element>

My hope was that it would allow elements to be specified in any order, but either this schema is broken or it doesn't look like that's how this works. (This is my first time ever writing an XML schema, so...) Is this a bug in xmlschema, or a bug in my XSD?

@ethindp
Copy link
Author

ethindp commented Jul 30, 2024

Okay, so I think this might be a bug with xmlschema after further tinkering. If I change it to:

  <xs:element name="muclient">
    <xs:complexType>
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="include"/>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element ref="plugin"/>
          <xs:element ref="world"/>
          <xs:element ref="triggers"/>
          <xs:element ref="aliases"/>
          <xs:element ref="timers"/>
          <xs:element ref="macros"/>
          <xs:element ref="variables"/>
          <xs:element ref="colours"/>
          <xs:element ref="keypad"/>
          <xs:element ref="printing"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

It still shows as having 1 minimum and maximum occurrence when evaluated:

>>> schema=xmlschema.XMLSchema11("..\\..\\mushclient-schemas\\mushclient.xsd")
>>> list(schema.elements["muclient"].iterchildren())
[Xsd11Element(ref='include', occurs=[0, None]), Xsd11Element(ref='plugin', occurs=[1, 1]), Xsd11Element(ref='world', occurs=[1, 1]), Xsd11Element(ref='triggers', occurs=[1, 1]), Xsd11Element(ref='aliases', occurs=[1, 1]), Xsd11Element(ref='timers', occurs=[1, 1]), Xsd11Element(ref='macros', occurs=[1, 1]), Xsd11Element(ref='variables', occurs=[1, 1]), Xsd11Element(ref='colours', occurs=[1, 1]), Xsd11Element(ref='keypad', occurs=[1, 1]), Xsd11Element(ref='printing', occurs=[1, 1])]

I could be wrong but this definitely looks like a bug to me.

@brunato
Copy link
Member

brunato commented Jul 30, 2024

Hi,

XsdElement.iterchildren() is not intended for validating the model (an instance of ModelVisitor is used for this task).

I will check the problem using that sample schema as a base:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="muclient">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="include"/>
        <xs:choice>
          <xs:element name="plugin"/>
          <xs:element name="world"/>
          <xs:element name="triggers"/>
          <xs:element name="aliases"/>
          <xs:element name="timers"/>
          <xs:element name="macros"/>
          <xs:element name="variables"/>
          <xs:element name="colours"/>
          <xs:element name="keypad"/>
          <xs:element name="printing"/>
        </xs:choice>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>

@ethindp
Copy link
Author

ethindp commented Jul 30, 2024

I didn't know that, good to know. I raised this as a bug because I need to be able to tell what elements are optional or not for code generation so I don't go assuming all elements are optional and then someone uses my tool and they run an XML document that is missing a (required) element and it works anyway. I thought that is_emptiable/is_single/is_multiple would do the trick here, but something is being mis-parsed. I have attached the full schema (what I posted was a segment). The .txt extension is just to make GH happy.
mushclient.xsd.txt

@ethindp
Copy link
Author

ethindp commented Aug 4, 2024

Any update on this?

@brunato
Copy link
Member

brunato commented Aug 6, 2024

Hi,
nothing wrong with validation here, with your schema all these instances are valid:

<?xml version="1.0" encoding="utf-8" ?>
<muclient xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="mushclient.xsd">
</muclient>

<?xml version="1.0" encoding="utf-8" ?>
<muclient xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="mushclient.xsd">
  <include name="foo"/>
</muclient>

<?xml version="1.0" encoding="utf-8" ?>
<muclient xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="mushclient.xsd">
  <include name="foo"/>
  <world port="80" id="1" name="http" site="example.test"/>
</muclient>

<?xml version="1.0" encoding="utf-8" ?>
<muclient xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="mushclient.xsd">
  <world port="80" id="1" name="http" site="example.test"/>
</muclient>

The only general mode for checking if an element is optional or not in a model group is to use a ModelVisitor instance and run on a children elements. This is done for XsdGroup validation thanks to a contribution that added a reordering method based on ModelVisitor. So probably an helper based on ModelVisitor can be built for checking what you ask for, but it's an enhancement, not a bug (not yet ... 😄 ).

As a starting point for doing this is to create a ModelVisitor instance from XSD group and then use its property expected to check which is the possible content that can be pushed, remembering that each choice can involve the following one. This content generation might be very customized on the needs of the user. This is the main reason that in the past drove me to refuse to add this feature to the package.

@brunato
Copy link
Member

brunato commented Aug 6, 2024

Anyway a generic method (e.g. XMLSchema.create_xml_data()) could be an option for the future taking count at least of two things:

  • maximum occurrences of elements (otherwise unbounded particles will be added without an ending)
  • a choice criteria of next element using an helper function provided by the user (for default the choice could be random between the expected elements)

@brunato
Copy link
Member

brunato commented Aug 6, 2024

An example to show how the model visitor can be used for that purpose:

>>> import xmlschema
>>> from xmlschema.validators import ModelVisitor
>>> from random import randint
>>> 
>>> schema = xmlschema.XMLSchema("mushclient.xsd")
>>> model = ModelVisitor(schema.elements['muclient'].type.content)
>>> 
>>> model.element
XsdElement(ref='include', occurs=[0, None])
>>>
>>> model.expected
[]
>>> while model.element is not None:
...     expected = model.expected
...     if not expected:
...         children.append(model.element)
...     else:
...         k = randint(1, len(expected)) - 1
...         children.append(expected[k])
...     if len(children) > 10:
...         break
...     assert not list(model.advance(True))
... 
>>> children
[XsdElement(ref='include', occurs=[0, None]), XsdElement(ref='colours', occurs=[1, 1]), XsdElement(ref='macros', occurs=[1, 1]), XsdElement(ref='colours', occurs=[1, 1]), XsdElement(ref='printing', occurs=[1, 1]), XsdElement(ref='colours', occurs=[1, 1]), XsdElement(ref='macros', occurs=[1, 1]), XsdElement(ref='variables', occurs=[1, 1]), XsdElement(ref='keypad', occurs=[1, 1]), XsdElement(ref='macros', occurs=[1, 1]), XsdElement(ref='world', occurs=[1, 1])]
>>> 

Anyway for letting this usable for all the models the ModelVisitor class needs an enhancement (e.g. look-ahead method that verify if a model can be stopped at a certain point).

@ethindp
Copy link
Author

ethindp commented Aug 6, 2024

@brunato Ah, okay. Is there a way I could use this from within my Ginja2 template, or would I need to write a custom filter?
To be clear: I'm not talking about validation. I'm talking about code generation via xmlschema.extras.codegen. That's where I'm trying to figure out if an element is optional or required, because I need to know when to complain about an element not being present from within the template and when to simply move on if the element is unavailable. I'm sorry if I confused you.

@brunato
Copy link
Member

brunato commented Aug 7, 2024

You have to write a custom filter. The current version of ModelVisitor is not fully suitable for this role (it was build for validation of an existing content, not for generating content). I will try to extend it's applicability, adding other methods and a wrapper, but I don't know yet if that will be included in a future release.

If you want to generate a valid content (a sequence of child elements) there is no way out of using a model validator that keeps track of the occurrences of the particles (elements, wildcards and groups) and their status (groups stack).
Otherwise the generated content could be invalid.

@ethindp
Copy link
Author

ethindp commented Aug 10, 2024

@brunato Yeah, it might be worth either extending the library with this functionality in the model visitor or adding this filter. I'm uncertain how to do this other than in the generator __init__ itself, where it builds a list of optional and required elements and stuff and then the filter operates based on that. It's doable but... A bit nasty, I'd say.

@ethindp
Copy link
Author

ethindp commented Aug 10, 2024

Also, it looks as though the model visitor still doesn't really detect optional elements. It sees the include element as being optional, but not any of the others, according to both the output you wrote and if I manually enumerate the model. The world one is required, which I need to affirm in the RelaxNG document, but the rest are optional and should, in theory, have occurrences of (0, 1) (0 to 1 occurrence). I definitely can't tell if this is just a limitation of the library or something else.

@brunato
Copy link
Member

brunato commented Aug 14, 2024

For checking if a particle of a model group is optional I will add a new method to XsdGroup objects:

    def is_optional(self, particle: ModelParticleType) -> bool:
        """
        Returns `True` if a particle can be optional in the model. This a raw check,
        because the optionality can depend on the presence and the position of other
        elements, that can be checked only using a `ModelVisitor` instance. Raises an
        `XMLSchemaValueError` if the particle is not part of the model group.
        """
        if self.max_occurs == 0:
            raise XMLSchemaValueError("the model group is empty")

        groups = [self]
        iterators: List[Iterator[ModelParticleType]] = []
        particles = iter(self)

        while True:
            for item in particles:
                if item is particle:
                    if item.min_occurs == 0:
                        return True

                    for group in reversed(groups):
                        if group.min_occurs == 0:
                            return True
                        elif group.model == 'choice' and len(group) > 1:
                            return True
                    else:
                        return False

                if isinstance(item, XsdGroup):
                    if item.max_occurs == 0:
                        continue

                    groups.append(item)
                    iterators.append(particles)
                    particles = iter(item.content)
                    if len(iterators) > limits.MAX_MODEL_DEPTH:
                        raise XMLSchemaModelDepthError(self)
                    break
            else:
                try:
                    groups.pop()
                    particles = iterators.pop()
                except IndexError:
                    msg = "The provided particle is not part of the model group"
                    raise XMLSchemaValueError(msg)

brunato added a commit that referenced this issue Sep 10, 2024
  - Related to issue #410: a check of optionality of particles in a
    model group.
@brunato
Copy link
Member

brunato commented Sep 19, 2024

Hi @ethindp,
v3.4 releases include extensions for ModelVisitor that should accomplish the basic tools for building a content generator (in the manner that one needs).
If you are still interested on that you can checkout these extensions and then eventually give a feedback.

thank you

@brunato brunato added the enhancement New feature or request label Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants