Should node attribute key validation be recursive #3863
Comments
Pinging @giovannipizzi and @greschd for discussion
Yes, I think the validation should be recursive. As a side note: the behavior is different again for cases where the JSON serializer doesn't automatically convert the key to a string. The following code raises an error when storing the node:

    from aiida.orm import Dict

    class Test:
        def __hash__(self):
            return 0

    d = Dict(dict={'a': {Test(): 2}})  # works
    d.store()  # "TypeError: keys must be a string" from json/encoder.py
Your example is the expected behavior though. I deferred the validation, which happens through serialization, to the moment the node is stored. This is for efficiency reasons: as long as the node is not stored, attributes can be freely mutated. Validating the whole set of attributes on every change is not efficient, and validating only the changed attribute may not be sufficient. We therefore chose to push validation down to just before storing the node.
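A minimal sketch of the deferred-validation pattern described above, using a hypothetical `ToyNode` class rather than the actual AiiDA implementation. Attributes can be mutated cheaply while the node is unstored, and the full set is checked only once, in `store()`. The names `ToyNode`, `set_attribute`, and `store` are illustrative assumptions, and the check is approximated here by a plain `json.dumps` call.

```python
import json


class ToyNode:
    """Hypothetical stand-in for a node; this is not the AiiDA API."""

    def __init__(self):
        self._attributes = {}
        self._stored = False

    def set_attribute(self, key, value):
        # Only the top-level key is checked on mutation; the value is not inspected.
        if self._stored:
            raise RuntimeError('stored nodes are immutable')
        if not isinstance(key, str):
            raise TypeError('attribute keys must be strings')
        self._attributes[key] = value

    def store(self):
        # Full validation happens once, here: serialization fails on invalid content.
        json.dumps(self._attributes)
        self._stored = True
        return self


node = ToyNode()
node.set_attribute('a', {object(): 2})  # accepted: nested keys are not checked yet

try:
    node.store()
    store_failed = False
except TypeError:
    # The invalid nested key is only detected at store time.
    store_failed = True
```

The trade-off is the one raised in the thread: mutation stays cheap, but an invalid nested value goes unnoticed until `store()` is called.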
Ah, I see. Yeah, the problematic case is when it is just mutated silently, because that can go undetected.
Indeed, we moved validation to the point of storing. I am pinging @ltalirz because we changed the validation code as it was very slow, so before we merge any change we should double check the performance impact.
See also #733 for some reference on the design decisions.
If I remember correctly, the biggest slowdown was that the entire set of attributes was cleaned in its entirety every time a single attribute was changed. When building up
I agree that validation should be recursive upon store (I think there is still the flag to skip validation entirely if maximum performance is needed, right?). The original performance issue was noticed when looping over a few hundred structure files to import them into StructureData. |
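As a rough illustration of what recursive key validation at store time could look like, here is a sketch of a standalone helper. The function name `validate_keys_recursive` and its `path` argument are hypothetical and not part of AiiDA; the helper simply walks nested mappings and sequences and raises on the first non-string key.

```python
from collections.abc import Mapping


def validate_keys_recursive(value, path='attributes'):
    """Raise TypeError if any mapping nested in `value` has a non-string key."""
    if isinstance(value, Mapping):
        for key, sub_value in value.items():
            if not isinstance(key, str):
                raise TypeError(f'non-string key {key!r} found in {path}')
            validate_keys_recursive(sub_value, f'{path}.{key}')
    elif isinstance(value, (list, tuple)):
        # Dictionaries may also be nested inside sequences.
        for index, item in enumerate(value):
            validate_keys_recursive(item, f'{path}[{index}]')


validate_keys_recursive({'a': {'b': [1, {'c': 2}]}})  # passes silently

try:
    validate_keys_recursive({'a': {0: 1}})
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Running such a check once per `store()` call, rather than on every attribute change, keeps the cost proportional to the size of the attributes and avoids the repeated full cleaning mentioned above.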
Currently there is a validation for attribute keys to verify that they are strings. For example the following raises:
However, nested keys are not validated and are simply converted to strings once the node is stored and the entire attributes dictionary is serialized. Compare the following two examples:
Storing in the second case turned the integer key 0 into the string '0'.
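The conversion can be reproduced with the standard `json` module alone. This is only a sketch: it assumes the attributes are serialized with an encoder that behaves like Python's built-in one (the traceback quoted in the comments points at `json/encoder.py`), and it does not use AiiDA itself.

```python
import json

attributes = {'a': {0: 1}}
reloaded = json.loads(json.dumps(attributes))

# The nested integer key has silently become a string after the round trip.
keys_before = list(attributes['a'])
keys_after = list(reloaded['a'])

# Distinct keys can also collapse into a single key.
mixed = {0: 'int', '0': 'str'}
collapsed = json.loads(json.dumps(mixed))
```

Here `keys_before` is `[0]` while `keys_after` is `['0']`, and `collapsed` holds only the single key `'0'`, which is why silent conversion of nested keys can lose information.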