Question About Recent Changes in Long Turtle Serializer in v7.1.3 #3062

Open
lsharma-202 opened this issue Feb 6, 2025 · 1 comment

@lsharma-202

Hi,

With the recent changes in serializers/longturtle.py, it seems that maintaining an empty prefix in a .ttl file is no longer possible. For example, the following:

PREFIX : <http://example.org/resource>

:A  
    rdf:type owl:Ontology  
.  

is now reformatted as:

<http://example.org/resource#A>  
    rdf:type owl:Ontology  
.  

I’d like to understand the motivation behind converting this to an N-Triples-like format and what benefits this change brings.

Additionally, would it be possible to support both rdf:type and a declarations at the same time? There are cases where I need to keep rdf:type in the object position, for instance: sh:property rdf:type. However, the Long Turtle serializer automatically replaces all occurrences of rdf:type with a, while also removing the unused PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> declaration. This results in a BadSyntax error since rdf is no longer defined, even though sh:property rdf:type is still present after serialization.

I hope this clarifies the issue. Looking forward to your insights!

Best,
Lokesh

@edmondchuc
Contributor

Hi, thanks for reporting the issues you've found.

> With the recent changes in serializers/longturtle.py, it seems that maintaining an empty prefix in a .ttl file is no longer possible.

If this is the case, that's an oversight on my part, sorry! I'll have to look into this.

Likely, we just need to copy over the namespace bindings from the original store to the new store. This way, the original prefixes parsed in are preserved. Currently, we reassign the new store to the store variable, overwriting any prefixes defined in the original reading of the input. I'll have to validate all of this to be certain though.

store = to_canonical_graph(store)
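
A minimal sketch of that idea, not the actual fix: the helper name copy_namespace_bindings and the NamespaceAware protocol are hypothetical. It only assumes the two graph methods it calls, namespaces() and bind(), both of which rdflib.Graph provides.

```python
from typing import Iterable, Protocol, Tuple


class NamespaceAware(Protocol):
    """The two graph methods this sketch relies on (rdflib.Graph has both)."""

    def namespaces(self) -> Iterable[Tuple[str, object]]: ...

    def bind(self, prefix: str, namespace: object,
             override: bool = True, replace: bool = False) -> None: ...


def copy_namespace_bindings(source: NamespaceAware, target: NamespaceAware) -> None:
    """Re-bind every prefix known to `source` on `target` (hypothetical helper)."""
    for prefix, namespace in source.namespaces():
        # The empty prefix "" is copied like any other binding.
        target.bind(prefix, namespace, override=True, replace=True)
```

In the serializer this would mean keeping a reference to the original store before it is reassigned, then copying its bindings onto the canonicalized one.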

> I’d like to understand the motivation behind converting this to an N-Triples-like format and what benefits this change brings.

Please take a look at the original PR, #3008, for context.

The technical answer is that we canonicalize the store (produce deterministic blank nodes for the graph closure), serialize it to N-Triples and sort it, and then read the data into a new graph. We preserve the blank node identifiers with skolemization and maintain the order of triples. The result is a deterministic ingestion of the triples into the graph and thus a deterministic serialization of the long turtle format, which is ideal for version control systems like git.
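
The sorting step on its own can be sketched in plain Python. This is an illustration only, not the serializer's code: the function name sorted_ntriples is hypothetical, and it assumes the input is line-based N-Triples whose blank node labels have already been made deterministic by canonicalization.

```python
def sorted_ntriples(nt_text: str) -> str:
    """Return the N-Triples statements in a stable, sorted order."""
    statements = {
        line.strip()
        for line in nt_text.splitlines()
        # Skip blank lines and N-Triples comment lines.
        if line.strip() and not line.lstrip().startswith("#")
    }
    if not statements:
        return ""
    # A graph is a set of triples, so duplicates are dropped before sorting.
    return "\n".join(sorted(statements)) + "\n"
```

Two inputs that contain the same triples in a different order produce identical output, which is what keeps diffs small under version control.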

> Additionally, would it be possible to support both rdf:type and a declarations at the same time?

This sounds like a bug! We will definitely look into this.
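
For reference, the expected behaviour might be sketched as follows. This is a hypothetical helper, not rdflib code: render_term, its position argument, and the used_prefixes set are all assumptions made for illustration. The point is that the a keyword is only valid in predicate position, so rdf:type anywhere else must stay prefixed and keep the rdf: declaration alive.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE = RDF_NS + "type"


def render_term(iri: str, position: str, used_prefixes: set) -> str:
    """Render an IRI for Turtle output (hypothetical helper, not rdflib code)."""
    if iri == RDF_TYPE and position == "predicate":
        # Turtle allows the `a` keyword only in predicate position.
        return "a"
    if iri.startswith(RDF_NS):
        # Record the prefix so its PREFIX declaration is not dropped as unused.
        used_prefixes.add("rdf")
        return "rdf:" + iri[len(RDF_NS):]
    return f"<{iri}>"
```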
