free_to_read as boolean or period #1

Closed
efc opened this issue Feb 5, 2015 · 21 comments

@efc

efc commented Feb 5, 2015

The Access License and Indicators doc (NISO RP-22-2015) describes free_to_read as being valid either as a boolean (assumed to be true if it is present) or a period (using start_date and/or end_date attributes to bound the period). Section 3.1 (page 5) states, "The absence of both a start and end date indicates a permanent state of free-to-read access."

The examples currently only demonstrate free_to_read as a period, not the case where the period is missing and assumed permanent. Don't we have to cover both cases?

I am very new to JSON-LD, so please be skeptical of my input. I am learning this as I go along.

It seems to me that in order to accomplish what the ALI recommendations suggest in JSON we would have to also allow for an element that could carry the boolean meaning of free_to_read. This might look like:

"free_to_read": {
    "permanent": true
}

The period case would look like:

"free_to_read": {
    "start_date": "2014-12-02",
    "end_date": "2015-12-01"
}

Of course, there would then be no way to avoid, in JSON, the internally inconsistent possibility of:

"free_to_read": {
    "permanent": true,
    "start_date": "2014-12-02",
    "end_date": "2015-12-01"
}

This result cannot occur in the XML representation since the "permanent" option is assumed from the absence of any period attributes in the XML element.
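
A consumer could at least detect that combination. The following is a minimal Python sketch, assuming the hypothetical "permanent" key proposed above; the function name find_conflicts is illustrative only and is not part of any ALI tooling.

```python
def find_conflicts(free_to_read):
    """Return a list of problems found in a free_to_read object (a dict)."""
    problems = []
    # A period is present if either bound is given.
    has_period = "start_date" in free_to_read or "end_date" in free_to_read
    # "permanent" is the hypothetical boolean key discussed in this issue.
    if free_to_read.get("permanent") is True and has_period:
        problems.append("permanent is true but a period is also given")
    return problems


print(find_conflicts({"permanent": True}))
print(find_conflicts({
    "permanent": True,
    "start_date": "2014-12-02",
    "end_date": "2015-12-01",
}))
```

Detection alone does not say which statement should win; that is a policy decision the recommendation would still have to make.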

Would it be possible to specify the JSON of free_to_read such that an empty dictionary would be legal and the implementors should assume it to also mean "permanent?" This might look like:

"free_to_read": {}

Once we know what the boolean case should look like, we need to define it clearly in JSON-LD. I'm not sure what is the right way to do that. My very limited experience would lead me to believe the niso-ali-1.0-json suggested would be sufficient for the implicit boolean case, but that we might need something like this for the explicit case:

{
    "@context": {
        "@vocab": "http://www.niso.org/schemas/ali/1.0/jsonld.json#",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "free_to_read": { "@type": "@id" },
        "license_ref": { "@type": "@id" },
        "uri": { "@type": "@id" },
        "permanent": { "@type": "xsd:boolean" },
        "start_date": { "@type": "xsd:date" },
        "end_date": { "@type": "xsd:date" }
    }
}

So I guess that this boils down to: can we have an implicit boolean state in the JSON representation which is defined as passing an empty dictionary?

@anarchivist

Hi @efc:

> So I guess that this boils down to: can we have an implicit boolean state in the JSON representation which is defined as passing an empty dictionary?

No - this mapping conflicts with the @context definition, and would lead to an RDF instantiation of a blank node.

For example, this given JSON-LD:

{
    "@context": {
        "@vocab": "http://www.niso.org/schemas/ali/1.0/jsonld.json#",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "free_to_read": { "@type": "@id" },
        "license_ref": { "@type": "@id" },
        "uri": { "@type": "@id" },
        "permanent": { "@type": "xsd:boolean" },
        "start_date": { "@type": "xsd:date" },
        "end_date": { "@type": "xsd:date" },
        "dc": "http://purl.org/dc/terms/"
    },
    "dc:title": "Sample paper",
    "uri": "http://example.org/",
    "free_to_read": {}
}

is equivalent to these N-Quads:

_:b0 <http://purl.org/dc/terms/title> "Sample paper" .
_:b0 <http://www.niso.org/schemas/ali/1.0/jsonld.json#free_to_read> _:b1 .
_:b0 <http://www.niso.org/schemas/ali/1.0/jsonld.json#uri> <http://example.org/> .

@anarchivist

IOW, I recommend being explicit here.

@efc
Author

efc commented Feb 5, 2015

Thanks @anarchivist. So what you are saying is that because an empty dict {} is not equivalent to a "@type": "@id" as listed in the context definition of free_to_read, we really can't allow the implicit approach in JSON. That makes sense.

It seems even the RDF definition in the recommendation is a bit muddled. Here are two ways free_to_read is used in RDF on page 17:

<ali:free_to_read rdf:parseType="Resource">
    <!-- free to read period around X-Mas -->
    <ali:start_date rdf:datatype="xsd:date">2014-12-24</ali:start_date>
    <ali:end_date rdf:datatype="xsd:date">2014-12-31</ali:end_date>
</ali:free_to_read>

And on page 18:

<ali:free_to_read rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</ali:free_to_read>

Is it valid to allow the same free_to_read element to be one of two RDF types?

This may be at the heart of the problem, and if even the RDF model is self-contradictory, it would explain the trouble expressing it in JSON-LD.

@anarchivist

> So what you are saying is that because an empty dict {} is not equivalent to a "@type": "@id" as listed in the context definition of free_to_read we really can't allow the implicit approach in JSON

Correct.

> Is it valid to allow the same free_to_read element to be one of two RDF types?

Yes, it would be valid, in terms of RDF, but it would make this harder to parse reliably.
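
To illustrate the parsing cost, here is a rough Python sketch of what a consumer would need if free_to_read could be either a boolean or a period object. The function name normalize_free_to_read and the choice to represent "unbounded" as None are assumptions made for this example only.

```python
from datetime import date


def normalize_free_to_read(value):
    """Map either form to a (start, end) pair; None means unbounded."""
    if value is True:
        # Boolean form: permanently free to read, no bounds at all.
        return (None, None)
    if isinstance(value, dict):
        # Period form: parse whichever bounds are present.
        start = value.get("start_date")
        end = value.get("end_date")
        return (
            date.fromisoformat(start) if start else None,
            date.fromisoformat(end) if end else None,
        )
    raise ValueError(f"unsupported free_to_read value: {value!r}")


print(normalize_free_to_read(True))
print(normalize_free_to_read({"start_date": "2014-12-24", "end_date": "2014-12-31"}))
```

Every consumer has to carry this type check, and anything that is neither true nor an object needs its own error path.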

@kjw
Contributor

kjw commented Feb 6, 2015

At the moment I have written out the XSD and JSON schema to avoid a 'permanent' boolean completely. Here's the schema:

{
    "@context": {
        "@vocab": "http://www.niso.org/schemas/ali/1.0/jsonld.json#",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "free_to_read": { "@type": "@id" },
        "license_ref": { "@type": "@id" },
        "uri": { "@type": "@id" },
        "start_date": { "@type": "xsd:date" },
        "end_date": { "@type": "xsd:date" },
        "dc": "http://purl.org/dc/terms/"
    }
}

We can write out a free_to_read with a start_date, or a start_date and an end_date. Someone wanting to mark content as 'permanently' free to read would set the free to read start_date to the date of availability of the content: issue date, publication date, whatever.

{
    "free_to_read": { "start_date": "2014-02-02" }
}
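
For a publisher's export script, producing that declaration could be as small as the following Python sketch. It assumes the publication date is already available as a date object; free_to_read_from_publication is a made-up helper name.

```python
import json
from datetime import date


def free_to_read_from_publication(published):
    """Build a permanent free_to_read declaration starting at publication."""
    # No end_date is written, so the period is open-ended.
    return {"free_to_read": {"start_date": published.isoformat()}}


record = free_to_read_from_publication(date(2014, 2, 2))
print(json.dumps(record, indent=4))
```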

My concern with a permanent boolean is that free_to_read: {permanent: true} could be added at any time to existing records. It could easily be mistaken as a reasonable way of saying "we've just made this content free to read". But from anyone reading that metadata, it in fact can only be interpreted as "this content has always been free to read".

Given that we can reasonably encode a permanent free to read state by setting the start date to be the first date of availability, are we in fact providing an unnecessary feature with a permanent boolean?

@ckoscher

ckoscher commented Feb 6, 2015

Supplying a permanent boolean may be the easier path for some, particularly those OA publishers who want this to be a fixed declaration, whereas free_to_read with a start date can imply that it was not free to read at some point. Sure, one can match up the start date with publication date, but the declaration free-to-read=true may be precisely the desired explicit statement.

Chuck


@efc
Author

efc commented Feb 6, 2015

I do not think that we should remove the permanent option altogether, since it is so clearly a part of the NISO recommendation. I was not on that committee, so I don't really know what all went into that discussion, but I trust it was thoroughly considered. If we come to the conclusion that this would be the best course, I could get in touch with one of the co-chairs for guidance.

@efc
Author

efc commented Feb 6, 2015

I have another idea which would both allow the "permanent" designation and not require an explicit "permanent" element in JSON-LD. This proposal will work with the proposed JSON-LD and XSD definitions as @kjw has drafted them. It would also accommodate those who do not want to (or are not able to) provide a specific free_to_read date as noted by @ckoscher.

Tell me if this is crazy...

The ISO 8601 standard states that "values in the range [0000] through [1582] shall only be used by mutual agreement of the partners in information interchange". I have found a few cases (NISO on page 20, OpenID) where a date of zero is used to indicate uncertainty about the actual date.

What if we propose that a start_date of 0000-00-00 be used to indicate the "permanent" designation? This way there would never be an "empty" free_to_read element and there is no need for a boolean element either. The understanding of 0000-00-00 would simply be a convention, an understanding, and one that actually would make sense even to most people (and parsers) that encounter it (start date of zero sounds a lot like "it has always been free").
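
One thing worth checking before adopting such a convention is how common date parsers treat the value. Python's datetime.date, for example, only supports years 1 through 9999 and months and days starting at 1, so it rejects 0000-00-00 and a consumer would need to special-case it. A small sketch, where PERMANENT_SENTINEL and parse_start_date are names invented for this example:

```python
from datetime import date

# Hypothetical convention from this comment, not part of the recommendation.
PERMANENT_SENTINEL = "0000-00-00"


def parse_start_date(text):
    """Return a date, or None when the sentinel marks 'always free'."""
    if text == PERMANENT_SENTINEL:
        return None
    return date.fromisoformat(text)


print(parse_start_date("2014-12-02"))
print(parse_start_date("0000-00-00"))

# Without the special case, the standard parser refuses the sentinel.
try:
    date.fromisoformat(PERMANENT_SENTINEL)
except ValueError as error:
    print("rejected:", error)
```

Other languages and schema validators may behave differently, so this only shows that the sentinel cannot be assumed to pass through generic date handling.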

@anarchivist

I would still veer towards needing a permanent statement of some variety.

@efc
Author

efc commented Feb 7, 2015

@anarchivist, how would you interpret this case where someone includes both the permanent boolean and the period elements?

"free_to_read": {
    "permanent": true,
    "start_date": "2014-12-02",
    "end_date": "2015-12-01"
}

If there is a permanent designation then we will have to decide if it overrides the period, or if the presence of a period overrides the boolean. I could see good reasons for going either way on this.

@anarchivist

In my opinion, permanent should override the end_date. You could say that effectively the presence of permanent=true could lead to the interpretation end_date=null.
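
As a rough Python sketch of that rule, assuming the hypothetical "permanent" key from earlier in the thread (effective_period is an illustrative name, and keeping start_date untouched is one possible reading of the rule):

```python
def effective_period(free_to_read):
    """Resolve a free_to_read object to the period a consumer should use."""
    start = free_to_read.get("start_date")
    end = free_to_read.get("end_date")
    # Under this rule, permanent=true discards any end_date.
    if free_to_read.get("permanent") is True:
        end = None
    return {"start_date": start, "end_date": end}


print(effective_period({
    "permanent": True,
    "start_date": "2014-12-02",
    "end_date": "2015-12-01",
}))
```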

@kjw
Contributor

kjw commented Feb 9, 2015

I can see a case for either permanent overriding the period, or one or both of the dates of the period overriding permanent. It probably doesn't matter either way.

Use Cases

I still find permanent unnecessary. Out of the nine use cases listed in the NISO ALI recommendation, four mention free_to_read as a full or partial solution. Here they are:

  • 5.1 Use Case: End User Seeks to Discover, Identify, and Access Free-to-Read Items
  • 5.2 Use Case: End User Seeks to Know the Readability Status of an Item
  • 5.8 Use Case: Funding Agency Seeks to Track Compliance of Research Outputs to Open Access Mandates
  • 5.9 Use Case: Institution Seeks to Report on Open Access Compliance of Research Outputs

End User and Readability Status

Let's imagine the process of some discovery service or reader software for determining the free to read status of content:

  1. Find the start_date and end_date of all free_to_read periods
  2. For each period, if today's date falls into the period, consider the content free to read, else,
  3. If any period includes a permanent=true flag, consider the content free to read, else,
  4. Consider the content not free to read.

If we remove the possibility of permanent=true, we remove none of the features of this logic. However we do simplify it. What is important is that the consumer of the metadata does not have to compare the start_date against the publication date.

This logic applies to the first two use cases, 5.1 and 5.2.
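
Without the permanent step, the check can be sketched in a few lines of Python. This assumes periods is a list of free_to_read objects, that end_date is inclusive, and that a missing bound leaves that side open; is_free_to_read is a hypothetical name.

```python
from datetime import date


def is_free_to_read(periods, today):
    """Return True if today falls inside any free_to_read period."""
    for period in periods:
        start = period.get("start_date")
        end = period.get("end_date")
        # Skip periods that have not started yet.
        if start is not None and date.fromisoformat(start) > today:
            continue
        # Skip periods that have already ended (end_date treated as inclusive).
        if end is not None and date.fromisoformat(end) < today:
            continue
        return True
    return False


christmas = [{"start_date": "2014-12-24", "end_date": "2014-12-31"}]
print(is_free_to_read(christmas, date(2014, 12, 25)))
print(is_free_to_read(christmas, date(2015, 1, 1)))
```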

Funder Mandates and Readability Status

Now for checking against funder mandates. These may have some idea of an embargo period from date of publication. It may be common to see a non-free to read period followed by an indefinite free to read status:

{
    "published": "2014-06-01",
    "free_to_read": { "start_date": "2015-01-01" }
}

With understanding of the funder's mandate - an embargo period of no more than 6 months in this example - we would use the logic:

  1. Find the start_date and end_date of all free_to_read periods
  2. If there is a period with a start_date no later than the publication date + 6 months, and without an end_date, consider the content as meeting the funder's mandate, else,
  3. If any period specifies permanent=true, consider the content as meeting the funder's mandate, else,
  4. Consider the content as not meeting the funder's mandate.

Again, by removing the possibility of permanent=true, all we do is make this logic more straight-forward by removing the third step. We do not remove any functionality from the perspective of checking funder mandate criteria. In these cases, the metadata consumer must always know the publication date of the content, regardless of whether we have a permanent=true feature.
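
A Python sketch of the mandate check without the permanent step follows. It measures the embargo from the publication date, which is an assumption based on the description of embargo periods above; add_months and meets_mandate are illustrative names, and month arithmetic clamps to the last day of shorter months.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def meets_mandate(published, periods, max_embargo_months=6):
    """True if an open-ended free_to_read period starts within the embargo."""
    deadline = add_months(published, max_embargo_months)
    for period in periods:
        start = period.get("start_date")
        # Only open-ended periods count as an indefinite free to read status.
        if start is None or "end_date" in period:
            continue
        if date.fromisoformat(start) <= deadline:
            return True
    return False


print(meets_mandate(date(2014, 6, 1), [{"start_date": "2014-12-01"}]))
print(meets_mandate(date(2014, 6, 1), [{"start_date": "2015-01-01"}]))
```

Note that the second call uses the dates from the record above: the gap between 2014-06-01 and 2015-01-01 is seven months, so under a strict six-month reading this sketch reports it as not meeting the mandate.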

Ambiguities

Finally there are ambiguities to consider, some of which have already been stated in previous comments.

  1. What does it mean when there are free_to_read periods and also a separate free_to_read permanent flag?

    free_to_read: [ {start_date: "2012-01-01", end_date: "2012-01-05"}, {permanent: true} ]

  2. What does it mean if a single free to read period specifies dates and the permanent flag?

    free_to_read: [ {end_date: "2012-01-01", permanent: true} ]

It is true there could be rules to govern priority in each of these cases, but each rule adds complexity that every metadata consumer and creator must understand and implement.

@cameronneylon

Hi Eric

The working group co-chairs just had a chat about this so I'm hoping we can both explain the slight muddle and hopefully solve the practical problem for you by updating the recommended practice with an explicit JSON-LD example for guidance.

Chuck is correct that the intent behind allowing the declaration of free-to-read to carry the intent of permanent status was to make things simple, especially for small open access publishers who would be able to just add a static header to articles. As you note from the RDF example there is a subtlety in the distinction between the free-to-read element as a resource, whose presence implies a state vs free-to-read as a statement.

This was the result of the challenges in balancing simplicity with the ability to declare different combinations of availability in a way that could satisfy OA and subscription publishers both large and small, and to provide enough functionality while reducing the potential for contradictory or confusing statements. And to be honest, because we were focussed on XML representations, we missed the issue you raised. Such are the challenges of encoding what is still a somewhat contested statement.

So our proposal is to recommend that in JSON-LD publishers should take the approach that Karl recommends and give a start date, rather than creating a new tag that creates ambiguities or potential contradictions. We will update the examples in the recommended practice to show this, adopting Karl's examples. Does that address all of your issues?

It's great to have the input on this. Myself and Greg and Ed are keen to make this practical and useable so if that means continuing to iterate with you to iron out the issues then we're happy to do it. Also thanks to everyone else on the thread for your input and Karl in particular for working through the updated examples. Once we've got the documentation updated I'll pop back here to confirm that and with any luck we'll be able to close the issue.

Thanks

Cameron
(on behalf of the NISO-ALI WG co-chairs, myself, Greg Tananbaum and Ed Pentz)

@efc
Author

efc commented Feb 11, 2015

While I am hardly the authority here, I can certainly live with this approach @cameronneylon. But I am concerned with the potential mismatch between JSON and XML.

From time to time records are translated from XML to JSON or vice-versa, and if the XML representation allows something that cannot be coded in JSON-LD, then we are leaving the resolution of the dilemma to individual coders who may arrive at different conclusions. To eliminate that possibility I would recommend that the possibility of an empty free_to_read option be eliminated from the other formats as well as from JSON-LD.
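
To make the dilemma concrete, here is a Python sketch of an XML to JSON translation step. The namespace URI in ALI_NS is an assumption about the XML schema, and free_to_read_xml_to_json is a made-up function name; the point is only that the translator has nothing sensible to emit for an empty element.

```python
import xml.etree.ElementTree as ET

# Assumed XML namespace for ALI elements; adjust to the published schema.
ALI_NS = "http://www.niso.org/schemas/ali/1.0/"


def free_to_read_xml_to_json(xml_text):
    """Translate one ali:free_to_read element into its JSON form."""
    element = ET.fromstring(xml_text)
    if element.tag != f"{{{ALI_NS}}}free_to_read":
        raise ValueError("not an ali:free_to_read element")
    # Copy over whichever period attributes are present.
    period = {
        name: element.attrib[name]
        for name in ("start_date", "end_date")
        if name in element.attrib
    }
    if not period:
        # The empty (permanent) form has no agreed JSON-LD counterpart.
        raise ValueError("empty free_to_read has no agreed JSON-LD form")
    return {"free_to_read": period}


xml_period = (
    '<ali:free_to_read xmlns:ali="http://www.niso.org/schemas/ali/1.0/" '
    'start_date="2014-12-02" end_date="2015-12-01"/>'
)
print(free_to_read_xml_to_json(xml_period))
```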

Another option might be that the recommendation document an understanding of what a start_date of 0000-00-00 means. If the recommendation stated that such a start date means the same thing as an empty free_to_read tag in XML, or one of the examples showed this zero date being used in a case where the specific date was not available, it might preserve the balance of simplicity you describe with a clear path for those implementing the JSON-LD option.

@kjw
Contributor

kjw commented Feb 11, 2015

@efc - I can't speak to what @cameronneylon and the NISO-ALI WG have in mind, but the modified schemas I have provided, XSD and JSON-LD, both remove the possibility of an empty free_to_read. For the XSD, start_date is a required attribute of a free_to_read declaration.

Actually, the JSON-LD schema doesn't specify start_date as required, but that is only because I don't know of a way to do that. The intention is though that start_date is required, end_date is optional. It may be up to the documentation to make it clear that start_date is mandatory for JSON-LD, just as it is specified as mandatory in the XSD.
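
Until the documentation spells that out, a consumer could enforce the rule in application code. A minimal Python sketch, assuming dates are ISO 8601 calendar dates; validate_free_to_read is a hypothetical name, and the check that end_date does not precede start_date is an extra assumption rather than something stated in the schema.

```python
from datetime import date


def validate_free_to_read(period):
    """Raise ValueError unless period has a start_date and sane bounds."""
    if not isinstance(period, dict):
        raise ValueError("free_to_read must be an object")
    if "start_date" not in period:
        raise ValueError("start_date is required")
    start = date.fromisoformat(period["start_date"])
    # end_date stays optional; when present it must not precede start_date.
    if "end_date" in period and date.fromisoformat(period["end_date"]) < start:
        raise ValueError("end_date precedes start_date")
    return period


print(validate_free_to_read({"start_date": "2014-02-02"}))
```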

My modified JSON-LD does however remove the possibility of specifying free_to_read as a boolean.

@efc
Author

efc commented Feb 11, 2015

@kjw, great. I think that consistency would be helpful. I had not noticed this in your work, Karl, thanks!

@kjw
Contributor

kjw commented Feb 13, 2015

XSD and JSON schema on the NISO site have been updated to reflect the change discussed here. Closing this issue.

@Klortho

Klortho commented May 1, 2015

What is wrong with having free_to_read, in the JSON, taking either a value of true, or an object that takes a required start_date and an optional end_date? That's the way it is described in appendix A.4.

I was the original author of the comment where we proposed a JSON-LD representation, and I was pretty new to JSON-LD at the time. I've since had a (little) bit more experience with it.

Allowing it to take either a boolean or an object would not be a problem with regards to its validity as JSON-LD, as long as you removed the @type value from it, which is wrong now, anyway.

The use case you don't address above, Karl, is making it extremely easy for a publisher that publishes only OA material to indicate "free to read", and giving it a true value accomplishes that.

@kjw
Contributor

kjw commented May 1, 2015

I personally can't see it being any more difficult for publishers to put publication dates in free_to_read when they want to indicate a permanent free to read.

Is it much more difficult to specify "free to read from publication date"? If so, and so much so that it is worth reintroducing some sort of boolean alternative, then I'm happy to have changes made and have this pushed back as a suggested alteration / reversion to the NISO group (if they're listening?)

More generally - I don't know why the NISO recommendation is still out of sync with this repo, if it is. I was under the impression that the schemas and examples in this repository would be taken as canonical, surely meaning other documents should be brought in line.

Happy to add you to this repository as a contributor if you want.

@kjw
Contributor

kjw commented May 1, 2015

@Klortho Ping. (Just in case, as for me, comments on closed issues aren't made noticeable in github.)

@Klortho

Klortho commented May 6, 2015

@kjw, Sorry, I didn't get notifications for your comments; even the ping. I guess because it's closed. Because of that, I'll open a new issue to respond to you.
