AfPO_0000285 resource URL contains non-ASCII character #58

Open
jggatter opened this issue Jun 26, 2024 · 3 comments

@jggatter

I've encountered an error when trying to parse the latest hancestro.owl via the Python package pronto. Please see fastobo/fastobo#68.

Would HANCESTRO consider replacing the resource URL https://en.wikipedia.org/wiki/Efé_people with the URL-safe version, https://en.wikipedia.org/wiki/Ef%C3%A9_people?
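For reference, the URL-safe form can be derived with Python's standard library. This is only a minimal sketch: the helper name `percent_encode_url` is made up for illustration, it assumes UTF-8 percent-encoding of the path, query, and fragment, and it leaves the host name untouched (a non-ASCII host would need separate IDNA handling).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_url(url: str) -> str:
    """Percent-encode non-ASCII characters in a URL (hypothetical helper).

    "%" is kept in the safe set so that a URL which is already encoded
    is returned unchanged instead of being double-encoded.
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    # The network location (host) is passed through as-is.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


print(percent_encode_url("https://en.wikipedia.org/wiki/Efé_people"))
# → https://en.wikipedia.org/wiki/Ef%C3%A9_people
```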

@daniwelter
Collaborator

@jggatter Thank you for raising this issue and sorry to hear our latest release is causing you parsing issues. As the property in question originates from AfPO rather than HANCESTRO, I've asked the AfPO team to look into implementing the fix.

@anitacaron
Contributor

Hi @jggatter, didn't you also run into a problem with two DBpedia classes: https://dbpedia.org/page/Réunion and https://dbpedia.org/page/São_Tomé_and_Príncipe?

After fixing the issue in AfPO, I tested parsing using fastobo and got an error on the Réunion class. Then, I found issues with São Tomé and Príncipe, which are defined in HANCESTRO.
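To list every affected URL in one pass, rather than hitting them one parser error at a time, a rough text scan may be enough. The sketch below is an assumption-laden illustration: `find_non_ascii_urls` is a hypothetical helper, the regular expression is a crude URL matcher rather than an OWL-aware parser, and the sample XML lines are invented for the demo and not copied from hancestro.owl.

```python
import re

# Crude matcher for http(s) URLs; stops at whitespace, quotes and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def find_non_ascii_urls(text: str) -> list[str]:
    """Return each distinct URL in `text` that contains a non-ASCII character."""
    found = []
    for url in URL_PATTERN.findall(text):
        if not url.isascii() and url not in found:
            found.append(url)
    return found


# Invented sample input; the element and attribute names are placeholders.
sample = """
<example:link rdf:resource="https://dbpedia.org/page/Réunion"/>
<example:link rdf:resource="https://en.wikipedia.org/wiki/Ef%C3%A9_people"/>
<example:link rdf:resource="https://dbpedia.org/page/São_Tomé_and_Príncipe"/>
"""

for url in find_non_ascii_urls(sample):
    print(url)
```

On a real file, the same function could be applied to the text read with `open(path, encoding="utf-8")`.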

@jggatter
Author

jggatter commented Jul 3, 2024

Thanks @daniwelter!

Hi @anitacaron, it seems those two classes you mention appear in the hancestro.owl file before Efé people. In my original attempt, pronto errored out on Efé people and was unable to continue. I suppose parsing might not proceed linearly from the start of the file to the end (it probably walks the document as a tree). In any case, it doesn't surprise me that these other resources containing non-ASCII characters caused the same issue.

In the fastobo issue I linked, the maintainer replied that they will add UTF-8 support, though I'm not sure when that will happen. Even so, they advise that ontologies stick to the ASCII character set to better ensure support across operating systems.

I've had our organization revert to using a previous HANCESTRO version for now, so we're not blocked. I look forward to eventually upgrading to include the AfPO contributions!

Thanks,
James
