DBNQA Dataset correction
To check the accuracy of the DBNQA dataset and correct any errors, we first decoded all the templates in the dataset and ran the resulting queries against the DBpedia SPARQL endpoint.
Of the roughly 899 thousand queries, 14,697 returned an error message and 8,395 returned an empty result set.
Most of the roughly 15k queries that returned an error failed because of the short URI prefix "dbr:". To fix them, we used regular expressions to remove double spaces and to replace each short URI with its full URI:
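The error/empty/ok split above could be tallied along the following lines. This is only a sketch, not the script actually used: `classify`, `tally`, and the `run_query` callable are hypothetical names, and `run_query` is assumed to send one query to the endpoint, return the parsed SPARQL JSON results, and raise an exception when the endpoint reports an error.

```python
from collections import Counter


def classify(query, run_query):
    """Return 'error', 'empty', or 'ok' for a single SPARQL query.

    run_query is a hypothetical callable (an assumption, not part of the
    original work) that returns parsed SPARQL JSON results or raises.
    """
    try:
        result = run_query(query)
    except Exception:
        return "error"
    # ASK queries return a boolean instead of bindings.
    if "boolean" in result:
        return "ok"
    bindings = result.get("results", {}).get("bindings", [])
    return "ok" if bindings else "empty"


def tally(queries, run_query):
    """Count how many queries fall into each category."""
    return Counter(classify(q, run_query) for q in queries)
```

Passing the endpoint call in as a function keeps the counting logic independent of whichever HTTP or SPARQL client is used.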
re.sub(r"dbr:([^\s]+)", r"<http://dbpedia.org/resource/\1>", q)
After these changes, the number of queries returning an error dropped from about 15k to about 300.
Most queries worked after these changes, but a few still failed because inconsistencies in the templates left the closing brace inside the URI.
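The two clean-up steps can be combined into one helper, sketched below. The name `expand_dbr` and the exact whitespace pattern are assumptions. Only the `dbr:` substitution is taken from the text above.

```python
import re


def expand_dbr(query):
    """Collapse repeated spaces and expand the dbr: prefix to a full URI."""
    # Assumed form of the "remove double spaces" step: squeeze runs of
    # two or more spaces into one and trim the ends.
    query = re.sub(r" {2,}", " ", query).strip()
    # Substitution from the text: dbr:Name -> <http://dbpedia.org/resource/Name>
    return re.sub(r"dbr:([^\s]+)", r"<http://dbpedia.org/resource/\1>", query)


print(expand_dbr("select ?uri  where { ?uri dbo:publisher dbr:C&E }"))
```

Because the pattern takes everything up to the next whitespace character, any character written directly after the resource name becomes part of the URI.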
Incorrect formation: select distinct ?uri where{?uri rdf:type dbo:VideoGame . ?uri dbo:publisher <http://dbpedia.org/resource/C&E}>
Correct formation: select distinct ?uri where{?uri rdf:type dbo:VideoGame . ?uri dbo:publisher <http://dbpedia.org/resource/C&E> }
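One way to repair the misplaced brace is a second substitution that moves a `}` found just before the closing `>` of a DBpedia resource URI to after it. This is a sketch under the assumption that no valid resource name ends in `}`. The names `MISPLACED_BRACE` and `fix_closing_brace` are hypothetical.

```python
import re

# Matches a DBpedia resource URI whose closing '>' is preceded by a stray '}'.
MISPLACED_BRACE = re.compile(r"<(http://dbpedia\.org/resource/[^<>\s]*)\}>")


def fix_closing_brace(query):
    """Move a '}' that ended up inside the URI to after the closing '>'."""
    return MISPLACED_BRACE.sub(r"<\1> }", query)


broken = ("select distinct ?uri where{?uri rdf:type dbo:VideoGame . "
          "?uri dbo:publisher <http://dbpedia.org/resource/C&E}>")
print(fix_closing_brace(broken))
```

Queries that are already well formed do not match the pattern and pass through unchanged.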
Data | 12k epochs | 36k epochs | 120k epochs |
---|---|---|---|
DBNQA | 72.4 | 75.2 | 75.7 |
Monument_300 | 76.7 | 76.8 | 76.8 |