SAKey: Scalable almost key discovery in RDF data
dc.contributor.author | Symeonidou, Danai | |
dc.contributor.author | Armant, Vincent | |
dc.contributor.author | Pernelle, Nathalie | |
dc.contributor.author | Sais, Fatiha | |
dc.contributor.funder | Science Foundation Ireland | en |
dc.date.accessioned | 2016-04-20T15:50:48Z | |
dc.date.available | 2016-04-20T15:50:48Z | |
dc.date.issued | 2014-10 | |
dc.date.updated | 2016-01-11T14:14:54Z | |
dc.description.abstract | Exploiting identity links among RDF resources allows applications to integrate data efficiently. Keys can be very useful for discovering these identity links. A set of properties is considered a key when its values uniquely identify resources. However, such keys are usually not available. Approaches that attempt to discover keys automatically can easily be overwhelmed by the size of the data, and they require clean data. We present SAKey, an approach that discovers keys in RDF data efficiently. To prune the search space, SAKey exploits characteristics of the data that are dynamically detected during the process. Furthermore, our approach can discover keys in datasets where erroneous data or duplicates exist (i.e., almost keys). The approach has been evaluated on different synthetic and real datasets. The results show both the relevance of almost keys and the efficiency of discovering them. | en |
dc.description.sponsorship | Science Foundation Ireland (Grant No. 12/RC/2289) | en |
dc.description.status | Peer reviewed | en |
dc.description.uri | http://iswc2014.semanticweb.org/ | en |
dc.description.version | Accepted Version | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.citation | Symeonidou, D., Armant, V., Pernelle, N. and Sais, F. (2014) "SAKey: Scalable almost key discovery in RDF data", 13th International Semantic Web Conference, ISWC 2014. Riva del Garda, Trento, Italy, 19-23 October, 2014. Springer: The Semantic Web – ISWC 2014, pp. 33-49. DOI: 10.1007/978-3-319-11964-9_3 | en |
dc.identifier.doi | 10.1007/978-3-319-11964-9_3 | |
dc.identifier.endpage | 49 | en |
dc.identifier.isbn | 978-3-319-11963-2 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.journaltitle | Lecture Notes in Computer Science | en |
dc.identifier.startpage | 33 | en |
dc.identifier.uri | https://hdl.handle.net/10468/2471 | |
dc.identifier.volume | 8796 | en |
dc.language.iso | en | en |
dc.publisher | Springer International Publishing | en |
dc.relation.ispartof | 13th International Semantic Web Conference, ISWC 2014. Riva del Garda, Trento, Italy, 19-23 October, 2014 | |
dc.relation.uri | http://link.springer.com/chapter/10.1007/978-3-319-11964-9_3 | |
dc.rights | © 2014 Springer International Publishing. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-11964-9_3 | en |
dc.rights.uri | http://www.springer.com/gp/rights-permissions/obtaining-permissions/882 | en |
dc.subject | Data linking | en |
dc.subject | Identity links | en |
dc.subject | Keys | en |
dc.subject | OWL2 | en |
dc.subject | RDF | en |
dc.title | SAKey: Scalable almost key discovery in RDF data | en |
dc.type | Conference item | en |
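
The abstract describes a key as a set of properties whose values uniquely identify resources, and an almost key as one that still holds when a few erroneous or duplicate resources are tolerated. The sketch below illustrates one plausible reading of that idea on toy data. It is not the SAKey algorithm: it is a naive pairwise check with none of the pruning the abstract mentions, and the names `exceptions`, `is_almost_key`, the threshold `n`, and the `films` data are all hypothetical. It assumes that two resources conflict on a property set when they share at least one value for every property in it.

```python
from itertools import combinations


def exceptions(data, props):
    """Return the resources that share a value with some other resource
    on every property in props (assumed reading of an 'exception')."""
    props = list(props)
    found = set()
    # Naive O(n^2) comparison of all resource pairs; illustration only.
    for a, b in combinations(sorted(data), 2):
        if all(data[a].get(p, set()) & data[b].get(p, set()) for p in props):
            found.update((a, b))
    return found


def is_almost_key(data, props, n):
    """True if props identifies resources uniquely up to n exceptions.
    With n = 0 this is an exact key check."""
    return len(exceptions(data, props)) <= n


# Hypothetical toy data: resource -> property -> set of values.
films = {
    "f1": {"name": {"Ocean"}, "director": {"A"}},
    "f2": {"name": {"Ocean"}, "director": {"B"}},
    "f3": {"name": {"Sky"}, "director": {"B"}},
    "f4": {"name": {"Sky"}, "director": {"B"}},  # duplicate of f3
}

print(is_almost_key(films, ["name", "director"], 0))
print(is_almost_key(films, ["name", "director"], 2))
```

Here `f3` and `f4` are duplicates, so `{name, director}` fails the exact check but passes once two exceptions are allowed, which is the situation the abstract has in mind for datasets containing errors or duplicates.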