TY - GEN
T1 - Automatic creation and analysis of a linked data cloud diagram
AU - Caraballo, Alexander Arturo Mera
AU - Nunes, Bernardo Pereira
AU - Lopes, Giseli Rabello
AU - Leme, Luiz André Portes Paes
AU - Casanova, Marco Antonio
N1 - Publisher Copyright: © Springer International Publishing AG 2016.
PY - 2016
Y1 - 2016
N2 - Datasets published on the Web and following the Linked Open Data (LOD) practices have the potential to enrich other LOD datasets in multiple domains. However, the lack of descriptive information, combined with the large number of available LOD datasets, inhibits their interlinking and consumption. Aiming at facilitating such tasks, this paper proposes an automated clustering process for the LOD datasets that, thereby, provides an up-to-date description of the LOD cloud. The process combines metadata inspection and extraction strategies, community detection methods and dataset profiling techniques. The clustering process is evaluated using the LOD diagram as ground truth. The results show the ability of the proposed process to replicate the LOD diagram and to identify new LOD dataset clusters. Finally, experiments conducted by LOD experts indicate that the clustering process generates dataset clusters that tend to be more descriptive than those manually defined in the LOD diagram.
AB - Datasets published on the Web and following the Linked Open Data (LOD) practices have the potential to enrich other LOD datasets in multiple domains. However, the lack of descriptive information, combined with the large number of available LOD datasets, inhibits their interlinking and consumption. Aiming at facilitating such tasks, this paper proposes an automated clustering process for the LOD datasets that, thereby, provides an up-to-date description of the LOD cloud. The process combines metadata inspection and extraction strategies, community detection methods and dataset profiling techniques. The clustering process is evaluated using the LOD diagram as ground truth. The results show the ability of the proposed process to replicate the LOD diagram and to identify new LOD dataset clusters. Finally, experiments conducted by LOD experts indicate that the clustering process generates dataset clusters that tend to be more descriptive than those manually defined in the LOD diagram.
KW - Automatic clustering
KW - Community detection algorithms
KW - Domain identification
KW - Linked data cloud analysis
UR - http://www.scopus.com/inward/record.url?scp=84996565899&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-48740-3_31
DO - 10.1007/978-3-319-48740-3_31
M3 - Conference contribution
SN - 9783319487397
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 417
EP - 432
BT - Web Information Systems Engineering – WISE 2016 - 17th International Conference, Proceedings
A2 - Cellary, Wojciech
A2 - Wang, Jianmin
A2 - Mokbel, Mohamed F.
A2 - Wang, Hua
A2 - Zhou, Rui
A2 - Zhang, Yanchun
PB - Springer Verlag
T2 - 17th International Conference on Web Information Systems Engineering, WISE 2016
Y2 - 8 November 2016 through 10 November 2016
ER -