MEL: Metadata Extractor & Loader

Sergio J. Rodríguez Méndez, Pouya G. Omran, Armin Haller, Kerry Taylor

    Research output: Contribution to conference > Paper > peer-review

    1 Citation (Scopus)

    Abstract

    Extracting metadata and content-based information from heterogeneous file sets is a pre-processing step in many Knowledge Graph Construction Pipelines (KGCPs). These tasks often take longer than necessary because few tools integrate several complementary extraction methods and properties to produce a rich output set. This paper presents MEL, a Python-based tool that implements a set of methods to extract metadata and content-based information from unstructured information encoded in different source document formats. The results are generated as JSON files, which can (a) optionally be stored in a document store, and (b) easily be mapped to RDF using a variety of tools such as J2RM. MEL supports more than 20 different file types and offers comprehensive configurable settings, making it a versatile tool for the pre-processing tasks of a KGCP.
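
    The file-to-JSON step described in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and does not reproduce MEL's actual API or output schema: the function names `extract_metadata` and `extract_to_json` and the JSON keys are assumptions made for illustration, and only file-level metadata is covered, not content-based extraction.

```python
import hashlib
import json
import mimetypes
from pathlib import Path


def extract_metadata(path):
    """Collect basic file-level metadata for one file.

    Hypothetical helper for illustration; not part of MEL.
    """
    p = Path(path)
    data = p.read_bytes()
    # Guess the MIME type from the file name only; may be None if unknown.
    mime_type, _ = mimetypes.guess_type(p.name)
    return {
        "file_name": p.name,
        "extension": p.suffix.lower(),
        "mime_type": mime_type,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def extract_to_json(path):
    """Serialize the metadata record as a JSON string (assumed layout)."""
    return json.dumps(extract_metadata(path), indent=2, sort_keys=True)
```

    A record produced this way could then be written to a document store or passed to a JSON-to-RDF mapping tool, which is the role the abstract assigns to MEL's JSON output.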

    Original language: English
    Number of pages: 5
    Publication status: Published - 2021
    Event: 2021 International Semantic Web Conference Posters, Demos and Industry Tracks: From Novel Ideas to Industrial Practice, ISWC-Posters-Demos-Industry 2021 - Virtual, Online
    Duration: 24 Oct 2021 to 28 Oct 2021

    Conference

    Conference: 2021 International Semantic Web Conference Posters, Demos and Industry Tracks: From Novel Ideas to Industrial Practice, ISWC-Posters-Demos-Industry 2021
    City: Virtual, Online
    Period: 24/10/21 to 28/10/21
