TY - JOUR
T1 - Towards Building an RDF-based Deep Document Model and Retrieval Augmented Generation System for Enhanced Question Answering with Large Language Models
AU - Jia, Runsong
AU - Zhang, Bowen
AU - Rodríguez-Méndez, Sergio J.
AU - Omran, Pouya G.
N1 - Publisher Copyright: © 2024 Copyright for this paper by its authors.
PY - 2024
Y1 - 2024
AB - Knowledge Graphs (KGs) are crucial for Retrieval-Augmented Generation (RAG), but traditional methods have limitations in capturing details and querying academic KGs. The challenges lie in identifying the appropriate KG type for RAG, such as a Metadata KG, and optimizing the integration of Large Language Models (LLMs) with KGs to enhance retrieval and generation. This paper introduces a novel framework combining the Deep Document Model (DDM) concept and a KG-enhanced Query Processing (KGQP) mechanism. DDM provides a comprehensive, hierarchical representation of academic papers using advanced Natural Language Processing (NLP) techniques, while KGQP optimizes complex queries using the KG’s structural information and semantic relationships. The framework also integrates KGs with state-of-the-art LLMs to improve knowledge utilization and downstream task performance. Evaluations show that the KG-based approach surpasses vector-based methods in relevance, accuracy, completeness, and readability. This research demonstrates the potential of combining KGs and LLMs for effective academic knowledge management and discovery. § Submission type: Poster §.
KW - Deep Document Model
KW - Information Extraction
KW - Knowledge Graph
KW - Knowledge Graph Construction
KW - Large Language Model
UR - http://www.scopus.com/inward/record.url?scp=85210259392&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85210259392
SN - 1613-0073
VL - 3828
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - ISWC 2024 Posters, Demos and Industry Tracks: From Novel Ideas to Industrial Practice, ISWC-Posters-Demos-Industry 2024
Y2 - 11 November 2024 through 15 November 2024
ER -