AN INTELLIGENT FOLDER-BASED DOCUMENT CONVERSATIONAL SYSTEM USING LARGE LANGUAGE MODELS AND RETRIEVAL-AUGMENTED GENERATION


Nandini Gujarathi, Prof. Shivani Budhkar
Department of Master of Computer Applications, Progressive Education Society’s Modern College of Engineering, Pune, India
Abstract
Enterprise document collections have grown at a pace that outstrips the ability of traditional search tools to serve them effectively. Workers routinely spend considerable time hunting through folders, files, and archives because incumbent retrieval mechanisms match keywords rather than meaning, returning noise when query phrasing diverges even slightly from document vocabulary. Advances in neural language modelling and hybrid generation-retrieval architectures now make it feasible to build systems that hold genuine conversations with large document stores, surfacing precise answers rather than ranked lists of potentially relevant files. The present study proposes and evaluates a document question-answering system whose distinguishing characteristic is awareness of the folder hierarchy in which documents reside. Supporting a broad range of office file formats, the system converts each document into a set of semantically rich numeric representations through a pipeline of extraction, cleaning, segmentation, and encoding steps. These representations populate a purpose-built search index. When a question arrives, the index is probed for the segments most semantically proximate to the query; those segments are assembled into a concise context passage and forwarded to a language model that synthesises a direct, cited reply. Controlled experiments show that the folder-aware configuration substantially lifts both retrieval precision and answer accuracy over keyword baselines, demonstrating the maturity of hybrid generative-retrieval methods for workplace information management.
Keywords: Large Language Models, Retrieval-Augmented Generation, Document Chatbot, Semantic Search, Vector Databases, Knowledge Retrieval
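The pipeline outlined in the abstract (cleaning, segmentation, encoding, indexing, folder-aware retrieval, and context assembly) can be illustrated with a minimal sketch. This is not the authors' implementation: every function name here is hypothetical, a term-frequency vector stands in for the neural embedding, a plain Python list stands in for the vector database, and the language model call is omitted, so the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    folder: str      # folder path the source document lives in
    doc: str         # document file name
    text: str        # cleaned segment text
    vector: Counter  # toy term-frequency vector (assumption: stands in for an embedding)


def clean(text):
    # Collapse runs of whitespace; real cleaning would also strip headers and markup.
    return re.sub(r"\s+", " ", text).strip()


def segment(text, size=12):
    # Fixed-size word windows, a simple stand-in for semantic segmentation.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def encode(text):
    # Lowercased token counts; a real system would call an embedding model here.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents):
    # documents: iterable of (folder, file name, raw extracted text) tuples.
    index = []
    for folder, name, raw in documents:
        for piece in segment(clean(raw)):
            index.append(Chunk(folder, name, piece, encode(piece)))
    return index


def retrieve(index, question, folder=None, k=2):
    # Folder awareness: optionally restrict the search to one subtree of the hierarchy.
    q = encode(question)
    pool = [
        c for c in index
        if folder is None or c.folder == folder or c.folder.startswith(folder + "/")
    ]
    return sorted(pool, key=lambda c: cosine(q, c.vector), reverse=True)[:k]


def build_prompt(question, chunks):
    # Assemble retrieved segments, tagged with their source path, into the model context.
    context = "\n".join(f"[{c.folder}/{c.doc}] {c.text}" for c in chunks)
    return f"Answer using only the context and cite sources.\n{context}\nQuestion: {question}"


docs = [
    ("hr/policies", "leave.txt", "Employees accrue annual leave monthly."),
    ("finance/reports", "q1.txt", "Annual revenue grew in the first quarter."),
]
index = build_index(docs)
hits = retrieve(index, "annual leave", folder="hr")
prompt = build_prompt("annual leave", hits)
```

Passing `folder="hr"` narrows the candidate pool before similarity ranking, which is one simple way a folder hierarchy could inform retrieval; the paper's actual mechanism may differ.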
Journal Name: EPRA International Journal of Multidisciplinary Research (IJMR)

Published on : 2026-05-22

Vol : 12
Issue : 5
Month : May
Year : 2026