Knowledge Based Document Management System for Free-Text Documents Discovery

Paul D Manuel¹, Mostafa Ibrahim Abd-El Barr¹, Thamarai Selvi²
¹Department of Information Science, College for Women, Kuwait University, Kuwait.
²Department of Information Technology, Madras Institute of Technology, Anna University, India.

Abstract

This paper proposes a Knowledge Based Document Management System (KBDMS) to organize, cluster, classify and discover free-text documents. The system discovers context-sensitive information by means of a word map, a sentence map and a paragraph map. A text learning procedure for the semantic retrieval of text documents is implemented using a hierarchy of self-organizing maps (SOM) and support vector machines (SVM). The hierarchical SOM generates histograms of paragraph maps based on semantic similarity, and an SVM is trained on these paragraph-map histograms for classification. The SVM also generates an index for each document given to it. The proposed system is scalable and can discover documents in very large collections of free-text documents. In tests on collections of up to 100,000 text documents, it achieves 75-80% accuracy in the context-sensitive discovery of free-text documents.
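
To make the idea of a map histogram concrete, the following is a minimal sketch of one level of such a hierarchy, not the implementation used in this paper. It trains a small one-dimensional SOM and summarizes a group of input vectors as a normalized histogram over their best-matching map units; a histogram of this kind is the sort of fixed-length feature vector that could then be passed to the next map level or to an SVM classifier. The function names, the 2-D toy vectors, the map size and the learning schedule are all illustrative assumptions, and the SVM stage is omitted.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map on a list of equal-length vectors.

    Assumed schedule: learning rate and neighborhood radius decay linearly.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    radius0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        radius = max(radius0 * (1.0 - frac), 0.5)
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighborhood: units near the winner move the most.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


def map_histogram(weights, vectors):
    """Normalized histogram of best-matching units for a group of vectors."""
    counts = [0] * len(weights)
    for x in vectors:
        counts[best_matching_unit(weights, x)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical 2-D "sentence vectors" for two toy paragraphs.
paragraph_a = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.1]]
paragraph_b = [[0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]

som = train_som(paragraph_a + paragraph_b, n_units=4)
hist_a = map_histogram(som, paragraph_a)  # feature vector for paragraph A
hist_b = map_histogram(som, paragraph_b)  # feature vector for paragraph B
```

Each histogram has one entry per map unit and sums to 1, so groups of different sizes yield feature vectors of the same length.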

Keywords: knowledge based document management system, context-sensitive discovery of free-text documents, self-organizing maps, semantic similarity, support vector machines.