SenCept: A Domain-Specific Textual Commonsense Concept Acquisition System

  • Published : April 15, 2013
International Journal of Computer and Information Technology (ISSN: 2279-0764), Volume 01, Issue 02, November 2012

SenCept: A Domain-specific Textual Commonsense Concept Acquisition System

Rushdi Shams (1), M.S.A. Shahnawaz Chowdhury (2), and S.M. Abu Saleh Shawon (3)

(1) Computational Linguistics Lab, Department of Computer Science, University of Western Ontario, London, Ontario, N6A 5B7, Canada
(2, 3) Department of Computer Science and Engineering, Khulna University of Engineering & Technology (KUET), Khulna-9203, Bangladesh

(1) rshams@csd.uwo.ca, (2) sajib014@gmail.com, (3) shawon16@gmail.com

Abstract- In this paper, we report the development and performance of SenCept, a system that acquires textual commonsense concepts to offer better contextualization for the domain of DC electrical circuits. It uses a commonsense knowledge-base built upon a linguistic relations framework comprising Clause Level Relations, Sentential Roles, and Rhetorical Relations of a domain-specific corpus. SenCept selects representative commonsense knowledge using several parameters, such as knowledge weight and the average commonsensical distances among knowledge. To extract commonsense concepts for a given sentence, the system uses the latter (the average commonsensical distances) together with the mean of distances among the normalized weights of the representative sentences. The system is tested with a set of 100 random domain-specific sentences that are also given to five human subjects. Results show that SenCept achieves a precision of 71.43 percent and a recall of 51.77 percent, with an F-score of 60.03 percent.

Keywords- Commonsense knowledge, commonsense concept, knowledge acquisition, knowledge engineering.
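The reported F-score can be checked against the reported precision and recall. The sketch below assumes the F-score is the balanced harmonic mean of the two (the excerpt does not state the formula, but the arithmetic is consistent with it); the function name `f_score` is ours, chosen for illustration.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall.

    Assumption: the paper's F-score is the balanced (F1) variant.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
precision = 71.43
recall = 51.77

print(round(f_score(precision, recall), 2))  # prints 60.03
```

Because recall is well below precision, the harmonic mean sits closer to the recall figure than a simple average (61.60) would.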

commonsense but does not have the knowledge. Finally, in his youth, he has both knowledge and commonsense. Although the two differ subtly, commonsense is a type of knowledge. Knowledge varies among people, but commonsense should not; it should be common to all of us, which is what makes identifying exact commonsense a difficult task. Identification and extraction of commonsense knowledge have been a focus of research in natural language understanding [1] and personalized learning [2] over the last three decades. Textual commonsense knowledge is important for understanding the context and discourse of a text [3]. For example, given the sentence “The sum of current flowing into a junction is equal to the sum of current out of the junction”, a reader can contextualize more precisely if he is aware of concepts such as current, electron flow, junction, branch, and Kirchhoff’s law. Recent research findings show that such contextualization promotes personalized learning [4]. These concepts are called textual commonsense concepts; they are derived from the commonsense knowledge associated with the sentence.

In this paper, we propose a system named SenCept that acquires domain-specific commonsense concepts from the commonsense knowledge associated with text of the domain DC Electrical Circuits. We use a commonsense knowledge-base, developed by five human subjects from the linguistic relations (clause level, sentence level, and rhetorical level) of the text of a DC electrical circuit corpus [18]. Every knowledge item in this knowledge-base has been weighted, and we calculate the average distance among them. This denotes the statistical distance of one knowledge item from the others due to the variation of commonsense present in them. Moreover, for any given sentence whose textual commonsense concepts are to be acquired, we calculate its normalized weight [9].
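To make the idea of an average distance among weighted knowledge concrete, here is a minimal sketch. It assumes that the distance between two knowledge items is the absolute difference of their weights and that the average is taken over all other items; this excerpt does not give the paper's actual definitions, so both choices, the function name `average_distances`, and the sample weights are hypothetical.

```python
from statistics import mean


def average_distances(weights):
    """For each weight, return its mean absolute difference to all others.

    Assumption: distance between two items is |w_i - w_j|; the paper's
    own distance measure is not specified in this excerpt.
    """
    n = len(weights)
    if n < 2:
        return [0.0] * n
    return [
        mean(abs(w - other) for j, other in enumerate(weights) if j != i)
        for i, w in enumerate(weights)
    ]


# Hypothetical knowledge weights, invented purely for illustration.
weights = [1.0, 2.0, 4.0]
print(average_distances(weights))  # prints [2.0, 1.5, 2.5]
```

Under these assumptions, an item with a small average distance lies close to the bulk of the knowledge-base, while a large value marks an item whose weight is far from the rest.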
Then, we select the relevant commonsense knowledge using statistical analysis of the normalized sentence weight and the average distance of knowledge, and select the proper nouns in it. We tested SenCept with a sample of 100 random domain-

I. INTRODUCTION

Shams et al. reported SenCept, a system that acquires commonsense concepts from domain-specific texts using a corpus [1]. They reported that the system, having a Common Concept Rate (CCR)...