Part of speech tagging on Amharic

A SIMPLE RULE-BASED PART OF SPEECH TAGGER

Eric Brill *
Department of Computer Science
University of Pennsylvania
Philadelphia, Pennsylvania 19104
brill@unagi.cis.upenn.edu

ABSTRACT
Automatic part of speech tagging is an area of natural language processing where statistical techniques have been more successful than rule-based methods. In this paper, we present a simple rule-based part of speech tagger which automatically acquires its rules and tags with accuracy comparable to stochastic taggers. The rule-based tagger has many advantages over these taggers, including: a vast reduction in stored information required, the perspicuity of a small set of meaningful rules, ease of finding and implementing improvements to the tagger, and better portability from one tag set, corpus genre or language to another. Perhaps the biggest contribution of this work is in demonstrating that the stochastic method is not the only viable method for part of speech tagging. The fact that a simple rule-based tagger that automatically learns its rules can perform so well should offer encouragement for researchers to further explore rule-based tagging, searching for a better and more expressive set of rule templates and other variations on the simple but effective theme described below.
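As an illustration only, the idea of a tagger that "automatically acquires its rules" can be sketched as follows. This excerpt does not spell out the procedure, so everything concrete here is an assumption: the baseline that gives each word its most frequent training tag, the single rule template ("change tag A to B when the previous tag is Z"), the greedy error-driven selection, the function names, and the toy corpus are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sents):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}


def baseline_tag(words, lexicon, default="NN"):
    """Tag each word with its most frequent tag (assumed default for unknowns)."""
    return [lexicon.get(w, default) for w in words]


def apply_rule(tags, rule):
    """rule = (from_tag, to_tag, prev_tag): retag from_tag as to_tag after prev_tag.

    Contexts are read from the tags as they were before this rule was applied.
    """
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def count_errors(pred, gold):
    return sum(p != g for p, g in zip(pred, gold))


def learn_rules(tagged_sents, lexicon, max_rules=10):
    """Greedily pick the rule that removes the most errors, then repeat."""
    gold = [[t for _, t in s] for s in tagged_sents]
    current = [baseline_tag([w for w, _ in s], lexicon) for s in tagged_sents]
    rules = []
    for _ in range(max_rules):
        # Propose one candidate rule per remaining error site.
        candidates = set()
        for cur, g in zip(current, gold):
            for i in range(1, len(cur)):
                if cur[i] != g[i]:
                    candidates.add((cur[i], g[i], cur[i - 1]))
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            gain = sum(
                count_errors(c, g) - count_errors(apply_rule(c, rule), g)
                for c, g in zip(current, gold)
            )
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no candidate reduces the error count
            break
        rules.append(best_rule)
        current = [apply_rule(c, best_rule) for c in current]
    return rules


def tag(words, lexicon, rules):
    """Baseline tags first, then each learned rule in the order it was learned."""
    tags = baseline_tag(words, lexicon)
    for rule in rules:
        tags = apply_rule(tags, rule)
    return tags


# Toy corpus (hypothetical): "can" is usually a modal, but a noun after "the".
train = [
    [("I", "PRP"), ("can", "MD"), ("run", "VB")],
    [("you", "PRP"), ("can", "MD"), ("go", "VB")],
    [("the", "DT"), ("can", "NN"), ("fell", "VBD")],
]
lexicon = train_baseline(train)
rules = learn_rules(train, lexicon)
print(rules)                               # [('MD', 'NN', 'DT')]
print(tag(["the", "can"], lexicon, rules))  # ['DT', 'NN']
```

In this sketch the whole learned model is a word-to-tag lexicon plus a short, readable rule list, which is the kind of compactness and perspicuity the abstract attributes to the rule-based approach.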
1. INTRODUCTION
There has been a dramatic increase in the application of probabilistic models to natural language processing over the last few years. The appeal of stochastic techniques over traditional rule-based techniques comes from the ease with which the necessary statistics can be automatically acquired and the fact that very little handcrafted knowledge need be built into the system. In contrast, the rules in rule-based systems are usually difficult to construct and are typically not very robust.
One area in which the statistical approach has done particularly well is automatic part of speech tagging, assigning each word



