Application of Data Mining in Medical Applications

  • Published : November 15, 2010
Application of Data mining in Medical Applications
by Arun George Eapen

A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Applied Science in Systems Design Engineering

Waterloo, Ontario, Canada, 2004 ©Arun George Eapen 2004

AUTHOR’S DECLARATION

I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners.

I understand that my thesis may be made electronically available to the public.

Abstract
Data mining is a relatively new field of research whose major objective is to acquire knowledge from large amounts of data. In medicine and health care, regulations and the availability of computers are making large amounts of data available. Practitioners are expected to use all of this data in their work, yet no human can process so much data quickly enough to make diagnoses, prognoses and treatment schedules. A major objective of this thesis is to evaluate data mining tools in medical and health care applications in order to develop a tool that can help make timely and accurate decisions. Two medical databases are considered: one for describing the various tools and the other as the case study. The first database is related to breast cancer and the second to the Minimum Data Set for Mental Health (MDS-MH). The breast cancer database consists of 10 attributes and the MDS-MH dataset of 455 attributes. Because many data mining algorithms and tools are available, we evaluate only a few of them on these applications and use them to develop classification rules for prediction. Our results indicate that for the major case study, the mental health problem, classification accuracies of over 70 to 80% are possible. A further extension of this work is to make the classification rules available on mobile devices such as PDAs. Patient information is entered directly on the PDA and classified according to the rules stored on the device, providing real-time assistance to practitioners.

Acknowledgment
My deepest gratitude and appreciation go to Professor Kumaraswamy Ponnambalam and Professor Jose Arocha for their guidance, patience, support and encouragement throughout my study at the University of Waterloo, which led to this thesis.

I would like to thank my thesis readers, Professor Bovas Abraham and Professor Hamid Tizhoosh, for reviewing my thesis and providing knowledgeable comments and suggestions.

My sincere appreciation goes to Professor Romy Shioda and Professor James Hirdes for their suggestions and helpful assistance during the experimental stages of this thesis. My thanks go to the department of Systems Design, and especially to Ms. Vicky Lawrence for her patience and help.

I would like to thank my parents, brother, sister and especially my aunt, Ms. Annama Abraham, for their undying prayers, love, encouragement and moral support. Thank you, Mom and Dad, for standing behind me and always encouraging me to take a step forward; you are the greatest people in the world. Last but not least, I want to thank all my friends and colleagues in India and in Waterloo, who stayed by me throughout this period, constantly encouraged me to work hard, and made my stay and work at the University of Waterloo very pleasurable.

Table of Contents
Chapter 1 Introduction
    1.1 Motivations
    1.2 Goals and Objectives
    1.3 Thesis Outline

Chapter 2 Background and Literature Review
    2.1 Machine Learning
        2.1.1 Knowledge Discovery in Databases [KDD] and Data Mining
        2.1.2 The KDD Process
...