Decision Tree

Pages: 5 (1211 words) Published: September 30, 2014
What did you do?
To gain deeper insight into creating decision trees on a laptop, I first looked into supporting tools. As I use an Apple MacBook, I found that the software “XMind” can help not only with drawing decision trees but also with developing flow charts, mind maps or to-do lists. I considered using Microsoft Excel to sort the data. In the end, however, I looked up the necessary information directly in the given table without any automatic sorting function, because that was easier for me than manually typing the data into MS Excel.

After installing the software and reading the task description, I realized that the tool is easy to use and very helpful for structuring information, as I will explain later in this write-up. I began the decision tree by entering the existing data. A first look at the data shows that the first and last name have no influence on whether a loan is granted or on the loan amount, which is self-evident. It makes sense to start with the node that has the highest number of different characteristics, as this keeps the tree clearer. That is why I started with the distinction by age, followed in order by the loan type, the ability to pay and finally the past payment record. The loan amount, which already includes the information whether a loan was granted (loan amount > $0) or not (loan amount = $0), was placed at the end of each branch of the tree. This results in a total of 72 paths leading to a loan amount, one for each combination of characteristics of the four criteria.
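The figure of 72 paths follows from multiplying the number of characteristics of each criterion. The following is a minimal sketch in Python under stated assumptions: three age groups, four loan types, two levels of ability to pay and three payment records. Only the names mentioned in this write-up come from the data. The split into four loan types and two ability levels is an assumption chosen to match the total of 72, and the loan type labels other than "frivolous" are placeholders.

```python
from itertools import product

# Criteria in the order used for the tree (age first, payment record last).
criteria = {
    "age": ["0 to 29", "30 to 55", "56 and older"],
    # Only "frivolous" is named in the write-up; the other labels are placeholders.
    "loan_type": ["frivolous", "type B", "type C", "type D"],
    "ability_to_pay": ["good", "bad"],
    "payment_record": ["good", "slow", "poor"],
}

# Every root-to-leaf path is one combination of characteristics.
paths = list(product(*criteria.values()))
print(len(paths))  # 3 * 4 * 2 * 3 = 72
```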

However, I quickly found that the data set does not cover all 72 possible combinations of the criteria. I therefore used reasoned arguments to fill in plausible values, which are described in the next section of this write-up. These supplemented values can be recognized by the red font color of the loan amount.
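Finding the combinations without a record can be sketched as a set difference between all paths and the paths present in the data. The sample below holds only the three records that this write-up names explicitly, so the resulting count is purely illustrative and does not reflect the full table. The level lists are assumptions, and the loan type labels other than "frivolous" are placeholders.

```python
from itertools import product

# Assumed levels per criterion; loan types other than "frivolous" are placeholders.
AGES = ["0 to 29", "30 to 55", "56 and older"]
LOAN_TYPES = ["frivolous", "type B", "type C", "type D"]
ABILITIES = ["good", "bad"]
RECORDS = ["good", "slow", "poor"]

# Only the records named in this write-up (path -> loan amount in $).
known = {
    ("0 to 29", "frivolous", "good", "good"): 0,
    ("30 to 55", "frivolous", "good", "slow"): 7500,
    ("30 to 55", "frivolous", "good", "poor"): 7500,
}

all_paths = set(product(AGES, LOAN_TYPES, ABILITIES, RECORDS))
missing = sorted(all_paths - set(known))  # paths that need a supplemented value
print(len(missing), "of", len(all_paths), "paths have no record in this sample")
```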

What were the results?
As the attached parts of the decision tree show, it is not easy to get an overview of all 72 branches at once, so I divided the tree into several parts. What stands out at first glance is that the youngest age group, from 0 to 29 years, gets the lowest loans of all records. This might be because they do not have as much work experience or as stable a life as older customers. The frivolous loan type stands out in particular, as this age group has no chance of getting any loan for that purpose. However, the only information given for that area was a loan amount of $0 for a good ability to pay and a good payment record. As this is the best characteristic for both criteria, I assumed that all worse combinations would also get $0.
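The assumption used for the frivolous loans of the youngest group can be written as a simple rule: if the best combination of ability to pay and payment record gets $0, every worse combination in the same branch gets $0 too. This is a sketch of that rule, not of the XMind tree itself. The function name and the level lists are hypothetical.

```python
from itertools import product

ABILITIES = ["good", "bad"]          # best level first
RECORDS = ["good", "slow", "poor"]   # best level first; slow vs. poor order is assumed

def fill_zero_branch(known, age, loan_type):
    """Copy `known` and fill a branch with 0 if its best combination gets 0."""
    filled = dict(known)
    best = (age, loan_type, ABILITIES[0], RECORDS[0])
    if known.get(best) == 0:
        for ability, record in product(ABILITIES, RECORDS):
            # setdefault keeps any value that is already given in the data
            filled.setdefault((age, loan_type, ability, record), 0)
    return filled

# The single record given for this branch in the write-up.
known = {("0 to 29", "frivolous", "good", "good"): 0}
filled = fill_zero_branch(known, "0 to 29", "frivolous")
print(len(filled), "leaves in the branch, all set to", set(filled.values()))
```

The rule only fires when the best case is $0, so a branch whose best case gets a positive amount is left untouched.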

In the age group of “30 to 55 years” you can immediately see that the average loan amount is considerably higher. However, the ability to pay also plays a very important role in the amount that can be borrowed. With a bad ability to pay it is not possible to get more than $5,000, and even that is only possible with a good past payment record. With the other two characteristics of that criterion, poor and slow, there is no chance of getting a loan with a bad ability to pay. I also added loan amounts in areas where they were missing. I will explain just one case: the frivolous loan type with a good ability to pay and a good past payment record. As the other two characteristics of the past payment record already get $7,500, I assumed that a good one should get more, so I entered $10,000.
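The reasoning behind the supplemented $10,000 is that a better payment record should never lead to a smaller loan. A small check of that ordering could look as follows. The function name is hypothetical and the ranking of "poor" below "slow" is an assumption. The amounts are those of the branch described above.

```python
# Payment records from worst to best; the order of "poor" and "slow" is assumed.
RECORD_ORDER = ["poor", "slow", "good"]

def is_monotone(amounts):
    """Return True if a better payment record never gets a smaller loan."""
    ordered = [amounts[record] for record in RECORD_ORDER]
    return all(worse <= better for worse, better in zip(ordered, ordered[1:]))

# 30 to 55 years, frivolous loan, good ability to pay.
# The value for "good" is the supplemented one; the other two are from the write-up.
branch = {"poor": 7500, "slow": 7500, "good": 10000}
print(is_monotone(branch))
```

The check confirms only that the supplemented value is consistent with its neighbours. It does not determine the value itself: any amount of at least $7,500 would pass.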

Finally, I was puzzled by the data given for the age group of 56 years and older, as this group gets less money than the 30 to 55 group. This might be the reason, as a person...