# The Benefits and Drawbacks of a Binary Tree Versus a Bushier Tree

Topics: Tree, Decision tree, Trees Pages: 13 (2005 words) Published: March 22, 2013
Homework 3
4. Discuss the benefits and drawbacks of a binary tree versus a bushier tree.

A binary tree has a simpler structure than a bushier tree: each parent node has only two children, which saves storage space. On the other hand, a binary tree may grow deeper than a bushier tree, and the way it partitions the records may be less refined.

5. Construct a classification and regression tree to classify salary based on the other variables. Do as much as you can by hand, before turning to the software.

Data:

| No. | Occupation | Gender | Age | Salary | Level |
|---|---|---|---|---|---|
| 1 | Service | Female | 45 | \$48,000 | Level 3 |
| 2 | Service | Male | 25 | \$25,000 | Level 1 |
| 3 | Service | Male | 33 | \$35,000 | Level 2 |
| 4 | Management | Male | 25 | \$45,000 | Level 3 |
| 5 | Management | Female | 35 | \$65,000 | Level 4 |
| 6 | Management | Male | 26 | \$45,000 | Level 3 |
| 7 | Management | Female | 45 | \$70,000 | Level 4 |
| 8 | Sales | Female | 40 | \$50,000 | Level 3 |
| 9 | Sales | Male | 30 | \$40,000 | Level 2 |
| 10 | Staff | Female | 50 | \$40,000 | Level 2 |
| 11 | Staff | Male | 25 | \$25,000 | Level 1 |

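Candidate Splits for t = Root Node

| Candidate Split | Left Child Node, $t_L$ | Right Child Node, $t_R$ |
|---|---|---|
| 1 | Occupation = Service | Occupation = {Management, Sales, Staff} |
| 2 | Occupation = Management | Occupation = {Service, Sales, Staff} |
| 3 | Occupation = Sales | Occupation = {Service, Management, Staff} |
| 4 | Occupation = Staff | Occupation = {Service, Management, Sales} |
| 5 | Gender = Female | Gender = Male |
| 6 | Age ≤ 25 | Age > 25 |
| 7 | Age ≤ 26 | Age > 26 |
| 8 | Age ≤ 30 | Age > 30 |
| 9 | Age ≤ 33 | Age > 33 |
| 10 | Age ≤ 35 | Age > 35 |
| 11 | Age ≤ 40 | Age > 40 |
| 12 | Age ≤ 45 | Age > 45 |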
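For reference, the optimality measure used in the tables of this solution appears to be the usual CART goodness-of-split measure. Assuming that textbook definition, and writing it in the notation of the table headers, it reads:

$$
\Phi(s \mid t) = 2\,P_L\,P_R \sum_{j=1}^{4} \bigl|\, P(L = j \mid t_L) - P(L = j \mid t_R) \,\bigr|
$$

Here $P_L$ and $P_R$ are the fractions of the records at node $t$ sent to the left and right child by split $s$, and $P(L = j \mid t_L)$ is the fraction of records in the left child that belong to Level $j$ (likewise for $t_R$).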

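Values of the Components of the Optimality Measure $\Phi(s \mid t)$ for each candidate split, for the Root Node

| Split | $P_L$ | $P_R$ | $P(L=1 \mid t_L)$ | $P(L=2 \mid t_L)$ | $P(L=3 \mid t_L)$ | $P(L=4 \mid t_L)$ | $P(L=1 \mid t_R)$ | $P(L=2 \mid t_R)$ | $P(L=3 \mid t_R)$ | $P(L=4 \mid t_R)$ | $2 P_L P_R$ | $\Phi(s \mid t)$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.27 | 0.73 | 0.33 | 0.33 | 0.33 | 0.00 | 0.13 | 0.25 | 0.38 | 0.25 | 0.40 | 0.23 |
| 2 | 0.36 | 0.64 | 0.00 | 0.00 | 0.50 | 0.50 | 0.29 | 0.43 | 0.29 | 0.00 | 0.46 | 0.66 |
| 3 | 0.18 | 0.82 | 0.00 | 0.50 | 0.50 | 0.00 | 0.22 | 0.22 | 0.33 | 0.22 | 0.30 | 0.26 |
| 4 | 0.18 | 0.82 | 0.50 | 0.50 | 0.00 | 0.00 | 0.11 | 0.22 | 0.44 | 0.22 | 0.30 | 0.40 |
| 5 | 0.45 | 0.55 | 0.00 | 0.20 | 0.40 | 0.40 | 0.33 | 0.33 | 0.33 | 0.00 | 0.50 | 0.46 |
| 6 | 0.27 | 0.73 | 0.67 | 0.00 | 0.33 | 0.00 | 0.00 | 0.38 | 0.38 | 0.25 | 0.40 | 0.53 |
| 7 | 0.36 | 0.64 | 0.50 | 0.00 | 0.50 | 0.00 | 0.00 | 0.43 | 0.29 | 0.29 | 0.46 | 0.66 |
| 8 | 0.45 | 0.55 | 0.40 | 0.20 | 0.40 | 0.00 | 0.00 | 0.33 | 0.33 | 0.33 | 0.50 | 0.46 |
| 9 | 0.55 | 0.45 | 0.33 | 0.33 | 0.33 | 0.00 | 0.00 | 0.20 | 0.40 | 0.40 | 0.50 | 0.46 |
| 10 | 0.64 | 0.36 | 0.29 | 0.29 | 0.29 | 0.14 | 0.00 | 0.25 | 0.50 | 0.25 | 0.46 | 0.30 |
| 11 | 0.73 | 0.27 | 0.25 | 0.25 | 0.38 | 0.13 | 0.00 | 0.33 | 0.33 | 0.33 | 0.40 | 0.23 |
| 12 | 0.91 | 0.09 | 0.20 | 0.20 | 0.40 | 0.20 | 0.00 | 1.00 | 0.00 | 0.00 | 0.17 | 0.26 |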
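The root-node values can be cross-checked with a short script. The following is only a minimal sketch: it assumes the standard CART optimality measure $\Phi(s \mid t) = 2 P_L P_R \sum_j |P(j \mid t_L) - P(j \mid t_R)|$, and it assumes the occupations are grouped as Service (records 1 to 3), Management (4 to 7), Sales (8 to 9), and Staff (10 to 11), which is consistent with the $P_L$ values reported for splits 1 to 4. The names `RECORDS`, `phi`, `candidate_splits`, and `ROOT_PHI` are illustrative, not part of the original solution.

```python
from collections import Counter

# (occupation, gender, age, level) for records 1..11.
# Assumption: the occupation grouping below is inferred from the
# P_L values of splits 1-4 (3, 4, 2, and 2 records out of 11).
RECORDS = [
    ("Service", "Female", 45, 3),
    ("Service", "Male", 25, 1),
    ("Service", "Male", 33, 2),
    ("Management", "Male", 25, 3),
    ("Management", "Female", 35, 4),
    ("Management", "Male", 26, 3),
    ("Management", "Female", 45, 4),
    ("Sales", "Female", 40, 3),
    ("Sales", "Male", 30, 2),
    ("Staff", "Female", 50, 2),
    ("Staff", "Male", 25, 1),
]
LEVELS = (1, 2, 3, 4)


def phi(records, goes_left):
    """Return 2 * P_L * P_R * sum_j |P(j|t_L) - P(j|t_R)| for one split."""
    left = [r for r in records if goes_left(r)]
    right = [r for r in records if not goes_left(r)]
    if not left or not right:
        return 0.0  # a split that leaves one child empty separates nothing
    p_left = len(left) / len(records)
    p_right = len(right) / len(records)
    left_counts = Counter(r[3] for r in left)
    right_counts = Counter(r[3] for r in right)
    q = sum(
        abs(left_counts[j] / len(left) - right_counts[j] / len(right))
        for j in LEVELS
    )
    return 2 * p_left * p_right * q


def candidate_splits():
    """Build the 12 root-node candidate splits as (label, predicate) pairs."""
    splits = []
    for occ in ("Service", "Management", "Sales", "Staff"):
        splits.append((f"Occupation = {occ}", lambda r, occ=occ: r[0] == occ))
    splits.append(("Gender = Female", lambda r: r[1] == "Female"))
    for age in (25, 26, 30, 33, 35, 40, 45):
        splits.append((f"Age <= {age}", lambda r, age=age: r[2] <= age))
    return splits


# Phi(s|t) at the root node, keyed by the left-child condition.
ROOT_PHI = {label: phi(RECORDS, pred) for label, pred in candidate_splits()}

if __name__ == "__main__":
    for number, (label, value) in enumerate(ROOT_PHI.items(), start=1):
        print(f"{number:2d}  {label:<24} {value:.2f}")
```

Under these assumptions the script gives 0.66 for Occupation = Management, matching the table. Note that Age ≤ 26 (split 7) reaches the same value in the table, so the choice between the two is a tie-break.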

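The optimality measure is maximized at 0.66 when Occupation = Management (left branch) and Occupation = {Service, Sales, Staff} (right branch). After the first split, the left child has records 4, 5, 6, 7 and the right child has records 1, 2, 3, 8, 9, 10, 11.

Now we split the left child (decision node A), which has records 4, 5, 6, 7.

Candidate Splits for Decision Node A

| Candidate Split | Left Child Node, $t_L$ | Right Child Node, $t_R$ |
|---|---|---|
| 5 | Gender = Male | Gender = Female |
| 6 | Age ≤ 25 | Age > 25 |
| 7 | Age ≤ 26 | Age > 26 |
| 10 | Age ≤ 35 | Age > 35 |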

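Values of the Components of the Optimality Measure $\Phi(s \mid t)$ for each candidate split, for decision node A

| Split | $P_L$ | $P_R$ | $P(L=1 \mid t_L)$ | $P(L=2 \mid t_L)$ | $P(L=3 \mid t_L)$ | $P(L=4 \mid t_L)$ | $P(L=1 \mid t_R)$ | $P(L=2 \mid t_R)$ | $P(L=3 \mid t_R)$ | $P(L=4 \mid t_R)$ | $2 P_L P_R$ | $\Phi(s \mid t)$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 0.50 | 0.50 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.50 | 1.00 |
| 6 | 0.25 | 0.75 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.33 | 0.67 | 0.38 | 0.50 |
| 7 | 0.50 | 0.50 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.50 | 1.00 |
| 10 | 0.75 | 0.25 | 0.00 | 0.00 | 0.67 | 0.33 | 0.00 | 0.00 | 0.00 | 1.00 | 0.38 | 0.50 |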
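As a worked check of the gender split at decision node A, assume the usual CART definition $\Phi(s \mid t) = 2 P_L P_R \sum_j |P(L=j \mid t_L) - P(L=j \mid t_R)|$. The male records 4 and 6 are both Level 3 and the female records 5 and 7 are both Level 4, so each child holds half of the node's records and the two class distributions do not overlap:

$$
\Phi(s \mid t) = 2 \cdot 0.50 \cdot 0.50 \cdot \bigl( |0 - 0| + |0 - 0| + |1 - 0| + |0 - 1| \bigr) = 0.50 \cdot 2 = 1.00
$$

Since $2 P_L P_R$ cannot exceed 0.50 and the sum of absolute differences cannot exceed 2, a value of 1.00 is the largest this measure can take.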

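The optimality measure is maximized at 1.00 when Gender = Male (left branch) and Gender = Female (right branch). After this split, both the left branch and the right branch terminate in pure leaf nodes: the left child has records 4 and 6, both "Level 3", and the right child has records 5 and 7, both "Level 4".

Now we split the right child of the root node (decision node B), which has records 1, 2, 3, 8, 9, 10, 11.

Candidate Splits for Decision Node B

| Candidate Split | Left Child Node, $t_L$ | Right Child Node, $t_R$ |
|---|---|---|
| 1 | Occupation = Service | Occupation = {Sales, Staff} |
| 3 | Occupation = Sales | Occupation = {Service, Staff} |
| 4 | Occupation = Staff | Occupation = {Service, Sales} |
| 5 | Gender = Female | Gender = Male |
| 6 | Age ≤ 25 | Age > 25 |
| 8 | Age ≤ 30 | Age > 30 |
| 9 | Age ≤ 33 | Age > 33 |
| 11 | Age ≤ 40 | Age > 40 |
| 12 | Age ≤ 45 | Age > 45 |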

Values of the Components of the Optimality Measure $\Phi(s \mid t)$ for each candidate split, for decision node B

1 3 4 5 6 8 9...
