Decision Trees Algorithm

The Decision Trees algorithm creates a hierarchical structure of classification rules of the form “If ... Then ...”, which looks like a tree. To decide which class to assign to an object or situation, we answer the questions attached to the nodes of the tree, starting from the root. The questions look like this: “Is the value of parameter A greater than x?”. If the answer is positive, we move to the right branch; if it is negative, we move to the left branch. A question related to the new node then follows, until a leaf is reached.

The table below, “Input for Decision Trees Algorithm”, shows a set of training data that could be used to predict credit risk. In this example, fictionalized information about customers was generated, including their debt level, income level, type of employment, and whether they represented a good or bad credit risk.
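The question-by-question descent described above can be sketched in a few lines of Python. This is only an illustration, not the BI2M implementation: the Node class, the classify function, the feature names debt_ratio and income, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a binary decision tree (hypothetical structure)."""
    feature: Optional[str] = None      # parameter tested at this node
    threshold: Optional[float] = None  # value the parameter is compared with
    left: Optional["Node"] = None      # branch taken when the answer is "no"
    right: Optional["Node"] = None     # branch taken when the answer is "yes"
    label: Optional[str] = None        # class assigned if this node is a leaf


def classify(node: Node, sample: dict) -> str:
    """Walk from the root to a leaf, answering one question per node."""
    while node.label is None:
        # "Is the value of the parameter greater than the threshold?"
        answer = sample[node.feature] > node.threshold
        node = node.right if answer else node.left
    return node.label


# Hypothetical tree: feature names and thresholds are invented for illustration.
tree = Node(
    feature="debt_ratio", threshold=0.5,
    left=Node(
        feature="income", threshold=30000,
        left=Node(label="bad"),
        right=Node(label="good"),
    ),
    right=Node(label="bad"),
)

print(classify(tree, {"debt_ratio": 0.2, "income": 45000}))
```

For the sample shown, the first answer is negative (move left) and the second is positive (move right), so the customer ends up in the leaf labeled "good".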

Input for Decision Trees Algorithm

Result of Decision Trees Algorithm

In this example, the Decision Trees algorithm might determine that the most significant attribute for predicting credit risk is debt level, so the first split in the tree is made on debt level. One of the two new nodes (Debt = High) is a leaf node containing three bad credit cases and no good credit cases; in this data set, a high debt level is a perfect predictor of bad credit risk. The other node (Debt = Low) is still mixed, containing three good credit cases and one bad credit case.

The algorithm then chooses employment type as the next most significant predictor of credit risk. The split on employment type produces two nodes and indicates that self-employed people have a higher probability of bad credit. One of these nodes is a leaf node; the other is split again into two leaf nodes, which show that people with a low income level are less likely to repay their credit.

Built from real data, such a model can be used to determine the credit risk of a customer. Starting at the root of the tree, you answer the question attached to the current node and follow the matching branch until you reach the leaf node to which the customer belongs.

This is, of course, a small example based on synthetic data, but it illustrates how a decision tree can use known attributes of credit applicants to predict credit risk. In real life, each credit applicant would possess far more attributes, and the number of applicants would be much larger. As the scale of the problem grows, it becomes difficult for a person to extract the rules for identifying good and bad credit risks manually. A classification algorithm can consider hundreds of attributes and millions of records before producing the decision tree that describes the rules for credit risk prediction.

B&M services has chosen the Decision Trees algorithm for the BI2M application because it is one of the most popular techniques for predictive modeling.
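How an algorithm might decide that debt level is the "most significant" attribute can be illustrated with information gain, the entropy-based criterion used by ID3-style decision tree learners. The text does not state which criterion BI2M uses, so this choice is an assumption. The seven records below are also hypothetical: they were invented to match the node counts described above (three bad cases under Debt = High; three good and one bad under Debt = Low), not copied from the original table.

```python
from collections import Counter
from math import log2

# Hypothetical training records, invented to match the counts in the text.
ROWS = [
    {"debt": "Low",  "income": "Low",  "employment": "Salaried",      "risk": "Good"},
    {"debt": "Low",  "income": "Low",  "employment": "Salaried",      "risk": "Good"},
    {"debt": "Low",  "income": "High", "employment": "Self-Employed", "risk": "Good"},
    {"debt": "Low",  "income": "Low",  "employment": "Self-Employed", "risk": "Bad"},
    {"debt": "High", "income": "High", "employment": "Salaried",      "risk": "Bad"},
    {"debt": "High", "income": "Low",  "employment": "Self-Employed", "risk": "Bad"},
    {"debt": "High", "income": "High", "employment": "Salaried",      "risk": "Bad"},
]


def entropy(rows):
    """Shannon entropy (in bits) of the risk labels in rows."""
    counts = Counter(r["risk"] for r in rows)
    total = len(rows)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(rows, attribute):
    """Entropy reduction obtained by splitting rows on attribute."""
    total = len(rows)
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r for r in rows if r[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(rows) - remainder


def best_attribute(rows, attributes):
    """Attribute with the largest information gain on rows."""
    return max(attributes, key=lambda a: information_gain(rows, a))


# First split: compare all three attributes on the full data set.
first = best_attribute(ROWS, ["debt", "income", "employment"])

# Second split: only the mixed Debt = Low node needs further splitting.
low_debt = [r for r in ROWS if r["debt"] == "Low"]
second = best_attribute(low_debt, ["income", "employment"])

print(first, second)
```

On this hypothetical data the sketch selects debt for the first split and employment for the Debt = Low node, mirroring the tree described above. Other criteria, such as Gini impurity, could lead to the same or a different ordering on real data.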

See examples of using the Decision Trees module of BI2M.


