Week 3 - Decision Trees

Jul 04, 2016 ../ml-classification

Intuition behind decision trees

Predicting loan defaults with decision trees

Example of bank, looks at a bunch of factors when deciding whether to loan mony:
Credit history
Loan term
Income
Can be represented as a "decision tree".

Intuition behind decision trees

Given input Xi = (credit = poor, income = high, term = 5 years), we traverse the decision tree T(Xi) to get the Y-hat i output.

Task of learning decision trees from data

Goal to learn decision tree that makes good predictions given input.
Quality metric for algorithm: classification error:
Number of bad predictions / total predictions
- 1 = very bad.
- 0 = perfect.
Learning the trees "perfectly" = NP-hard problem; there are exponentially large numbers of possible trees.
Need to approximate the tree; few simple algorithms that do that.
Course will cover: simple (greedy) algorithm.

Learning decision trees

Recursive greed algorithm (high-level outline)

High-level algorithm:
Start with empty tree (loans labelled as safe or risky)
Then, split the tree on some feature (aka credit):
- Excellent
- Fair
- Poor
- For each split, if all examples are excellent, then we can predict safe.
- Go to step 2 and continue spliting.
Need to know when to stop recursing: covered in rest of module.

Learning a decision stump

Roughly: after splitting, pick the majority case as risky or safe (y-hat).

Selecting best feature to split on

Better split = one that gives lowest classification error.
Need to compare: splitting on nothing vs splitting on other features and compare classification error.

When to stop recursing

Criteria:
If all data in the split has just one class.
Used up all the features in your dataset.

Using the learned decision tree

Making predicitions with decision trees

Once you have the tree, you just need to travese it for some input:

func predict(tree_node, input): if tree_node.is_leaf: return tree_node.majority_class else: next_node = tree_node.get_next_node(input) # Return input based on input data return predict(next_node, input)

Multiclass classification with decision trees

Majority class can still be used to predict result at leaf.
Probability can be predicting by finding fraction for one class:

``` safe = 3 risky = 10 danger = 18

danger_prob = 18 / (3 + 10 + 18) = 0.58 ```

Threshold splits for continuous inputs

With continuous inputs (eg income levels or age) you can't split on indiviual values of you'll get crazy overfitting (eg income at 30k is safe, but 34k is not etc), need to thresholding to come up with discreet "buckets" for your inputs.
Income buckets:

* < 30k * >= 30k & < 100k * > 100k

Picking the best threshold to split

Roughly: choose the set of splits with the lowest classification error.

Tags

Notes by Lex Toumbourou

Week 3 - Decision Trees

Intuition behind decision trees

Predicting loan defaults with decision trees

Intuition behind decision trees

Task of learning decision trees from data

Learning decision trees

Recursive greed algorithm (high-level outline)

Learning a decision stump

Selecting best feature to split on

When to stop recursing

Using the learned decision tree

Making predicitions with decision trees

Multiclass classification with decision trees

Threshold splits for continuous inputs

Picking the best threshold to split