Example: Weather Dataset
Dataset features: Weather (Sunny, Rainy, Overcast); Labels: Play or Don't Play
- Step 1: Calculate entropy of the root node.
- Step 2: Split on "Weather" and calculate entropy for each subset.
- Step 3: Compute information gain for the split.
- Step 4: Choose the best split and continue recursively.
Outcome: Demonstrate the best split based on calculated information gain.
Example: Weather Dataset
We use a small dataset to demonstrate decision tree construction.
| Weather  | Play |
|----------|------|
| Sunny    | No   |
| Sunny    | No   |
| Overcast | Yes  |
| Rainy    | Yes  |
| Rainy    | Yes  |
| Rainy    | No   |
| Overcast | Yes  |
| Sunny    | No   |
| Sunny    | Yes  |
| Rainy    | Yes  |
| Sunny    | Yes  |
| Overcast | Yes  |
| Overcast | Yes  |
| Rainy    | No   |
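For the worked calculations that follow, the table can also be written as a small Python list of (weather, play) pairs; the variable name `dataset` is just an illustrative choice, not part of the original example.

```python
# The 14 (Weather, Play) examples from the table above.
dataset = [
    ("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"), ("Rainy", "Yes"),
    ("Rainy", "Yes"), ("Rainy", "No"), ("Overcast", "Yes"), ("Sunny", "No"),
    ("Sunny", "Yes"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Overcast", "Yes"), ("Rainy", "No"),
]
```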
Step 1: Root Node Entropy
Class distribution at the root node: 9 Yes and 5 No out of 14 examples.
Entropy formula:
H = -\sum_{i=1}^{c} p_i \log_2(p_i)
Root node entropy:
p_{\text{Yes}} = \frac{9}{14}, \quad p_{\text{No}} = \frac{5}{14}
H_{\text{root}} = -\left(\frac{9}{14}\log_2\frac{9}{14} + \frac{5}{14}\log_2\frac{5}{14}\right)
H_{\text{root}} \approx 0.940
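A minimal sketch of the entropy calculation, assuming the `dataset` list shown above and using only the standard library:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy: H = -sum(p_i * log2(p_i)) over the class labels."""
    total = len(labels)
    counts = Counter(labels)
    return sum(-(n / total) * log2(n / total) for n in counts.values())

root_labels = [play for _, play in dataset]
print(round(entropy(root_labels), 3))  # 0.94
```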
Step 2: Entropy for Subsets
Subset: Sunny
| Weather | Play |
|---------|------|
| Sunny   | No   |
| Sunny   | No   |
| Sunny   | No   |
| Sunny   | Yes  |
| Sunny   | Yes  |
Entropy:
p_{\text{Yes}} = \frac{2}{5}, \quad p_{\text{No}} = \frac{3}{5}
H_{\text{Sunny}} = -\left(\frac{3}{5}\log_2\frac{3}{5} + \frac{2}{5}\log_2\frac{2}{5}\right) \approx 0.971
Subset: Overcast
| Weather  | Play |
|----------|------|
| Overcast | Yes  |
| Overcast | Yes  |
| Overcast | Yes  |
| Overcast | Yes  |
Entropy:
p_{\text{Yes}} = 1, \quad p_{\text{No}} = 0
H_{\text{Overcast}} = 0
Subset: Rainy
| Weather | Play |
|---------|------|
| Rainy   | Yes  |
| Rainy   | Yes  |
| Rainy   | No   |
| Rainy   | Yes  |
| Rainy   | No   |
Entropy:
p_{\text{Yes}} = \frac{3}{5}, \quad p_{\text{No}} = \frac{2}{5}
H_{\text{Rainy}} = -\left(\frac{3}{5}\log_2\frac{3}{5} + \frac{2}{5}\log_2\frac{2}{5}\right) \approx 0.971
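The three subset entropies can be reproduced with the same hypothetical `entropy` helper and `dataset` list from the earlier snippets:

```python
# Entropy of the Play labels within each Weather subset.
for value in ("Sunny", "Overcast", "Rainy"):
    subset_labels = [play for weather, play in dataset if weather == value]
    print(value, round(entropy(subset_labels), 3))
# Sunny 0.971, Overcast 0.0 (pure subset), Rainy 0.971
```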
Step 3: Information Gain
Information Gain formula:
IG = H_{\text{root}} - \sum_{k=1}^{n} \frac{N_k}{N} H_k
For "Weather":
IG_{\text{Weather}} = 0.940 - \left(\frac{5}{14}\cdot 0.971 + \frac{4}{14}\cdot 0 + \frac{5}{14}\cdot 0.971\right)
IG_{\text{Weather}} = 0.940 - 0.694 \approx 0.246
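A sketch of the weighted-average computation, again assuming the `entropy` helper and `dataset` list defined above:

```python
def information_gain(rows, feature_values):
    """IG = H(parent) - sum_k (N_k / N) * H_k, splitting on the Weather column."""
    parent_labels = [play for _, play in rows]
    n = len(rows)
    weighted = 0.0
    for value in feature_values:
        subset = [play for weather, play in rows if weather == value]
        weighted += (len(subset) / n) * entropy(subset)
    return entropy(parent_labels) - weighted

print(round(information_gain(dataset, ("Sunny", "Overcast", "Rainy")), 3))
# 0.247 here; the 0.246 above comes from rounding 0.940 and 0.694 before subtracting
```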
Step 4: Choosing the Best Split
Information gain for "Weather" is 0.246.
"Weather" is chosen as the root split.
Next Steps:
- Repeat the process recursively for each subset (Sunny, Overcast, Rainy).
- Stop when a stopping criterion is met (e.g., pure subset).
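To make the recursion concrete, here is a very small ID3-style sketch. It assumes the single-feature `dataset` format used above, stops on pure subsets, and falls back to a majority vote when no further split is possible (an assumption, since only the pure-subset criterion is mentioned here):

```python
from collections import Counter

def build_tree(rows):
    labels = [play for _, play in rows]
    # Stopping criterion from the example: pure subset -> leaf node.
    if len(set(labels)) == 1:
        return labels[0]
    values = {weather for weather, _ in rows}
    # Assumption: if no further split is possible, return the majority label.
    if len(values) == 1:
        return Counter(labels).most_common(1)[0][0]
    # Only one feature ("Weather") exists in this toy dataset, so split on it.
    return {value: build_tree([r for r in rows if r[0] == value]) for value in values}

print(build_tree(dataset))  # e.g. {'Overcast': 'Yes', 'Sunny': 'No', 'Rainy': 'Yes'}
```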
Conclusion
This example demonstrates:
- Calculating entropy for each subset.
- Computing information gain.
- Choosing the best split based on information gain.