Decision Tree - intro

🌳 A Decision Tree is a map of the possible outcomes of a series of related choices. As the name suggests, it uses a tree-like model of decisions. Decision trees can be used either to drive informal discussion or to map out an algorithm that predicts the best choice mathematically.


A decision tree typically starts with a single node, which branches into possible outcomes. Each of those outcomes leads to additional nodes, which branch off into other possibilities, giving the tree-like shape. A decision tree has the following constituents:

  1. Root node: the starting attribute, considered the root of the case.
  2. Decision node: a node with one incoming edge and two or more outgoing edges.
  3. Leaf node: a terminal node with no outgoing edges.
  4. Branches: the connections between the nodes, represented as arrows. Each branch represents a response, such as yes or no.

                                                

Classification rules: the root-to-leaf paths of the tree; each path covers one scenario under consideration and assigns a class variable to it.
Class variable: each leaf node assigns a class variable. A class variable is the final output, which leads to our final decision.
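These ideas can be sketched in Python (the weather tree below is a hypothetical illustration): the tree is stored as nested dictionaries, inner dictionaries are decision nodes keyed by the attribute they test, leaves hold the class variable, and every root-to-leaf path is one classification rule.

```python
# A tiny decision tree as nested dicts: inner dicts are decision nodes
# keyed by one attribute; plain strings are leaf nodes (class variables).
tree = {
    "outlook": {
        "sunny":    {"humidity": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rain":     {"windy": {"true": "no", "false": "yes"}},
    }
}

def rules(node, path=()):
    """Enumerate every root-to-leaf path as a classification rule."""
    if not isinstance(node, dict):            # leaf node: class variable
        yield path, node
        return
    (attr, branches), = node.items()          # decision node tests one attribute
    for value, child in branches.items():     # one branch per response
        yield from rules(child, path + ((attr, value),))

for conditions, label in rules(tree):
    cond_text = " AND ".join(f"{a} = {v}" for a, v in conditions)
    print(f"IF {cond_text} THEN play = {label}")
```

Each printed line is one classification rule, so together the rules cover every scenario the tree can distinguish.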

Let's consider a scenario: will players play or not, based on weather conditions such as sunny, overcast, and rain? The diagram below shows the decision tree classification for playing the game.

Once the decision tree is constructed, we start from the root node, check its test condition, and assign control to one of the outgoing edges; the condition at the next node is then tested in the same way. The classification is complete when the test conditions lead to a leaf node. The leaf node contains the class label, which votes in favor of or against the decision.
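The traversal described above can be sketched as follows (the tree shape and attribute names are hypothetical): starting at the root, the instance's value for the tested attribute selects an outgoing edge, and this repeats until a leaf node's class label is reached.

```python
# A hypothetical play/no-play tree: inner dicts are decision nodes,
# strings are leaf nodes carrying the class label.
tree = {
    "outlook": {
        "sunny":    {"humidity": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rain":     {"windy": {"true": "no", "false": "yes"}},
    }
}

def classify(node, instance):
    """Walk one instance from the root to a leaf, following test conditions."""
    while isinstance(node, dict):         # still at a decision node
        (attr, branches), = node.items()  # attribute tested at this node
        node = branches[instance[attr]]   # follow the matching outgoing edge
    return node                           # leaf node: the class label

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # → yes
print(classify(tree, {"outlook": "rain", "windy": "true"}))        # → no
```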

Now you might wonder why we started with the "Play" attribute at the root. If you choose any other attribute, the decision tree constructed will be different. For a particular set of attributes, numerous different trees can be created, so we need to choose the optimal tree, which can be done by implementing the CART and ID3 algorithms.
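The core step those algorithms share is attribute selection. A minimal sketch of the measure ID3 uses (the toy data below is hypothetical): the attribute with the highest information gain, i.e. the largest reduction in entropy, is placed at the root.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, labels):
    """Entropy reduction from splitting the data on one attribute."""
    n = len(labels)
    split = {}
    for row, label in zip(rows, labels):
        split.setdefault(row[attr], []).append(label)
    remainder = sum(len(part) / n * entropy(part) for part in split.values())
    return entropy(labels) - remainder

# Toy play/no-play data keyed by weather attributes (hypothetical values).
rows = [
    {"outlook": "sunny",    "windy": "false"},
    {"outlook": "sunny",    "windy": "true"},
    {"outlook": "overcast", "windy": "false"},
    {"outlook": "rain",     "windy": "false"},
    {"outlook": "rain",     "windy": "true"},
]
labels = ["no", "no", "yes", "yes", "no"]

# ID3 would place the highest-gain attribute at the root;
# here "outlook" wins over "windy".
for attr in ("outlook", "windy"):
    print(attr, round(information_gain(rows, attr, labels), 3))
```

CART works the same way structurally but scores candidate splits with the Gini impurity instead of entropy.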

Advantages: 
  • Decision trees perform classification without requiring much computation
  • Decision trees can handle both continuous and categorical variables.
  • Decision trees are easy to implement and provide a clear indication of which fields are most important for prediction or classification.
  • Decision trees work effectively with non-linear data.
Disadvantages:
  • Decision trees can be computationally expensive to train.
  • Decision trees are less appropriate for estimation tasks where the goal is to predict the value of a continuous attribute.
  • Decision trees are prone to errors in classification problems with many classes and a relatively small number of training examples.


