Decision Tree Classifier in practice

Problem statement:

Predict whether a patient has heart disease based on independent variables such as sex, age, blood vessels, and various diagnostic results.

import pandas as pd

import numpy as np

Importing data set:

df=pd.read_csv("D:\\Raj_DataScience\\Documents\\heart.csv")

df.head()

   age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  slope  ca  thal  target
0   63    1   3       145   233    1        0      150      0      2.3      0   0     1       1
1   37    1   2       130   250    0        1      187      0      3.5      0   0     2       1
2   41    0   1       130   204    0        0      172      0      1.4      2   0     2       1
3   56    1   1       120   236    0        1      178      0      0.8      2   0     2       1
4   57    0   0       120   354    0        1      163      1      0.6      2   0     2       1

df.shape

(303, 14)

Attributes and Labels:

# Features: every column except the label
x=df.drop('target',axis=1)

# Label: the binary target column
y=df['target']

df['target'].unique()

array([0, 1])
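
Before splitting, it can help to check for missing values and to see how the two classes are distributed. The sketch below uses a small hypothetical frame named demo (made-up values, not rows from heart.csv) so that it runs on its own; the same two calls should work on df.

```python
import pandas as pd

# Toy frame with made-up values, standing in for the real data set.
demo = pd.DataFrame({
    "age": [50, 61, 44, 58, 39, 66],
    "chol": [210, 250, 198, 305, 180, 270],
    "target": [1, 0, 1, 1, 0, 0],
})

# Number of missing values per column.
missing = demo.isnull().sum()
print(missing)

# Share of rows in each class (the values sum to 1).
balance = demo["target"].value_counts(normalize=True)
print(balance)
```

A strongly unbalanced target would make plain accuracy harder to interpret, which is one reason to look at this before modelling.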

Split the data into Train and Test:

from sklearn.model_selection import train_test_split

x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2)
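
The split above is random, so each run produces different train and test rows and therefore different scores. A minimal sketch of a reproducible, stratified split follows. It assumes a synthetic stand-in for the data (the demo frame and its values are hypothetical); random_state and stratify are regular train_test_split parameters, and the value 42 is arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same number of rows as the data set (303).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "age": rng.integers(29, 78, size=303),
    "chol": rng.integers(126, 565, size=303),
    "target": rng.integers(0, 2, size=303),
})
x_demo = demo.drop("target", axis=1)
y_demo = demo["target"]

# random_state fixes the shuffle so the split is repeatable;
# stratify keeps the class ratio similar in the train and test parts.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)
print(x_tr.shape, x_te.shape)
```

With test_size=0.2 and 303 rows, 61 rows go to the test set, which matches the support total in the classification report.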

Implementing Decision Tree classifier:

from sklearn.tree import DecisionTreeClassifier

classifier=DecisionTreeClassifier()

classifier.fit(x_train,y_train)

y_pred=classifier.predict(x_test)
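
A DecisionTreeClassifier with default settings keeps splitting until its leaves are pure, which often overfits a small data set. The sketch below shows one way to cap the depth and print the learned rules. It runs on synthetic data from make_classification rather than heart.csv, and max_depth=3 and the feature names f0 to f12 are illustrative assumptions, not tuned or real values.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data shaped like the heart data: 303 rows, 13 features.
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=5, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Limit the depth so the tree stays small and readable.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0)
shallow.fit(X_tr, y_tr)

# Print the decision rules as indented text.
names = [f"f{i}" for i in range(13)]
print(export_text(shallow, feature_names=names))

# Compare train and test accuracy to spot overfitting.
train_acc = shallow.score(X_tr, y_tr)
test_acc = shallow.score(X_te, y_te)
print("depth:", shallow.get_depth())
print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
```

A large gap between train and test accuracy suggests the tree is memorising the training rows, and a smaller max_depth or a larger min_samples_leaf may be worth trying.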

Model Evaluation:

from sklearn.metrics import confusion_matrix,classification_report,accuracy_score

print(confusion_matrix(y_test,y_pred))

print(classification_report(y_test,y_pred))

print(accuracy_score(y_test,y_pred))

[[19  7]
 [ 5 30]]
             precision    recall  f1-score   support

          0       0.79      0.73      0.76        26
          1       0.81      0.86      0.83        35

avg / total       0.80      0.80      0.80        61

0.8032786885245902
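
The numbers in the report follow directly from the confusion matrix printed above, where rows are the true classes and columns are the predicted classes. As a check, the sketch below recomputes accuracy and the class 1 metrics by hand from those four counts; the variable names are only illustrative.

```python
import numpy as np

# Confusion matrix from the output above: rows = true, columns = predicted.
cm = np.array([[19, 7],
               [5, 30]])

# For a binary problem, ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()        # correct predictions / all predictions
precision_1 = tp / (tp + fp)           # of predicted 1s, how many were 1
recall_1 = tp / (tp + fn)              # of actual 1s, how many were found
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(round(accuracy, 4), round(precision_1, 2),
      round(recall_1, 2), round(f1_1, 2))
```

That is, accuracy is (19 + 30) / 61, precision for class 1 is 30 / 37, and recall for class 1 is 30 / 35, which agree with the 0.80, 0.81 and 0.86 shown in the report.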

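A score measured on a single test set of 61 rows can shift noticeably from one random split to another. One common way to get a steadier estimate is k-fold cross-validation. The sketch below assumes synthetic data from make_classification in place of heart.csv, and 5 folds is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data shaped like the heart data: 303 rows, 13 features.
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=5, random_state=0
)

# Five stratified folds; each row is used for testing exactly once.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    DecisionTreeClassifier(random_state=0), X, y, cv=cv, scoring="accuracy"
)

# The mean is the estimate; the spread shows how much it varies by fold.
print(scores)
print(scores.mean(), scores.std())
```
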