Multi-class classification:
Predicting handwritten digits with the logistic regression algorithm.
In the previous post, we looked at binary classification on an insurance data set. In this post we go through multi-class classification. Here we use the handwritten digits data set, which has ten categories, one for each digit: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].
Import libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
Import the data set:
from sklearn.datasets import load_digits
df = load_digits()
dir(df)
['DESCR', 'data', 'images', 'target', 'target_names']
Print the first sample (index 0):
df.data[0]
array([ 0., 0., 5., 13., 9., 1., 0., 0., 0., 0., 13., 15., 10.,
15., 5., 0., 0., 3., 15., 2., 0., 11., 8., 0., 0., 4.,
12., 0., 0., 8., 8., 0., 0., 5., 8., 0., 0., 9., 8.,
0., 0., 4., 11., 0., 1., 12., 7., 0., 0., 2., 14., 5.,
10., 12., 0., 0., 0., 0., 6., 13., 10., 0., 0., 0.])
This gives us the first sample as an array of 64 values, but from the raw numbers alone we can't tell which digit it is. To make it easier to understand, we plot the corresponding image using matplotlib.
plt.gray()
plt.matshow(df.images[0])
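The link between the flat array and the plotted image can be checked directly. The following is a minimal sketch, assuming the data set is loaded under the name digits (a name chosen here for illustration, not used elsewhere in the post): each row of data is an 8x8 image flattened into 64 values, so reshaping it should give back the matching entry of images.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()  # `digits` is an illustrative name for the loaded data set

# Each row of `data` holds one image flattened into 64 values.
flat = digits.data[0]
image = flat.reshape(8, 8)

print(digits.data.shape)    # (number of samples, 64)
print(digits.images.shape)  # (number of samples, 8, 8)

# The reshaped row should match the 8x8 image stored for the same sample.
same_pixels = np.array_equal(image, digits.images[0])
first_label = digits.target[0]
print(same_pixels, first_label)
```

The target array holds the true label for each sample, so first_label tells us which digit the plot is supposed to show.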
Split the data into train and test sets:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(df.data, df.target, test_size=0.2)
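Because the split above is random, the predictions and confusion matrix will differ slightly from run to run. If repeatable results are wanted, one option is to pass random_state, and optionally stratify so that each digit keeps roughly the same share in both sets. This is a sketch under those assumptions; the value 42 is arbitrary and neither argument is used in the original code.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()

# random_state fixes the shuffle; stratify keeps class proportions similar
# in the train and test sets. Both are additions for illustration.
x_train, x_test, y_train, y_test = train_test_split(
    digits.data,
    digits.target,
    test_size=0.2,
    random_state=42,
    stratify=digits.target,
)

print(x_train.shape, x_test.shape)
```

With test_size=0.2, about 20% of the samples are held out for testing and the rest are used for training.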
Implementing Logistic regression:
from sklearn.linear_model import LogisticRegression
model=LogisticRegression()
model.fit(x_train,y_train)
model.predict(df.data[0:5])
array([0,1,2,3,4])
for i in range(5):
    plt.matshow(df.images[i])
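To see what makes this a multi-class model, it can help to look at the class probabilities behind each prediction. The sketch below is self-contained and makes two assumptions that are not in the original code: random_state=0 for a repeatable split, and max_iter=5000 to give the solver more room to converge on the unscaled pixel values (with default settings scikit-learn may emit a convergence warning here).

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

# max_iter=5000 is an assumption for this sketch, not the post's setting.
model = LogisticRegression(max_iter=5000)
model.fit(x_train, y_train)

# One probability per digit class for each sample: shape (3, 10).
proba = model.predict_proba(x_test[:3])

# The predicted digit is the class with the highest probability.
pred = model.classes_[np.argmax(proba, axis=1)]

# Fraction of test samples classified correctly.
accuracy = model.score(x_test, y_test)
print(pred, accuracy)
```

Each row of proba sums to 1, and picking the largest entry in a row gives the same digit that model.predict returns for that sample.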
y_pred=model.predict(x_test)
y_pred
array([4, 8, 7, 4, 9, 9, 2, 2, 0, 7, 6, 2, 8, 8, 7, 4, 3, 5, 0, 6, 3, 3,
9, 1, 3, 2, 7, 7, 9, 1, 7, 3, 0, 9, 1, 9, 8, 7, 8, 8, 7, 9, 5, 1,
2, 8, 0, 6, 5, 9, 8, 6, 1, 2, 9, 3, 1, 9, 8, 2, 6, 2, 4, 3, 4, 5,
5, 8, 2, 2, 7, 3, 3, 4, 4, 6, 6, 7, 1, 0, 2, 0, 2, 3, 4, 9, 4, 7,
9, 7, 3, 1, 0, 2, 9, 3, 2, 4, 1, 8, 3, 4, 2, 8, 7, 6, 6, 7, 0, 4,
1, 1, 3, 9, 9, 5, 8, 0, 9, 5, 7, 6, 0, 8, 8, 5, 2, 9, 5, 0, 8, 9,
7, 8, 7, 8, 7, 3, 0, 8, 2, 7, 3, 9, 8, 7, 2, 2, 5, 9, 5, 3, 4, 4,
2, 9, 2, 8, 9, 3, 9, 3, 4, 2, 6, 1, 9, 2, 4, 7, 2, 1, 4, 6, 5, 2,
5, 0, 6, 9, 9, 6, 3, 4, 7, 4, 2, 6, 8, 8, 0, 9, 8, 4, 8, 2, 0, 8,
4, 7, 8, 4, 3, 6, 0, 6, 9, 1, 2, 3, 8, 9, 5, 4, 7, 1, 5, 6, 6, 3,
5, 1, 9, 8, 9, 7, 7, 8, 0, 2, 2, 6, 0, 9, 2, 5, 1, 0, 7, 1, 4, 1,
5, 6, 2, 6, 7, 4, 9, 6, 5, 8, 5, 6, 6, 9, 2, 4, 3, 3, 2, 9, 0, 0,
0, 0, 4, 6, 6, 6, 8, 8, 7, 5, 8, 0, 7, 9, 7, 1, 0, 5, 3, 1, 4, 1,
6, 7, 9, 5, 1, 1, 4, 0, 0, 0, 0, 8, 1, 7, 1, 4, 5, 2, 2, 1, 8, 5,
1, 8, 3, 6, 5, 2, 1, 5, 1, 8, 3, 1, 0, 9, 2, 6, 4, 8, 9, 5, 1, 9,
5, 3, 0, 3, 7, 3, 6, 2, 4, 5, 6, 0, 0, 6, 2, 6, 6, 6, 6, 0, 1, 4,
5, 6, 9, 1, 9, 5, 5, 3])
Confusion matrix:
from sklearn.metrics import confusion_matrix
cm=confusion_matrix(y_test,y_pred)
cm
array([[34, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 31, 0, 1, 0, 0, 0, 0, 0, 1],
[ 0, 1, 39, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 31, 0, 0, 0, 0, 0, 1],
[ 0, 1, 0, 0, 32, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 32, 0, 0, 0, 1],
[ 0, 0, 0, 0, 0, 0, 39, 0, 0, 0],
[ 0, 0, 0, 0, 1, 0, 0, 34, 0, 1],
[ 0, 1, 1, 0, 0, 0, 0, 0, 38, 0],
[ 0, 0, 0, 0, 0, 1, 0, 0, 1, 38]], dtype=int64)
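In this matrix the rows are the true digits and the columns are the predicted digits, so the diagonal counts the correct predictions. As a small worked example, the sketch below copies the matrix printed above and derives the overall accuracy and the per-digit recall from it; the variable names are chosen here for illustration.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows: true digit, columns: predicted digit).
cm = np.array([
    [34,  0,  0,  0,  0,  0,  0,  0,  0,  0],
    [ 0, 31,  0,  1,  0,  0,  0,  0,  0,  1],
    [ 0,  1, 39,  0,  0,  0,  0,  0,  0,  0],
    [ 0,  0,  0, 31,  0,  0,  0,  0,  0,  1],
    [ 0,  1,  0,  0, 32,  0,  0,  0,  0,  0],
    [ 0,  0,  0,  0,  0, 32,  0,  0,  0,  1],
    [ 0,  0,  0,  0,  0,  0, 39,  0,  0,  0],
    [ 0,  0,  0,  0,  1,  0,  0, 34,  0,  1],
    [ 0,  1,  1,  0,  0,  0,  0,  0, 38,  0],
    [ 0,  0,  0,  0,  0,  1,  0,  0,  1, 38],
])

# Accuracy: correct predictions (diagonal) over all test samples.
accuracy = np.trace(cm) / cm.sum()

# Recall per digit: correct predictions over the number of true samples
# of that digit (the row sum).
recall = np.diag(cm) / cm.sum(axis=1)

print(round(accuracy, 4))  # 348 correct out of 360 -> 0.9667
for digit, r in enumerate(recall):
    print(digit, round(r, 3))
```

For this particular run, 348 of the 360 test samples lie on the diagonal, and digits 0 and 6 are classified without any errors.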
import seaborn as sns
plt.figure(figsize=(10,7))
sns.heatmap(cm,annot=True)
plt.ylabel('Truth')
plt.xlabel('Predicted')