SVM Classifier (Non-linear) in Practice

Problem Statement:

Classify the iris dataset using SVM.


In the case of non-linearly separable data, the simple (linear) SVM algorithm cannot be used. Instead, we use a modified version of SVM, called kernel SVM.
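A quick illustration of this point (not part of the original walkthrough): on concentric-circle data generated with scikit-learn's make_circles, a linear SVM cannot separate the two classes, while an RBF-kernel SVM can.

```python
# Two classes arranged as concentric circles: not linearly separable.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=300, factor=0.3, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# A linear decision boundary cannot enclose the inner circle...
linear_acc = SVC(kernel='linear').fit(X_train, y_train).score(X_test, y_test)
# ...but the RBF kernel maps the data so the classes become separable.
rbf_acc = SVC(kernel='rbf').fit(X_train, y_train).score(X_test, y_test)

print('linear:', linear_acc, 'rbf:', rbf_acc)
```

The RBF model should score close to 1.0 on this data, while the linear model hovers near chance level.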

Import the dataset:

from sklearn import datasets

iris=datasets.load_iris()

# print the label species (setosa,versicolor,virginica)

print(iris.target_names)

['setosa' 'versicolor' 'virginica']

Print the feature names:

print(iris.feature_names)

['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

Creating a DataFrame from the iris dataset:

import pandas as pd

data=pd.DataFrame({
    'sepal length':iris.data[:,0],
    'sepal width':iris.data[:,1],
    'petal length':iris.data[:,2],
    'petal width':iris.data[:,3],
    'species':iris.target
})

data.head(5)

# separate the features (x) from the target label (y)
x=data.drop('species',axis=1)

y=data['species']

Train/Test Split:

from sklearn.model_selection import train_test_split

# hold out 20% of the data for testing; pass random_state for a reproducible split
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.20)

To train a kernel SVM, we use the same SVC class; the difference lies in the value of its kernel parameter. For the simple SVM we used 'linear'; for kernel SVM you can use a Gaussian (RBF), polynomial, or sigmoid kernel. We will implement all three to see which one works best for our problem.
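To make the three kernels concrete, here is a small sketch (the sample vectors and hyperparameter values are illustrative) that computes each kernel function by hand and checks the result against scikit-learn's pairwise kernel helpers:

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel, sigmoid_kernel

x = np.array([[1.0, 2.0]])   # two arbitrary sample vectors
z = np.array([[0.5, 1.5]])
gamma, coef0, degree = 0.5, 1.0, 3

# polynomial: (gamma * <x, z> + coef0) ** degree
poly_manual = (gamma * x @ z.T + coef0) ** degree
# Gaussian (RBF): exp(-gamma * ||x - z||^2)
rbf_manual = np.exp(-gamma * np.sum((x - z) ** 2))
# sigmoid: tanh(gamma * <x, z> + coef0)
sig_manual = np.tanh(gamma * x @ z.T + coef0)

# scikit-learn computes the same values
assert np.allclose(poly_manual, polynomial_kernel(x, z, degree=degree, gamma=gamma, coef0=coef0))
assert np.allclose(rbf_manual, rbf_kernel(x, z, gamma=gamma))
assert np.allclose(sig_manual, sigmoid_kernel(x, z, gamma=gamma, coef0=coef0))
```

Each kernel is a similarity measure between two samples; SVC builds its decision function from these pairwise similarities rather than from an explicit high-dimensional mapping.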

Polynomial kernel:

For the polynomial kernel, you also have to pass a value for the degree parameter of the SVC class. This is the degree of the polynomial.

from sklearn.svm import SVC

svcclassifier=SVC(kernel='poly',degree=8)  # degree=8 is high; degrees 2-3 are more typical

svcclassifier.fit(x_train,y_train)

y_pred=svcclassifier.predict(x_test)

Model Evaluation:

from sklearn.metrics import classification_report,confusion_matrix,accuracy_score

print(confusion_matrix(y_test,y_pred))

print(classification_report(y_test,y_pred))

print(accuracy_score(y_test,y_pred))

[[ 9  0  0]
 [ 0  7  0]
 [ 0  0 14]]
             precision    recall  f1-score   support

          0       1.00      1.00      1.00         9
          1       1.00      1.00      1.00         7
          2       1.00      1.00      1.00        14

avg / total       1.00      1.00      1.00        30

1.0

Gaussian kernel:

To use the Gaussian kernel, you have to specify 'rbf' as the value for the kernel parameter of the SVC class.

from sklearn.svm import SVC

svcclassifier=SVC(kernel='rbf')

svcclassifier.fit(x_train,y_train)

y_pred=svcclassifier.predict(x_test)

Model Evaluation:

from sklearn.metrics import classification_report,confusion_matrix,accuracy_score

print(confusion_matrix(y_test,y_pred))

print(classification_report(y_test,y_pred))

print(accuracy_score(y_test,y_pred))

[[ 9  0  0]
 [ 0  7  0]
 [ 0  1 13]]
             precision    recall  f1-score   support

          0       1.00      1.00      1.00         9
          1       0.88      1.00      0.93         7
          2       1.00      0.93      0.96        14

avg / total       0.97      0.97      0.97        30

0.9666666666666667

Sigmoid kernel:

To use the sigmoid kernel, you have to specify 'sigmoid' as the value for the kernel parameter of the SVC class.

from sklearn.svm import SVC

svcclassifier=SVC(kernel='sigmoid')

svcclassifier.fit(x_train,y_train)

y_pred=svcclassifier.predict(x_test)

Model Evaluation:

from sklearn.metrics import classification_report,confusion_matrix,accuracy_score

print(confusion_matrix(y_test,y_pred))

print(classification_report(y_test,y_pred))

print(accuracy_score(y_test,y_pred))

[[ 0  9  0]
 [ 0  7  0]
 [ 0 14  0]]
             precision    recall  f1-score   support

          0       0.00      0.00      0.00         9
          1       0.23      1.00      0.38         7
          2       0.00      0.00      0.00        14

avg / total       0.05      0.23      0.09        30

0.23333333333333334

Comparison of Kernel Performance:

Comparing the kernels, the sigmoid kernel clearly performs worst; it is generally better suited to binary classification problems, whereas here we have three classes. The Gaussian and polynomial kernels both perform well. We got 100% accuracy with the polynomial kernel, but a perfect score on a single small test split often signals overfitting (or simply an easy split), so it should be confirmed with more robust evaluation. In any case, there is no hard and fast rule for which kernel performs best in every scenario.
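A single 80/20 split is a noisy basis for comparing kernels. A sketch of a fairer comparison, using 5-fold cross-validation on the full iris dataset (default SVC hyperparameters assumed):

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

iris = datasets.load_iris()

# mean accuracy over 5 folds for each kernel
scores = {}
for kernel in ('poly', 'rbf', 'sigmoid'):
    cv = cross_val_score(SVC(kernel=kernel), iris.data, iris.target, cv=5)
    scores[kernel] = cv.mean()
    print(kernel, round(scores[kernel], 3))
```

Averaging over folds smooths out the luck of any single split, so the poly/rbf ranking seen here is more trustworthy than one test-set score.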
