Naive Bayes in Practice

Problem statement:

Classify whether players will play or not based on the weather conditions.

This problem uses a dummy data set with three columns: weather, temperature, and play. The first two are features and the third is the label.

# Assign features and label variables

weather =['Sunny','Sunny','Overcast','Rainy','Rainy','Rainy','Overcast','Sunny','Sunny','Rainy','Sunny','Overcast','Overcast','Rainy']

temp=['Hot','Hot','Hot','Mild','Cool','Cool','Cool','Mild','Cool','Mild','Mild','Mild','Hot','Mild']

play=['No','No','Yes','Yes','Yes','No','Yes','No','Yes','Yes','Yes','Yes','Yes','No']

Encoding features:

The string labels must first be converted into numbers; for example, 'Overcast', 'Rainy', and 'Sunny' become 0, 1, and 2. This is known as label encoding.

# Import LabelEncoder

from sklearn import preprocessing

le=preprocessing.LabelEncoder()

# Convert string labels into numbers

weather_encoded=le.fit_transform(weather)

print(weather_encoded)

[2 2 0 1 1 1 0 2 2 1 2 0 0 1]
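The numbering above follows from the fact that LabelEncoder stores the distinct labels in sorted order and uses each label's position as its code. The same mapping can be reproduced in plain Python, which makes it explicit. This is only an illustrative sketch, not scikit-learn's implementation; the names classes and mapping are introduced here for the example.

```python
# Plain-Python sketch of label encoding (illustrative only)
weather = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
           'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy']

# Distinct labels in sorted order: ['Overcast', 'Rainy', 'Sunny']
classes = sorted(set(weather))

# Each label is assigned its position in the sorted list
mapping = {name: index for index, name in enumerate(classes)}

# Replace every string with its integer code
weather_encoded = [mapping[w] for w in weather]

print(mapping)          # {'Overcast': 0, 'Rainy': 1, 'Sunny': 2}
print(weather_encoded)  # [2, 2, 0, 1, 1, 1, 0, 2, 2, 1, 2, 0, 0, 1]
```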

Similarly, you can encode the temp and play columns.

# Convert string labels into numbers

temp_encoded=le.fit_transform(temp)

print(temp_encoded)

[1 1 1 2 0 0 0 2 0 2 2 2 1 2]

label=le.fit_transform(play)

print(label)

[0 0 1 1 1 0 1 0 1 1 1 1 1 0]
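Because the same le object is refitted for each column, it only remembers the classes of the last column it saw (play). If you want to translate numbers back into strings later, one option is to keep a separate encoder per column. In the sketch below, the names weather_le and play_le are assumptions introduced for illustration.

```python
from sklearn import preprocessing

weather = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
           'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy']
play = ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes',
        'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No']

# One encoder per column (hypothetical names), so each keeps its own classes
weather_le = preprocessing.LabelEncoder()
play_le = preprocessing.LabelEncoder()

weather_encoded = weather_le.fit_transform(weather)
label = play_le.fit_transform(play)

# classes_ holds the original strings; the index of each string is its code
print(list(weather_le.classes_))
print(list(play_le.classes_))

# inverse_transform maps codes back to the original strings
print(play_le.inverse_transform([1, 0]))
```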

# Combine weather and temp into a single list of tuples

features=list(zip(weather_encoded,temp_encoded))

print(features)

[(2,1),(2,1),(0,1),(1,2),(1,0),(1,0),(0,0),(2,2),(2,0),(1,2),(2,2),(0,2),(0,1),(1,2)]
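One detail worth knowing: in Python 3, zip returns a lazy iterator rather than a list, so it has to be materialized (for example with list) before it can be printed as shown above or passed to fit. A small sketch with made-up values:

```python
# Illustrative values only; the same behaviour applies to the encoded columns
a = [2, 0, 1]
b = [1, 1, 2]

pairs = zip(a, b)          # lazy iterator, not a list
first_pass = list(pairs)   # consuming it yields the tuples
second_pass = list(pairs)  # the iterator is now exhausted

print(first_pass)   # [(2, 1), (0, 1), (1, 2)]
print(second_pass)  # []

# Materialize once and reuse the resulting list
features = list(zip(a, b))
print(len(features))  # 3
```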

# Import the Gaussian Naive Bayes model

from sklearn.naive_bayes import GaussianNB

# Create a Gaussian classifier

model=GaussianNB()

# Train the model using the training set

model.fit(features,label)
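After fitting, it can be instructive to look at what the model has estimated. The sketch below refits the same data, starting from the encoded arrays printed earlier, and reads two fitted attributes of GaussianNB: class_prior_ (the relative frequency of each class) and theta_ (the per-class mean of each feature). The variable names X and y are illustrative.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Encoded columns, copied from the printed outputs earlier in this tutorial
weather_encoded = [2, 2, 0, 1, 1, 1, 0, 2, 2, 1, 2, 0, 0, 1]
temp_encoded = [1, 1, 1, 2, 0, 0, 0, 2, 0, 2, 2, 2, 1, 2]
label = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0]

# Stack the two feature columns into an array of shape (14, 2)
X = np.column_stack([weather_encoded, temp_encoded])
y = np.array(label)

model = GaussianNB()
model.fit(X, y)

print(model.classes_)      # class labels, in the order used by the attributes below
print(model.class_prior_)  # fraction of training rows in each class
print(model.theta_)        # mean of each feature within each class
```

With 5 'No' rows and 9 'Yes' rows, the priors come out as 5/14 and 9/14.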

# Predict output for 0: Overcast, 2: Mild

predicted=model.predict([[0,2]])

print("Predicted value:",predicted)

Predicted value: [1]

Here, 1 is the encoded label for 'Yes', so the model predicts that the players will play.
