Petrol consumption prediction by Random Forest.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data=pd.read_csv("D:\\Raj_DataScience\\Documents\\petrol_consumption.csv")
data.shape
data.head()
 | Petrol_tax | Average_income | Paved_Highways | Population_Driver_licence(%) | Petrol_Consumption |
---|---|---|---|---|---|
0 | 9.0 | 3571 | 1976 | 0.525 | 541 |
1 | 9.0 | 4092 | 1250 | 0.572 | 524 |
2 | 9.0 | 3865 | 1586 | 0.580 | 561 |
3 | 7.5 | 4870 | 2351 | 0.529 | 414 |
4 | 8.0 | 4399 | 431 | 0.544 | 410 |
Attributes and labels:
x=data.iloc[:,0:4].values
y=data.iloc[:,4].values
Train and test sets:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=0)
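To make the effect of test_size=0.2 concrete, here is a minimal sketch on stand-in data. The 48-row array and the `_demo` names are assumptions for illustration only (the real row count comes from data.shape); with 48 rows, a 20% split would give the ten test rows seen in the results table.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 48 rows and 4 features (the row count is an assumption).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(48, 4))
y_demo = rng.normal(size=48)

# Same arguments as in the post: 20% of the rows go to the test set,
# and random_state fixes the shuffle so the split is reproducible.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=0
)

print("train rows:", x_tr.shape[0])
print("test rows:", x_te.shape[0])
```

Because random_state is fixed, rerunning the split returns the same rows each time.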
Feature scaling:
from sklearn.preprocessing import StandardScaler
sc=StandardScaler()
x_train=sc.fit_transform(x_train)
x_test=sc.transform(x_test)
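As a rough illustration of what the scaler does, the sketch below uses a tiny made-up array (the `_demo` values are hypothetical, not from the petrol dataset). It checks that fit_transform subtracts the training mean and divides by the training standard deviation, and that transform reuses those same training statistics on the test rows.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two features on very different scales.
train_demo = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
test_demo = np.array([[2.0, 400.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_demo)  # learns mean/std from the training rows
test_scaled = scaler.transform(test_demo)        # reuses the training mean/std

# Manual equivalent: (x - mean) / std, with the population standard deviation.
manual = (train_demo - train_demo.mean(axis=0)) / train_demo.std(axis=0)

print(np.allclose(train_scaled, manual))
print(test_scaled)
```

Fitting only on the training rows keeps information about the test set out of the preprocessing step.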
Building the random forest:
from sklearn.ensemble import RandomForestRegressor
regressor=RandomForestRegressor(n_estimators=50,random_state=0)
regressor.fit(x_train,y_train)
y_pred=regressor.predict(x_test)
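A random forest regressor predicts by averaging the predictions of its individual trees. The sketch below checks this on synthetic data; the data and the names `forest`, `x_demo` and `y_demo` are assumptions for illustration, not the petrol dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical synthetic data: the target depends mostly on the first feature.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(60, 4))
y_demo = 3.0 * x_demo[:, 0] + rng.normal(scale=0.1, size=60)

# Same settings as in the post: 50 trees, fixed seed.
forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(x_demo, y_demo)

# Collect each tree's prediction for the first five rows, then average them.
per_tree = np.array([tree.predict(x_demo[:5]) for tree in forest.estimators_])
averaged = per_tree.mean(axis=0)

print(np.allclose(averaged, forest.predict(x_demo[:5])))
```

Increasing n_estimators averages over more trees, which usually makes the predictions more stable at the cost of training time.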
Model Evaluation:
from sklearn import metrics
print("Mean Absolute Error:",metrics.mean_absolute_error(y_test,y_pred))
print("Mean Squared Error:",metrics.mean_squared_error(y_test,y_pred))
print("Root Mean Squared Error:",np.sqrt(metrics.mean_squared_error(y_test,y_pred)))
Mean Absolute Error: 49.222000000000016
Mean Squared Error: 3736.462600000001
Root Mean Squared Error: 61.1266 (rounded)
results=pd.DataFrame({"Actual":y_test,"Predicted":y_pred})
results
 | Actual | Predicted |
---|---|---|
0 | 534 | 571.58 |
1 | 410 | 502.40 |
2 | 577 | 604.98 |
3 | 571 | 575.46 |
4 | 577 | 615.48 |
5 | 704 | 601.86 |
6 | 487 | 586.60 |
7 | 587 | 567.64 |
8 | 467 | 463.02 |
9 | 580 | 513.76 |
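The ten rows above are enough to recompute the error metrics by hand. The following sketch uses only NumPy and the values copied from the table; the variable names are illustrative. It should reproduce the reported mean absolute error and mean squared error, and it takes the root mean squared error as the square root of the mean squared error.

```python
import numpy as np

# Actual and predicted values copied from the table above.
actual = np.array([534, 410, 577, 571, 577, 704, 487, 587, 467, 580], dtype=float)
predicted = np.array([571.58, 502.40, 604.98, 575.46, 615.48,
                      601.86, 586.60, 567.64, 463.02, 513.76])

errors = actual - predicted

mae = np.mean(np.abs(errors))   # mean absolute error
mse = np.mean(errors ** 2)      # mean squared error
rmse = np.sqrt(mse)             # root mean squared error: square root of the MSE

print("Mean Absolute Error:", mae)
print("Mean Squared Error:", mse)
print("Root Mean Squared Error:", rmse)
```

The RMSE is in the same units as Petrol_Consumption, so it can be compared directly with the values in the Actual column.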