Petrol consumption prediction by Random Forest.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data=pd.read_csv("D:\\Raj_DataScience\\Documents\\petrol_consumption.csv")
data.shape
data.head()
| | Petrol_tax | Average_income | Paved_Highways | Population_Driver_licence(%) | Petrol_Consumption |
|---|---|---|---|---|---|
| 0 | 9.0 | 3571 | 1976 | 0.525 | 541 |
| 1 | 9.0 | 4092 | 1250 | 0.572 | 524 |
| 2 | 9.0 | 3865 | 1586 | 0.580 | 561 |
| 3 | 7.5 | 4870 | 2351 | 0.529 | 414 |
| 4 | 8.0 | 4399 | 431 | 0.544 | 410 |
Attributes (features) and labels:
x=data.iloc[:,0:4].values
y=data.iloc[:,4].values
Train and Test set:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=0)
Feature scaling:
from sklearn.preprocessing import StandardScaler
sc=StandardScaler()
x_train=sc.fit_transform(x_train)
x_test=sc.transform(x_test)
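To make the scaling step concrete, here is a minimal sketch of what fit_transform and transform do for a single column, written with the standard library only. It assumes that StandardScaler subtracts the training mean and divides by the population standard deviation (ddof=0). The values are the five Petrol_tax entries shown by data.head(), and the split into `train` and `test` is hypothetical, chosen only for illustration.

```python
import statistics

# Illustrative values only: the five Petrol_tax entries shown by data.head().
train = [9.0, 9.0, 9.0, 7.5]   # hypothetical "training" part
test = [8.0]                   # hypothetical "test" part

# "fit": learn the mean and population standard deviation from train only.
mean = statistics.fmean(train)
std = statistics.pstdev(train)

# "transform": apply the same training statistics to both parts.
train_scaled = [(v - mean) / std for v in train]
test_scaled = [(v - mean) / std for v in test]

print(train_scaled)
print(test_scaled)
```

The test values are scaled with the statistics learned from the training values, which is why the code above calls sc.transform (not sc.fit_transform) on x_test.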
Building Random forest:
from sklearn.ensemble import RandomForestRegressor
regressor=RandomForestRegressor(n_estimators=50,random_state=0)
regressor.fit(x_train,y_train)
y_pred=regressor.predict(x_test)
Model Evaluation:
from sklearn import metrics
print("Mean Absolute Error:",metrics.mean_absolute_error(y_test,y_pred))
print("Mean Squared Error:",metrics.mean_squared_error(y_test,y_pred))
# RMSE is the square root of the mean squared error, not of the mean absolute error
print("Root Mean Squared Error:",np.sqrt(metrics.mean_squared_error(y_test,y_pred)))
Mean Absolute Error: 49.222000000000016
Mean Squared Error: 3736.462600000001
Root Mean Squared Error: 61.13 (rounded; the square root of the MSE above)
data=pd.DataFrame({"Actual":y_test,"Predicted":y_pred})
data
| | Actual | Predicted |
|---|---|---|
| 0 | 534 | 571.58 |
| 1 | 410 | 502.40 |
| 2 | 577 | 604.98 |
| 3 | 571 | 575.46 |
| 4 | 577 | 615.48 |
| 5 | 704 | 601.86 |
| 6 | 487 | 586.60 |
| 7 | 587 | 567.64 |
| 8 | 467 | 463.02 |
| 9 | 580 | 513.76 |
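As a sanity check, the three metrics can be recomputed by hand from the ten rows of the results table. The sketch below uses only the standard library. The variable names are my own, and the numbers are copied from the table.

```python
import math

# Actual and predicted values copied from the results table above.
actual = [534, 410, 577, 571, 577, 704, 487, 587, 467, 580]
predicted = [571.58, 502.40, 604.98, 575.46, 615.48,
             601.86, 586.60, 567.64, 463.02, 513.76]

# Per-row prediction errors.
errors = [a - p for a, p in zip(actual, predicted)]

mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
mse = sum(e * e for e in errors) / len(errors)    # mean squared error
rmse = math.sqrt(mse)                             # root of the MSE

print("MAE :", round(mae, 3))
print("MSE :", round(mse, 4))
print("RMSE:", round(rmse, 2))
```

The MAE and MSE agree with the values reported by sklearn (about 49.22 and 3736.46), and the RMSE comes out near 61.13, which is the square root of the MSE.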
