
I am using pybrain to predict house prices on a house price dataset, which I downloaded from the link below: https://www.kaggle.com/apratim87/housingdata/data

I picked six columns to predict the price: 'bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'zipcode'.

I built a neural network with 6 input units, 1 hidden layer with 3 neurons, and 1 output unit.

I have normalized the data. The code is below:

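import pandas as pd
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.preprocessing import Normalizer
from pybrain.datasets import SupervisedDataSet
from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure import SigmoidLayer, SoftmaxLayer
from pybrain.utilities import percentError

house_df = pd.read_csv("kc_house_data.csv")
print(house_df.head())
df = house_df.dropna(axis=0)
df = df[(df != 0).all(1)]
df.reset_index(drop=True,inplace=True)
# features and target are taken from house_df, not from the cleaned df above
X_org=house_df[['bedrooms','bathrooms','sqft_living','sqft_lot','floors','zipcode']]
y_org=house_df[['price']]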


scaler = Normalizer().fit(X_org)
X = scaler.transform(X_org)
target_scaler = preprocessing.MinMaxScaler()
y=target_scaler.fit_transform(y_org)

ds=SupervisedDataSet(X.shape[1],y.shape[1])
# X and y are NumPy arrays after transform(), so index them directly (no .iloc)
for i in range(len(X)):
    ds.addSample(X[i],y[i])

train,rest=ds.splitWithProportion(0.60)
test,validation=rest.splitWithProportion(0.50)

print('Training Set Size='+str(len(train)))
print('Test Set Size='+str(len(test)))
print('Validation Set Size='+str(len(validation)))

#creating a neural network
def buildNN(invar,hidden,out):
    net=buildNetwork(invar,hidden,out,hiddenclass=SigmoidLayer,outclass=SoftmaxLayer)
    trainer=BackpropTrainer(net,dataset=train,momentum=0.1,verbose=True,weightdecay=0.01)
    trn_err,val_err=trainer.trainUntilConvergence(dataset=train,maxEpochs=50)

    #trainer.trainOnDataset(trndata,500)
    # trn_err holds the training errors, so label it accordingly
    trn,=plt.plot(trn_err,'b',label='Training Error')
    vali,=plt.plot(val_err,'r',label='Validation Error')
    plt.legend(handles=[trn, vali])
    plt.ylabel('Error')
    plt.xlabel('Number of Epochs')
    plt.show()
    #testing it on test data
    out=net.activateOnDataset(test).argmax(axis=1)
    # a SupervisedDataSet stores its fields as 'input' and 'target'
    test_error=percentError(out,test['target'])
    #on validation data
    out=net.activateOnDataset(validation).argmax(axis=1)
    vali_error=percentError(out,validation['target'])
    return  test_error,vali_error

print('Neural network with 6 input, 3 hidden units, 1 output')
nn3_testerr,nn3_valierr=buildNN(6,3,1)

The error stays constant across epochs and the network is not learning. Can you please suggest what the issue might be?

Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278
Total error: 0.441892335278

user1690117

2 Answers


Ideally, use two dense layers with more neurons in each layer. Another important point is mean normalization of your feature matrix. Instead of sigmoid, try ReLU or ELU as the activation function.
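To illustrate the normalization point: sklearn's Normalizer rescales each sample (row) to unit norm, which is different from normalizing each feature (column). The sketch below uses only the standard library and two made-up rows (the values and helper names are illustrative assumptions, not taken from the dataset) to show how a large column such as zipcode can dominate after row-wise scaling, while column-wise standardization keeps every feature on a comparable scale.

```python
import math
from statistics import mean, pstdev

# Two made-up rows: bedrooms, sqft_living, zipcode (illustrative values only).
rows = [
    [3.0, 1180.0, 98178.0],
    [4.0, 2570.0, 98125.0],
]

def unit_norm_rows(data):
    """Scale each row to unit L2 norm (row-wise, like sklearn's Normalizer)."""
    out = []
    for row in data:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm for v in row])
    return out

def standardize_columns(data):
    """Subtract each column's mean and divide by its standard deviation."""
    cols = list(zip(*data))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # fall back to 1.0 for constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in data]

row_scaled = unit_norm_rows(rows)       # zipcode component ends up close to 1 in every row
col_scaled = standardize_columns(rows)  # every column is centered on 0

print(row_scaled)
print(col_scaled)
```

With real data, sklearn's StandardScaler performs the column-wise version; the hand-written helpers here are only meant to make the difference visible.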

dhanush-ai1990

There are lots of ways to improve NN performance:

1) Tweak the geometry (add layers, change layer size)

2) Change the activation function

3) Change step size/momentum

4) Play around with data preprocessing

You should try all of these, and various combinations of them. At a quick glance, a single hidden layer with only three neurons won't be very robust, so start there.
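As one concrete illustration of point 2, and a likely contributor to the constant error: the network in the question is built with outclass=SoftmaxLayer and a single output unit. Softmax normalizes across the output units, so with only one unit the output is 1.0 whatever the weights are. The sketch below is plain Python (the softmax helper is written here for illustration, it is not pybrain's implementation) and just demonstrates that property.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of pre-activations."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

pre_activations = (-5.0, 0.0, 3.7)

# With one output unit the result is 1.0 for every pre-activation,
# so the prediction cannot move toward a price scaled into [0, 1].
single_unit_outputs = [softmax([z])[0] for z in pre_activations]

# A linear output passes the pre-activation through unchanged,
# so it can take any value the regression target needs.
linear_outputs = [z for z in pre_activations]

print(single_unit_outputs)
print(linear_outputs)
```

For a regression target like price, a linear output layer (pybrain's LinearLayer, which I believe is the default outclass of buildNetwork) is the usual choice; treat that as a suggestion to try rather than a guaranteed fix.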

Can you get this network to converge on a simple example like xor?
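If it helps, here is a minimal sketch of such an XOR sanity check in plain Python, independent of pybrain. The function name train_xor, the 2-3-1 sigmoid layout, the learning rate, the epoch count, and the seed are all arbitrary choices for illustration. The idea is simply to watch whether the loss drops over the epochs; if it stays flat on a problem this small, the training setup itself is suspect.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(hidden=3, epochs=5000, lr=0.5, seed=0):
    """Train a 2-hidden-1 sigmoid network on XOR with per-sample gradient descent."""
    rng = random.Random(seed)
    data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
    # w_h[j] = [weight for x0, weight for x1, bias] of hidden unit j
    w_h = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # w_o = one weight per hidden unit, followed by the output bias
    w_o = [rng.uniform(-1, 1) for _ in range(hidden + 1)]

    def forward(x0, x1):
        h = [sigmoid(w[0] * x0 + w[1] * x1 + w[2]) for w in w_h]
        y = sigmoid(sum(wo * hj for wo, hj in zip(w_o, h)) + w_o[-1])
        return h, y

    losses = []
    for _ in range(epochs):
        total = 0.0
        for (x0, x1), target in data:
            h, y = forward(x0, x1)
            err = y - target
            total += 0.5 * err * err
            delta_o = err * y * (1.0 - y)  # gradient at the output pre-activation
            for j in range(hidden):
                # use the output weight before it is updated
                delta_h = delta_o * w_o[j] * h[j] * (1.0 - h[j])
                w_o[j] -= lr * delta_o * h[j]
                w_h[j][0] -= lr * delta_h * x0
                w_h[j][1] -= lr * delta_h * x1
                w_h[j][2] -= lr * delta_h
            w_o[-1] -= lr * delta_o
        losses.append(total / len(data))

    def predict(x0, x1):
        return forward(x0, x1)[1]

    return losses, predict

losses, predict = train_xor()
predictions = [predict(a, b) for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))]
print("first loss:", losses[0], "last loss:", losses[-1])
print("predictions:", predictions)
```

Convergence on XOR depends on the random initialization, so rerun with a few seeds before drawing conclusions.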

Mohammad Athar