
I am trying to implement a regression model with Keras, but I am having some problems. The aim is to forecast sales demand. I have little experience, so I couldn't figure it out.

This is my complete code:

    import pandas as pds

    dataframeX = pds.read_csv('data.csv', usecols=[0, 1, 2])
    dataframeY = pds.read_csv('data.csv', usecols=[3])

    # 1
    import numpy as np
    seed = 7
    np.random.seed(seed)

    # 2
    from keras.models import Sequential
    from keras.layers import Dense, Activation, Dropout

    model = Sequential()
    model.add(Dense(20, input_dim=3, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(15, input_dim=20, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(10, input_dim=15, activation='relu'))
    model.add(Dropout(0.6))
    model.add(Dense(5, input_dim=10, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.summary()

    # 3
    import keras
    tbCallBack = keras.callbacks.TensorBoard(log_dir='/tmp/keras_logs', write_graph=True)

    # 4
    model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
    model.fit(dataframeX.values, dataframeY.values, epochs=100, batch_size=50,  verbose=1, validation_split=0.3, callbacks=[tbCallBack])

And this is my sample data:

    ProductID,Day,Month,Sales
    1,1,10,2374
    1,2,10,2374
    1,1,11,2374
    1,5,01,2374
    1,3,02,950
    2,1,02,1900
    3,6,01,7122
    3,6,01,2374
    3,6,02,2374
    3,1,02,2374
    3,2,02,7122
    3,1,03,9496

When I run my code with that data (about 3,500 rows), the result is something like:

    loss: 27320141.9345 - acc: 4.1477e-04 - val_loss: 44879605.6779 - val_acc: 0.0000e+00

Is there a different model, or any simple suggestion, that would increase accuracy?

Thank you

  • Please try to spend some time on the ML basics before delving into coding: you are in a regression setting, thus accuracy is meaningless: https://stackoverflow.com/questions/48775305/loss-mean-squared-error-what-function-defines-accuracy-used-by-keras – desertnaut Feb 26 '18 at 14:41
  • I'm voting to close this question as off-topic because it comes from a fundamental misunderstanding of the modelling issues involved, which has already been addressed (see comment above). – desertnaut Mar 15 '19 at 12:21

1 Answer

Your question cannot be answered without knowing the data. It depends very much on whether the data in each column is treated as categorical or as continuous. A simple example is the colour of a product and its price. The colour can be red, yellow or blue. The price is between 0 and 100; a product with a price of 10 is closer to a product with a price of 12 than to one with a price of 100, and no such notion of distance exists between colours. The price should therefore be treated as continuous and the colour as categorical. Here is a great tutorial on how to handle categorical data: http://pbpython.com/categorical-encoding.html
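As a rough illustration of the idea, here is a minimal sketch that one-hot encodes `ProductID` with `pandas.get_dummies`, using a few of the sample rows from the question. Treating `ProductID` as categorical and leaving `Day` and `Month` numeric is an assumption made for the example, not something the data confirms; the variable names `csv_text`, `X` and `y` are only illustrative.

```python
import io

import pandas as pd

# A few sample rows copied from the question (header plus six records).
csv_text = """ProductID,Day,Month,Sales
1,1,10,2374
1,2,10,2374
1,3,02,950
2,1,02,1900
3,6,01,7122
3,1,03,9496
"""

df = pd.read_csv(io.StringIO(csv_text))

# Assumption: ProductID is a label, not a quantity, so it is one-hot encoded.
# Day and Month are left as numbers here; whether they should also be
# treated as categorical depends on the data.
X = pd.get_dummies(
    df[["ProductID", "Day", "Month"]],
    columns=["ProductID"],
    dtype=float,
)
y = df["Sales"]

print(X.columns.tolist())
print(X.shape)
```

If the encoded frame were fed to the network from the question, the first layer's `input_dim` would have to match `X.shape[1]` rather than 3.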

Good luck

mapeza