I'm trying to run a predictive RNN from this repo, https://github.com/jgpavez/LSTM---Stock-prediction, with "python lstm_forex.py".
It seems to be having trouble creating an empty NumPy array.

The function below is the one giving me problems, starting at the line that assigns 'days', fourth from the bottom.

def read_data(path="full_USDJPY.csv", dir="/Users/Computer/stock/LSTM2/",
    max_len=30, valid_portion=0.1, columns=4, up=False, params_file='params.npz',min=False):

    '''
    Reading forex data, daily or minute
    '''
    path = os.path.join(dir, path)

    #data = read_csv(path,delimiter=delimiter)
    data = genfromtxt(path, delimiter=',',skip_header=1)
    # Adding data by minute
    if min == False:
        date_index = 1
        values_index = 3
        hours = data[:,2]
    else:
        date_index = 0
        values_index = 1

    dates = data[:,date_index]
    print (dates)
    days = numpy.array([datetime.datetime(int(str(date)[0:-2][0:4]),int(str(date)[0:-2][4:6]),
                int(str(date)[0:-2][6:8])).weekday() for date in dates])
    months = numpy.array([datetime.datetime(int(str(date)[0:-2][0:4]),int(str(date)[0:-2][4:6]),
                int(str(date)[0:-2][6:8])).month for date in dates])

Gives the error...

Traceback (most recent call last):  
  File "lstm_forex.py", line 778, in <module>  
tick=tick  
  File "lstm_forex.py", line 560, in train_lstm  
train, valid, test, mean, std = read_data(max_len=n_iter, path=dataset, params_file=params_file,min=(tick=='minute'))  
  File "/Users/Computer/stock/LSTM2/forex.py", line 85, in read_data
int(str(date)[0:-2][6:8])).weekday() for date in dates])  
ValueError: invalid literal for int() with base 10: 'n'  

I've seen a similar problem that involved putting '.strip' at the end of something. This code is so complicated that I don't quite know where to put it. I tried it everywhere and usually got the same error, 'has no attribute', on the others. Now I'm not sure what might fix it.

Ant
  • What is the difference between s[0:-2][6:8] and s[6:8]? Imho none, unless you have a string of length 8 or 9. Anyhow, at least one of the characters 6 and 7 is not a number. – Mr. T Feb 25 '18 at 23:18
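To make the comment above concrete, here is a small sketch of what the chained slices do. It assumes each date arrives as a float such as 20180102.0 (a hypothetical value, not taken from the actual CSV), so that str() appends ".0" and [0:-2] strips it again. The helper name split_date is made up for illustration.

```python
def split_date(value):
    """Mimic the slicing used in read_data for a single value."""
    text = str(value)[0:-2]  # drop the trailing ".0" of a float
    return text[0:4], text[4:6], text[6:8]


# Hypothetical date stored as a float.
year, month, day = split_date(20180102.0)
print(year, month, day)

# The extra [0:-2] only changes the result for short strings.
short = "20180102"  # exactly 8 characters, no ".0" suffix
direct = short[6:8]
chained = short[0:-2][6:8]
print(repr(direct), repr(chained))
```

With a well-formed float the three pieces are all digits, so int() succeeds. The traceback therefore suggests that at least one value in the column does not look like that.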

1 Answer

You're calling int() on the string 'n' inside your list comprehension. To get the same error:

int('n')

ValueError                                Traceback (most recent call last)
<ipython-input-18-35fea8808c96> in <module>()
----> 1 int('n')

ValueError: invalid literal for int() with base 10: 'n'
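One plausible source of that 'n', offered as a guess since the CSV isn't shown: numpy.genfromtxt returns nan for any field it cannot parse as a float, and str() of that value is 'nan'. Slicing it with [0:-2] leaves just 'n', which is what int() then receives. The sample rows below are made up, and the dotted date format is an assumption.

```python
import io

import numpy

# Made-up sample rows; the dotted date format is an assumption.
sample = io.StringIO(
    "date,time,open\n"
    "2018.01.02,00:00,112.60\n"
    "2018.01.02,00:01,112.61\n"
)

data = numpy.genfromtxt(sample, delimiter=",", skip_header=1)
dates = data[:, 0]
print(dates)  # "2018.01.02" is not a valid float, so the column is nan

first = str(dates[0])         # text form of the first date value
year_text = first[0:-2][0:4]  # same slicing as in read_data
print(repr(first), repr(year_text))

message = None
try:
    int(year_text)
except ValueError as err:
    message = str(err)
print(message)
```

If the printed dates column in your own run shows nan values, the fix belongs in how the file is read, not in the slicing.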

What exactly are you trying to pull out in that list comprehension? It looks like sort of a tuple of date information, but a bit more information about what you're trying to pull out, or comments in the code explaining the logic more clearly would help us get you to the solution.

EDIT: If you use pandas.Timestamp, it may do all that conversion for you. Now that I look at the code, it looks like you're just trying to pull out the day of the week and the month. It may not work if it can't convert the timestamp for you, but it's pretty likely that it would. A small sample of the CSV data you're using would confirm easily enough.

days = numpy.array([pandas.Timestamp(date).weekday() for date in dates])
months = numpy.array([pandas.Timestamp(date).month for date in dates])
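If the dates turn out to be text rather than plain numbers, a sketch along these lines may be more robust. It reads the date column as strings and lets pandas.to_datetime parse them with an explicit format. The sample rows and the "%Y.%m.%d" format are assumptions and would need adjusting to the real file. One caveat worth checking: as far as I know, pandas.Timestamp treats a bare number as an offset from the Unix epoch, so a float like 20180102.0 would not be read as a calendar date.

```python
import io

import numpy
import pandas

# Made-up sample rows; adjust the format string to match the real CSV.
sample = io.StringIO(
    "date,time,open\n"
    "2018.01.02,00:00,112.60\n"
    "2018.01.03,00:00,112.30\n"
)

# Read only the first column and keep it as text instead of float.
date_text = numpy.genfromtxt(
    sample, delimiter=",", skip_header=1, usecols=0, dtype=str
)

# Parse with an explicit format so malformed rows fail loudly.
parsed = pandas.to_datetime(pandas.Series(date_text), format="%Y.%m.%d")
days = parsed.dt.weekday.to_numpy()   # Monday is 0
months = parsed.dt.month.to_numpy()
print(days, months)
```

An explicit format raises an error on a row that doesn't match, which is usually easier to debug than a silent nan.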

bubthegreat
  • Thank you for this answer and telling me what you think this thing is doing. I actually didn't know. Your help with this problem is helping my incremental python education today. The data is by the minute USDJPY numbers. http://www.histdata.com/download-free-forex-historical-data/?/metatrader/1-minute-bar-quotes/USDJPY – Ant Feb 26 '18 at 00:18
  • The script is looking for a _params.npz file now so I guess this solved my error and is on to the next one. Now if I could just figure out why I need an npz params file in addition to my csv file. The journey continues. – Ant Feb 26 '18 at 01:47
  • It looks like by default it's looking for: params_file='params.npz' in the "read_data" function from forex.read_data - which is calling numpy.savez. It looks like the file it's looking for is specific to the numpy.savez function, so reading up on that might point you in the right direction. https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html – bubthegreat Feb 26 '18 at 03:33
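For reference, a minimal round trip through numpy.savez and numpy.load looks roughly like this. The array names mean and std, their values, and the temporary file location are illustrative assumptions; the repository's actual params file may store different keys.

```python
import os
import tempfile

import numpy

# Illustrative values only; not taken from the repository.
mean = numpy.array([112.5])
std = numpy.array([0.8])

# Write both arrays into a single .npz archive under keyword names.
path = os.path.join(tempfile.mkdtemp(), "params.npz")
numpy.savez(path, mean=mean, std=std)

# Load the archive back and look the arrays up by those names.
with numpy.load(path) as params:
    keys = sorted(params.files)
    loaded_mean = params["mean"]
    loaded_std = params["std"]
print(keys, loaded_mean, loaded_std)
```

If the script only ever loads the file, it was probably written by an earlier training or preprocessing step, so it is worth searching the repository for the matching savez call.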