
I have a dataset with four inputs and am trying to predict the next value of X1 from a combination of the previous input values.

My four inputs are X1, X2, X3 and X4.

So I am trying to predict the next value of X1. The four inputs combine to affect the next X1 as follows:

X1 + X2 - X3 -X4

I wrote this code inside the class, then the code to run the LSTM, and after that the code to predict the value. It then gave me this error. Can anyone help me solve this problem?

My code:

def model_predict(data):
    pred=[]
    for index, row in data.iterrows():
        val = row['X1']
        if np.isnan(val):
            data.iloc[index]['X1'] = pred[-1]
            row['X1'] = pred[-1]
            f = row['X1','X2','X3','X4']
            s = row['X1'] - row['X2'] + row['X3'] -row['X4']
            val = model.predict(s)
            pred.append(val)
    return np.array(pred)

After the LSTM code, I wrote the code to predict the value:

pred = model_predict(x_test_n)

It gave me this error:

---> 5 pred = model_predict(x_test_n)

      def model_predict(data):
          pred=[]
  -->     for index, row in data.iterrows():
              val = row['X1']
              if np.isnan(val):

AttributeError: 'numpy.ndarray' object has no attribute 'iterrows'
team
  • This error speaks for itself, you need to transform `numpy.ndarray` to `pandas.DataFrame` – Michael Jul 27 '19 at 19:39
  • @MichaelO. First of all, thank you for the fast reply. I didn't get what you are saying. Could you explain a little more with code, if that's okay? – team Jul 27 '19 at 19:43
  • What is `x_test_n`? I guess it is a `numpy.ndarray`. If you want to handle it as a Pandas DataFrame, you need to convert it first, for example as described here: https://stackoverflow.com/questions/20763012/creating-a-pandas-dataframe-from-a-numpy-array-how-do-i-specify-the-index-colum – Michael Jul 27 '19 at 20:02
  • @MichaelO. It's my test set, used to predict the next value – team Jul 28 '19 at 16:47
  • @MichaelO. I converted it to a pd DataFrame inside the class and it gave me an error. The code is "data = pd.DataFrame(data=data[1:,1:],index=data[1,:])" and the error is "Must pass 2-d input" – team Jul 28 '19 at 17:05
  • In one place you have *X1 + X2 - X3 - X4*, but later *row['X1'] - row['X2'] + row['X3'] - row['X4']*. Make up your mind where there should be **+** and where **-**. – Valdi_Bo Jul 29 '19 at 16:31

1 Answer


Apparently, the data argument of your function is a NumPy array, not a DataFrame. As an np.ndarray, it also has no named columns.

One possible solution, keeping the argument as an np.ndarray, is to:

  • iterate over rows of this array using np.apply_along_axis(),
  • refer to columns by indices (instead of names).

Another solution is to create a DataFrame from data, set proper column names and iterate over its rows.

One possible way to write the code without a DataFrame

Assume that data is a NumPy array with 4 columns, containing X1, X2, X3 and X4 respectively:
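The DataFrame route could look roughly like the sketch below. It assumes that the test array has exactly four columns in the order X1, X2, X3, X4 (this is not confirmed in the question) and uses a small made-up array in place of the real x_test_n.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for x_test_n: 4 columns assumed to be X1, X2, X3, X4.
x_test_n = np.array([[1, 2, 3, 4],
                     [10, 8, 1, 3],
                     [20, 6, 2, 5],
                     [31, 3, 3, 1]], dtype=float)

# Wrap the array in a DataFrame and give the columns names,
# so that iterrows() and row['X1'] become available.
df = pd.DataFrame(x_test_n, columns=['X1', 'X2', 'X3', 'X4'])

combined = []
for index, row in df.iterrows():
    # Same combination as stated in the question: X1 + X2 - X3 - X4
    combined.append(row['X1'] + row['X2'] - row['X3'] - row['X4'])

print(combined)
```

The default integer index is used here; whether your data needs a different index is something only you can tell from your data.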

[[ 1  2  3  4]
 [10  8  1  3]
 [20  6  2  5]
 [31  3  3  1]]

Then your function can be:

def model_predict(data):
    s = np.apply_along_axis(lambda row: row[0] + row[1] - row[2] - row[3],
        axis=1, arr=data)
    return model.predict(s)

Note that:

  • s (all input values to your model) can be computed in a single instruction: apply_along_axis applies the function to each row (axis=1),
  • the predictions can also be computed "all at once", passing a NumPy vector - just s.

For demonstration purposes, compute s and print it.
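A minimal way to try this out, using the 4-row sample array from above and leaving the model out entirely:

```python
import numpy as np

# Sample array from above; columns are X1, X2, X3, X4.
data = np.array([[1, 2, 3, 4],
                 [10, 8, 1, 3],
                 [20, 6, 2, 5],
                 [31, 3, 3, 1]])

# X1 + X2 - X3 - X4, evaluated once per row (axis=1).
s = np.apply_along_axis(lambda row: row[0] + row[1] - row[2] - row[3],
                        axis=1, arr=data)
print(s.tolist())  # [-4, 14, 19, 30]
print(s.shape)     # (4,)

# The same result with plain column slicing, without apply_along_axis.
s_vec = data[:, 0] + data[:, 1] - data[:, 2] - data[:, 3]
```

The column-slicing form is only an alternative spelling of the same computation; either one yields a one-dimensional vector with one value per row.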

Valdi_Bo
  • So I have to write it inside the class, don't I? If so, I wrote it and it gave me the error "Must pass 2-d input". The code I wrote is "data = pd.DataFrame(data=data[1:,1:],index=data[1,:])" – team Jul 28 '19 at 17:03
  • You wrote *data=data[1:,1:],index=data[1,:]*. It actually means: 1. Drop the first row and column from *data*. 2. Use the second row of *data* (*data[1,:]*) as the index. Do you really want this? I also noticed that you failed to pass column names, so the default names will be consecutive numbers from 0 and your code will still break on an attempt to access a non-existing column. Another hint: the error message shows that you passed only a **vector**, not an **array**. Make a test printout of your data and check whether it contains all needed columns and more than 1 row. – Valdi_Bo Jul 29 '19 at 06:19
  • Thank you for the fast reply, I understand what you are saying. I am actually stuck with this code and can't move forward because of this error. I don't have a clear idea of how to change the code to make it run. If you are okay with helping me with the code, it would really help me move forward. – team Jul 29 '19 at 06:35
  • Before you call *model_predict*, make a printout of *data* and add a sample of it to your question. Then: 1. Decide which columns contain values for X1, X2, X3 and X4. 2. In *model_predict* create a DataFrame from *data*, passing column names (maybe the default index will be enough). 3. The loop using *iterrows* should run on rows from just this DataFrame. – Valdi_Bo Jul 29 '19 at 08:33
  • Two questions about your code: 1. Your loop contains *f = row['X1','X2','X3','X4']*. What is the use of this variable? 2. What is the content (and type) of *model*? – Valdi_Bo Jul 29 '19 at 08:40
  • And a hint about your code: if you refer to multiple columns from a row, then the column names should be in **double** brackets. Otherwise you will probably get an execution error. – Valdi_Bo Jul 29 '19 at 08:47
  • 1. f = row['X1','X2','X3','X4']: these are my four inputs, and I am trying to predict the X1 value from them. 2. The model I am using is an LSTM – team Jul 29 '19 at 15:57
  • Before writing this code I put my data into a DataFrame: data = pd.DataFrame(data,columns=['X1','X2','X3','X4']) pd.options.display.float_format = '{:,.0f}'.format – team Jul 29 '19 at 16:08
  • @Valdi_Bo Thank you for your great answer. When I applied it, it gave me this error: "Error when checking input: expected lstm_7_input to have 3 dimensions, but got array with shape (1446, 4)" – team Jul 30 '19 at 05:15
  • Note that *s* is a one-dimensional array (shape (n,), not (n, 4)). Maybe the above error message is from some other line of code? Check in the documentation what the input should be in the "set of predictions" case. Then prepare the correct input, taking the pattern from my solution. Read also https://stackoverflow.com/questions/44704435, it should give you a clue. – Valdi_Bo Jul 30 '19 at 06:28
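Regarding the last error in the comments ("expected lstm_7_input to have 3 dimensions, but got array with shape (1446, 4)"): a Keras LSTM layer takes input of shape (samples, timesteps, features), so a 2-D array has to be reshaped before it is passed to predict. The sketch below only shows the reshape. It assumes one timestep per sample and four features, which may not match how the model was actually trained; x_test_n here is a placeholder of zeros with the shape taken from the error message.

```python
import numpy as np

# Placeholder with the shape reported in the error message: (1446, 4).
x_test_n = np.zeros((1446, 4))

# Assumption: the LSTM was trained with 1 timestep and 4 features per step.
timesteps = 1
n_samples, n_features = x_test_n.shape

# Reshape (samples, features) -> (samples, timesteps, features).
x_3d = x_test_n.reshape((n_samples, timesteps, n_features))
print(x_3d.shape)  # (1446, 1, 4)

# pred = model.predict(x_3d)  # model is the trained LSTM, not defined in this sketch
```

If the model was trained on a different window length or number of features, the reshape has to follow the training input shape instead of the values assumed here.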