I am trying to predict stock prices using Python. To do so, I am trying to reshape the dataset into a 2D NumPy array for the `fit` function, using this question as a reference: sklearn Logistic Regression "ValueError: Found array with dim 3. Estimator expected <= 2."

next_day_open_values, nx, ny = next_day_open_values.shape
next_day_open_values = next_day_open_values.reshape((next_day_open_values,nx*ny))
y_normaliser = preprocessing.MinMaxScaler()
y_normaliser.fit((np.expand_dims( next_day_open_values, -1 )))

I have come across this error:

    <ipython-input-42-6ea43c55dc18> in csv_to_dataset(csv_path)
     20 
     21     next_day_open_values, nx, ny = next_day_open_values.shape
---> 22     next_day_open_values = next_day_open_values.reshape((next_day_open_values,nx*ny))
     23     y_normaliser = preprocessing.MinMaxScaler()
     24     y_normaliser.fit((np.expand_dims( next_day_open_values, -1 )))

AttributeError: 'int' object has no attribute 'reshape'

What has gone wrong? How do I fix this? Detailed answers are appreciated.

The code so far is given below (I am using Jupyter notebook):

import keras
from keras.models import Model
from keras.layers import Dense, Dropout, LSTM, Input, Activation
from keras import optimizers
import numpy as np
np.random.seed(4)
import tensorflow
tensorflow.random.set_seed(4)
import pandas as pd
from sklearn import preprocessing
import numpy as np

history_points = 50

def csv_to_dataset(csv_path):
    data = pd.read_csv(csv_path)
    data = data.drop('Date', axis=1)
    data = data.drop(0, axis=0)
    data_normaliser = preprocessing.MinMaxScaler()
    data_normalised = data_normaliser.fit_transform(data)
    # using the last {history_points} open high low close volume data points, predict the next open value
    ohlcv_histories_normalised =      np.array([data_normalised[i  : i + history_points].copy() for i in range(len(data_normalised) - history_points)])
    next_day_open_values_normalised = np.array([data_normalised[:,0][i + history_points].copy() for i in range(len(data_normalised) - history_points)])
    next_day_open_values_normalised = np.expand_dims(next_day_open_values_normalised, -1)

    next_day_open_values = np.array([data.iloc[:,0][i + history_points].copy() for i in range(len(data) - history_points)])
    next_day_open_values = np.expand_dims(next_day_open_values_normalised, axis=-1)

    next_day_open_values, nx, ny = next_day_open_values.shape
    next_day_open_values = next_day_open_values.reshape((next_day_open_values,nx*ny))
    y_normaliser = preprocessing.MinMaxScaler()
    y_normaliser.fit((np.expand_dims( next_day_open_values, -1 )))

    assert ohlcv_histories_normalised.shape[0] == next_day_open_values_normalised.shape[0]
    return ohlcv_histories_normalised, next_day_open_values_normalised, next_day_open_values, y_normaliser
#dataset
ohlcv_histories, next_day_open_values, unscaled_y, y_normaliser = csv_to_dataset('AMZN1.csv')

test_split = 0.9 # the fraction of data used for training; the remainder is used for testing
n = int(ohlcv_histories.shape[0] * test_split)

# splitting the dataset up into train and test sets

ohlcv_train = ohlcv_histories[:n]
y_train = next_day_open_values[:n]

ohlcv_test = ohlcv_histories[n:]
y_test = next_day_open_values[n:]

unscaled_y_test = unscaled_y[n:]

Feel free to correct/edit this.

Thanks

Ardoise012
  • Here `next_day_open_values, nx, ny = next_day_open_values.shape` you assigned `next_day_open_values` to an `int` (which I assume was previously an `ndarray`). Then you continue to try to use `next_day_open_values` as an `ndarray`. – Iguananaut Feb 10 '20 at 14:44
  • Why do you reassign `next_day_open_values` to one of its `shape` elements? – hpaulj Feb 10 '20 at 16:56

1 Answer

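You posted a lot of code, but it boils down to a single issue. The line `next_day_open_values, nx, ny = next_day_open_values.shape` unpacks three integers from the shape tuple and rebinds the name `next_day_open_values` to the first of them, so from that point on the name refers to an `int`, not to the array. The next line then calls `.reshape` on that integer, which is what raises the `AttributeError`. NumPy's reshape expects an array as input, not an integer or single value.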

Parameters: numpy.reshape(a, newshape, order='C')

a : array_like - Array to be reshaped.

I doubt you are trying to build a vector that repeats a single integer to fill an `nx*ny` shape. Furthermore, if you wrapped that integer in an array and performed the same operation, you would run into a `ValueError`, because an array of size 1 generally cannot be reshaped into a larger shape.
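To make the two failure modes above concrete, here is a minimal sketch. The array `values` and its shape `(4, 1, 1)` are made up for illustration and are not the asker's real data; the names `shadowed`, `n_samples` and `flat` are hypothetical as well.

```python
import numpy as np

# Hypothetical stand-in for the real data: 4 samples, each of shape (1, 1).
values = np.arange(4, dtype=float).reshape(4, 1, 1)

# Buggy pattern: the array's own name is reused as an unpacking target,
# so after this line `shadowed` is the int 4, not the array.
shadowed = values
shadowed, nx, ny = shadowed.shape
try:
    shadowed.reshape((shadowed, nx * ny))
    attribute_error = None
except AttributeError as exc:
    attribute_error = str(exc)
print(attribute_error)

# Wrapping the int in an array does not help: a size-1 array
# cannot be reshaped into a (4, 1) array.
try:
    np.array(shadowed).reshape((shadowed, nx * ny))
    value_error_raised = False
except ValueError:
    value_error_raised = True
print(value_error_raised)

# Fixed pattern: unpack the shape into fresh names and keep the array intact.
n_samples, nx, ny = values.shape
flat = values.reshape((n_samples, nx * ny))
print(flat.shape)
```

The only change in the fixed pattern is the name on the left-hand side of the unpacking, so the array is still available when `reshape` is called.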

I believe this might work, but I don't know what next_day_open_values is:

next_day_open_values_s, nx, ny = next_day_open_values.shape
next_day_open_values = next_day_open_values.reshape(next_day_open_values_s,nx*ny)
Celius Stingher
  • next_day_open_values is the stock price for the very next day. I intend to predict the "open price" on the next day. As for the solution you have provided, it gives: ValueError: Found array with dim 3. MinMaxScaler expected <= 2. We are back at square one. – Ardoise012 Feb 10 '20 at 14:49
  • Can you print the output of `next_day_open_values.shape` after using the 2 lines of code I posted? – Celius Stingher Feb 10 '20 at 14:54
  • Unfortunately, no. This is the error I get after using the code from your most recent edit: ValueError: Found array with dim 3. MinMaxScaler expected <= 2. – Ardoise012 Feb 10 '20 at 14:57
  • I'm asking you to run `print(next_day_open_values.shape)` please, not the full code where `MinMaxScaler` is mentioned. – Celius Stingher Feb 10 '20 at 14:58
  • I cannot print the output. – Ardoise012 Feb 10 '20 at 15:04
  • I'm not sure I understand why, but try adding this extra line after the 2 lines I posted: `next_day_open_values = next_day_open_values.reshape(-1,1)` – Celius Stingher Feb 10 '20 at 15:16
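Tracing the array shapes through the lines quoted in the question may explain why the "dim 3" error comes back even after the reshape. The sketch below is an illustration, not the asker's actual run: the 6-sample array is invented, and `shape_3d`, `shape_2d`, `shape_refit` and `column` are hypothetical names. It assumes, as the posted code suggests, that the scaled targets have shape `(n, 1)` before the second `np.expand_dims`.

```python
import numpy as np

# Hypothetical stand-in for the scaled targets: 6 samples, shape (6, 1).
next_day_open_values_normalised = np.linspace(0.0, 1.0, 6).reshape(-1, 1)

# As in the question: a second expand_dims adds a third axis.
next_day_open_values = np.expand_dims(next_day_open_values_normalised, axis=-1)
shape_3d = next_day_open_values.shape
print(shape_3d)

# The two lines from the answer flatten it back to 2-D.
n_samples, nx, ny = next_day_open_values.shape
next_day_open_values = next_day_open_values.reshape(n_samples, nx * ny)
shape_2d = next_day_open_values.shape
print(shape_2d)

# The fit call in the question wraps its argument in expand_dims once more,
# so the scaler would again receive a 3-D array.
shape_refit = np.expand_dims(next_day_open_values, -1).shape
print(shape_refit)

# The reshape from the last comment yields a 2-D column of shape (n, 1).
column = next_day_open_values.reshape(-1, 1)
print(column.shape, column.ndim)
```

If this reading is right, the remaining step would be to pass the 2-D array to `y_normaliser.fit(...)` directly, without the extra `np.expand_dims`, since `MinMaxScaler` accepts input of shape `(n_samples, n_features)`.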