
I have two large NumPy arrays that I need to concatenate, but np.concatenate uses almost all of my memory, so I can't do anything afterwards. I also tried this variant:

X = np.zeros((x.shape[0], discourse_type.shape[-1]+x.shape[-1]))

for i in range(X.shape[0]):
    X[i][:x.shape[-1]] = x[i][0]
    X[i][x.shape[-1]:] = discourse_type[i][0]

But this variant also uses almost all of my memory. Is there a way to concatenate the arrays without consuming nearly all of the memory?

Here is what I did before that:

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
x = vectorizer.fit_transform(train['discourse_text'])

x = x.toarray()[:,np.newaxis,:]
shape = x.shape[-1]

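def str_to_numeric(df, column):
    # One-hot encode a string column: one row per sample, one column per class.
    classes = df[column].unique()
    dict_types = {label: i for i, label in enumerate(classes)}
    y = np.zeros((len(df.index), len(classes)))

    # Use positional row numbers so a non-default DataFrame index still works.
    for row, label in enumerate(df[column]):
        y[row][dict_types[label]] = 1

    return y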


discourse_type = str_to_numeric(train, 'discourse_type')[:, np.newaxis, :]
y = str_to_numeric(train, 'discourse_effectiveness')

# X = np.concatenate((discourse_type, x), axis=-1)


X = np.zeros((x.shape[0], discourse_type.shape[-1]+x.shape[-1]))

for i in range(X.shape[0]):
    X[i][:x.shape[-1]] = x[i][0]
    X[i][x.shape[-1]:] = discourse_type[i][0]

Or is there something I could do differently in the earlier steps?
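
As a rough check on why the preallocated loop cannot use less memory than np.concatenate: np.zeros defaults to float64 (8 bytes per element), so the result alone needs rows × columns × 8 bytes, on top of the dense array already created by x.toarray(). The sketch below only does that arithmetic. The row count and vocabulary size are hypothetical placeholders, not values from this dataset, and dense_nbytes is a helper name made up for illustration.

```python
import numpy as np

def dense_nbytes(n_rows, n_cols, dtype=np.float64):
    """Bytes needed for a dense (n_rows, n_cols) array of the given dtype."""
    return n_rows * n_cols * np.dtype(dtype).itemsize

# Hypothetical sizes, chosen only to illustrate the arithmetic.
n_rows, vocab_size, n_classes = 30_000, 20_000, 7
total_cols = vocab_size + n_classes

# Size of the concatenated result with the default float64 dtype.
print(dense_nbytes(n_rows, total_cols) / 1e9, "GB as float64")

# The same shape with a 1-byte dtype, if the values fit in 0..255.
print(dense_nbytes(n_rows, total_cols, np.uint8) / 1e9, "GB as uint8")
```

A smaller dtype shrinks the result by a constant factor, but the array is still fully dense.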

  • "but NumPy function 'concatenate' eats up almost all of my memory". So that is the problem: I can't use this function because I don't have that much free memory. – daniltomashi Jul 09 '22 at 07:50
  • If you want to end up with an array of the size you're using, that's going to use up the same amount of memory regardless of how you go about creating it. There exist sparse array implementations, but they only save memory in certain circumstances (e.g. where most of the array is large blocks of zeros). Can you instead change your algorithm to not require the huge array? – Blckknght Jul 09 '22 at 08:18
  • Look, I have a neural network and two columns of values used to predict a third column. The first column contains sentences, which I need to replace with a very long array of 0s and 1s. The second column contains one of seven classes, and the only thing I can think of is to replace it with a vector of zeros and a single 1 that represents the class. Then the only thing I can think of is to concatenate these two arrays into one to feed as input to the neural network. Maybe there is another way, but I can't come up with one. – daniltomashi Jul 09 '22 at 08:37
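
One way to act on the sparse-array suggestion in the comments, as a minimal sketch rather than a definitive fix: CountVectorizer.fit_transform already returns a SciPy sparse matrix, so skipping .toarray() and joining the columns with scipy.sparse.hstack avoids building the large dense array. The toy DataFrame below is made up for illustration and only reuses the column names from the question. Using pd.factorize to build the one-hot columns is an assumption, not the asker's method.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-in for the real `train` DataFrame (hypothetical data).
train = pd.DataFrame({
    "discourse_text": ["the cat sat", "dogs bark loudly", "the dog sat"],
    "discourse_type": ["Claim", "Evidence", "Claim"],
})

# Keep the bag-of-words matrix sparse: no .toarray() here.
vectorizer = CountVectorizer()
x = vectorizer.fit_transform(train["discourse_text"])

# Build the one-hot class columns directly as a sparse matrix.
codes, categories = pd.factorize(train["discourse_type"])
n = len(codes)
one_hot = sparse.csr_matrix(
    (np.ones(n, dtype=np.int64), (np.arange(n), codes)),
    shape=(n, len(categories)),
)

# Column-wise concatenation that stays sparse (text features first,
# class columns last, matching the loop in the question).
X = sparse.hstack([x, one_hot], format="csr")
print(X.shape, sparse.issparse(X))
```

If the network needs dense input, one option is to densify only a slice at a time, for example X[start:stop].toarray() per batch, so the full dense matrix never exists in memory at once.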
