
I am trying to iterate through a CSV file and create a NumPy array for each row, where the first column holds the x-coordinates and the second column holds the y-coordinates. I then want to append each array to a master array and return it.

import numpy as np 

thedoc = open("data.csv")
headers = thedoc.readline()


def generatingArray(thedoc):
    masterArray = np.array([])

    for numbers in thedoc: 
        editDocument = numbers.strip().split(",")
        x = editDocument[0]
        y = editDocument[1]
        createdArray = np.array((x, y))
        masterArray = np.append([createdArray])


    return masterArray


print(generatingArray(thedoc))

I am hoping to see an array with all the CSV info in it. Instead, I receive an error: "append() missing 1 required positional argument: 'values'". Any help on where my error is and how to fix it is greatly appreciated!

SuperKogito
Dyland
  • `numpy.append`, unlike `list.append`, is not an in-place operation. Pass the array to append to as well: `numpy.append(ind, i)` – Myjab Apr 05 '19 at 00:31
  • Refer to [this doc](https://docs.scipy.org/doc/numpy/reference/generated/numpy.append.html) – Myjab Apr 05 '19 at 00:32
  • Thanks very much for the comment. When I change masterArray = np.append([createdArray]) to np.append(masterArray, createdArray) all it returns is [ ]. Any suggestion on why this is now happening? – Dyland Apr 05 '19 at 00:34
  • check this [answer](https://stackoverflow.com/questions/22392497/how-to-add-a-new-row-to-an-empty-numpy-array) – Myjab Apr 05 '19 at 00:44
  • Don't use `np.append`. Use list append, and make the array at the end. – hpaulj Apr 05 '19 at 00:44
  • If I'm understanding what you're saying correctly, you mean to append to a list and then from the completed list I should make an array? – Dyland Apr 05 '19 at 00:45
  • repeat of https://stackoverflow.com/q/55524396 – hpaulj Apr 05 '19 at 00:48
  • @Dyland yes, that is generally a better way to do it. Best is to not do this at all and instead read the entire file into a numpy array or a pandas dataframe to begin with. – alkasm Apr 05 '19 at 00:49

1 Answer

NumPy arrays don't grow in place the way Python lists do. An array has a fixed size, so you need to allocate the space for it up front, where you currently write "masterArray = np.array([])", before you fill it in.
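
As a rough illustration of what the comments above describe (the values here are made up): `np.append` returns a new array and leaves its first argument untouched, and when no `axis` is given it flattens the result, so the (x, y) pairing has to be restored afterwards.

```python
import numpy as np

master = np.array([])
row = np.array([1.0, 2.0])

# np.append does not modify `master`; the returned array is discarded here.
np.append(master, row)
size_without_assignment = master.size  # still 0, which prints as []

# Assigning the result back is what actually "grows" the array.
master = np.append(master, row)
master = np.append(master, np.array([3.0, 4.0]))

# With no axis argument the inputs are flattened into one 1-D array.
flat_shape = master.shape  # (4,)

# Reshaping recovers one row per (x, y) pair.
pairs = master.reshape(-1, 2)
print(pairs)
```

Each call copies the whole array, which is why the comments suggest collecting rows in a list instead.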

The best answer is to import the file directly into a NumPy array using something like genfromtxt (https://docs.scipy.org/doc/numpy-1.10.1/user/basics.io.genfromtxt.html).
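
A minimal sketch of the direct import, assuming a two-column, comma-separated file with a single header line, as in the question. The sample data and the temporary file are invented here so the snippet runs on its own; with a real file you would pass "data.csv" instead of `path`.

```python
import os
import tempfile

import numpy as np

# Write a small sample file so the example is self-contained (made-up data).
sample = "x,y\n1.0,2.0\n3.5,4.5\n5.0,6.0\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write(sample)
    path = f.name

# delimiter="," splits each line on commas; skip_header=1 skips the header row.
master_array = np.genfromtxt(path, delimiter=",", skip_header=1)
os.remove(path)

print(master_array.shape)  # (3, 2): one row per line, columns are x and y
```

`np.loadtxt(path, delimiter=",", skiprows=1)` is a similar option for files with no missing values.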

If you would rather keep the loop and you know the number of lines you're reading in, or can count them with something like this:

with open("data.csv") as f:
    file_length = sum(1 for _ in f) - 1  # subtract 1 for the header row

Then you can preallocate the numpy array to do something like this:

masterArray = np.empty((file_length, 2))

for i, numbers in enumerate(thedoc):
    editDocument = numbers.strip().split(",")
    x = float(editDocument[0])  # convert the text fields to numbers
    y = float(editDocument[1])
    masterArray[i] = [x, y]

I would recommend the first method, but if you're lazy you can always just build a Python list and then make a NumPy array from it at the end.

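def generatingArray(thedoc):
    masterArray = []

    for numbers in thedoc:
        editDocument = numbers.strip().split(",")
        x = float(editDocument[0])  # convert the text fields to numbers
        y = float(editDocument[1])
        masterArray.append([x, y])

    # Build the array once, after all rows are collected
    return np.array(masterArray)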
Fish11
  • `genfromtxt` and `loadtxt` use the list append approach. Pretty well have to because they don't know ahead of time the number of rows. – hpaulj Apr 05 '19 at 01:15