I have multiple files with a lot of data and 19 columns each. I am trying to use nested for-loops and set array elements equal to the first column, second column, etc. of the files.

import numpy as np
import glob
import pandas as pd

#

lat=np.zeros(90)
long=np.zeros(180)
indat=np.zeros(19)

#

file_in = glob.glob('filenames*.dat')
for a in range(140):
   for i in range (90):
       for j in range (180):
            df = pd.DataFrame()
            for f in file_in:
                cols = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18] #there are nineteen columns 
                indat = df.append(pd.read_csv(f, delimiter='\\s+', header=None, usecols=cols, skiprows=4), ignore_index=True)
                lat[i]=indat[0] # error here
                long[j]=indat[1]
               #updates some code here
                if i >=70:
                   dens[a,j,i-70]=indat[2]

It gave me this error: ValueError: setting an array element with a sequence.

Updates:

indat has 19 columns. There are many files, but they all have the same format.

Sample indat

#columns
#0   1    2      3 ..... 19 
-90  0   2e-12  #just some number
-90  2   3e-12  #just some number
-90  4   4e-12  #just some number
...
-90  360 1e-12  #just some number  
-88  0   1e-11  #just some number
-88  2   2e-11  #just some number
-88  4   3e-11  #just some number
...
-88  360 4e-11  #just some number 
...
90   0   2.5e-12  #just some number
90   2   3.5e-11  #just some number
90   4   4.5e-12  #just some number
...
90   360 1.5e-12  #just some number 

EDIT: I cleaned the code up based on everyone's suggestions

import numpy as np
import glob
import pandas as pd

file_in = glob.glob('filenames*.dat')
df = pd.DataFrame()
for f in file_in:
    cols = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
    indat = pd.read_csv(f, delimiter='\\s+', header=None, usecols=cols, skiprows=4)

for a in range(140):
   for i in range (90):
       for j in range (180):
           lat[i]=indat[0] # error here
           long[j]=indat[1]
           if i >=70:
              dens[a,j,i-70]=indat[2]

  • Hi, perhaps this might be of interest https://stackoverflow.com/questions/4674473/valueerror-setting-an-array-element-with-a-sequence – IronMan Sep 19 '19 at 01:00
  • It is not clear what you are trying to achieve. I have a few questions about the code above: 1. Why are you reading each file multiple times in the loop? 2. From each file you are only using the 1st and 2nd columns, and you are assigning a Series to an array element, which will replace the values from other files. – Dev Khadka Sep 19 '19 at 11:23
  • [Never call DataFrame.append or pd.concat inside a for-loop. It leads to quadratic copying.](https://stackoverflow.com/a/36489724/1422451) – Parfait Sep 19 '19 at 18:06

1 Answer

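You tried to assign a whole column (a pandas Series), indat[0], to a single element of a NumPy vector, lat[i]. A single element can only hold one number, which is why NumPy raises the ValueError.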
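A minimal sketch of the problem, assuming a tiny made-up DataFrame in place of the real file contents (the three rows are hypothetical). It tries the same kind of assignment as the question and then assigns a single scalar instead:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one file's contents (3 rows, 3 columns).
indat = pd.DataFrame([[-90, 0, 2e-12],
                      [-90, 2, 3e-12],
                      [-90, 4, 4e-12]])
lat = np.zeros(90)

try:
    # indat[0] is a whole column (3 values); lat[0] is one float slot.
    lat[0] = indat[0]
    error_message = None
except (ValueError, TypeError) as exc:
    error_message = str(exc)

# Picking one value out of the column gives a scalar, which fits.
lat[0] = indat[0].iloc[0]
print(error_message)
print(lat[0])
```

The exact wording of the message may differ between NumPy and pandas versions, so the sketch only records that an exception was raised.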

Also, what is the point of indat=np.zeros(19) when you overwrite it with a DataFrame later?

What is the content of indat[0]?

This line of code

indat = df.append(pd.read_csv(f, delimiter='\\s+', header=None, usecols=cols, skiprows=4), ignore_index=True)

is basically the same as

indat = pd.read_csv(f, delimiter='\\s+', header=None, usecols=cols, skiprows=4)

because df never changes, i.e. it is always an empty DataFrame.
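If the goal is to combine all files into one DataFrame, a common pattern is to read each file into a list and call pd.concat once after the loop. Note that DataFrame.append was removed in pandas 2.0, so the original line fails on current versions anyway. The sketch below uses in-memory text buffers as hypothetical stand-ins for the .dat files; real code would pass the paths returned by glob.glob and keep the skiprows and usecols arguments.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for two whitespace-delimited .dat files.
fake_files = [
    io.StringIO("-90 0 2e-12\n-90 2 3e-12\n"),
    io.StringIO("-88 0 1e-11\n-88 2 2e-11\n"),
]

# Read every file first, then concatenate a single time.
frames = [pd.read_csv(f, sep=r"\s+", header=None) for f in fake_files]
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Concatenating once avoids copying the growing DataFrame on every iteration.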

Since the content of indat is unknown, it is difficult to fix your code. If you just want it to run without error, I suggest writing

lat[i] = indat[0].values[0] # take the first value of the column
long[j] = indat[1].values[0] # take the first value of the column
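If the intent is to turn each file into a longitude-by-latitude grid, the loops may not be needed at all. The following is only a sketch under assumptions taken from the sample data: column 0 is latitude, column 1 is longitude, column 2 is the density, and every latitude appears with every longitude. The small DataFrame and the name dens_one_file are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of one file: 3 latitudes x 4 longitudes.
rows = []
for la in (-90, -88, -86):
    for lo in (0, 2, 4, 6):
        rows.append((la, lo, 1e-12 * len(rows)))
indat = pd.DataFrame(rows)

# Axis values, taken from the data instead of preallocated arrays.
lat = np.sort(indat[0].unique())
lon = np.sort(indat[1].unique())

# Rows are longitudes, columns are latitudes, cells are column 2.
grid = indat.pivot(index=1, columns=0, values=2)
dens_one_file = grid.to_numpy()  # shape (len(lon), len(lat))
print(dens_one_file.shape)
```

Here dens_one_file[j, i] holds the value for lon[j] and lat[i], which matches the dens[a, j, i] ordering used in the question; one such array per file could then be stacked along the first axis.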

It would be good to work through a tutorial on NumPy and pandas, since they can be very confusing without some basic understanding.

Jose Vu