I am new to Python and am having trouble handling multiple data files. I want to read each file into its own array: for example, the data in 1c.txt into array c1, the data in 2c.txt into c2, and so on. I tried the following code:

import numpy as np
for i in range(1,15):
     globals()['c%s' % i] = np.loadtxt(['%sc.txt' % i], usecols=(0,1,2))

But it raised IndexError: list index out of range. I changed usecols=(0,1,2) to usecols=(0), but it still didn't work, so I think something else must be wrong.

Also, I found that I could not write ['c%s' % i] on its own to get the variable names c1, c2, etc.; I had to put globals() in front of ['c%s' % i], but I don't know why.

Waiting online. Many thanks!

Aaron
  • Hi dawg, thank you for your reply. But how to make it? – Aaron Jul 02 '14 at 21:55
  • If you start thinking of these files as spreadsheets then you might also think about pandas dataframes rather than dicts or numpy arrays. especially if it's not just numerical data. – Back2Basics Jul 02 '14 at 21:57
  • I found the problem, I should use np.loadtxt('%sc.txt' % i, usecols=(0,1,2)) instead of: np.loadtxt(['%sc.txt' % i], usecols=(0,1,2)) in my code. Thanks for Ajean and you all! – Aaron Jul 02 '14 at 22:15
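To spell out why that fix works: np.loadtxt accepts a filename, but it also accepts a list of strings, which it reads as the lines of a data set. Here is a minimal sketch of the difference. The sample rows are made up for illustration, and the exact exception raised in the failing case seems to depend on the NumPy version, so it is caught broadly.

```python
import numpy as np

# A list of strings is read as the *lines* of a data set, not as filenames.
rows = np.loadtxt(['1 2 3', '4 5 6'], usecols=(0, 1, 2))
print(rows.shape)

# So ['1c.txt'] is parsed as one line of text with a single field;
# asking for columns 1 and 2 of that line cannot succeed.
try:
    np.loadtxt(['1c.txt'], usecols=(0, 1, 2))
    failed = False
except Exception as exc:  # exact exception type varies by NumPy version
    failed = True
    print(type(exc).__name__)
```

Passing the plain string '%sc.txt' % i instead makes loadtxt open the file of that name.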

2 Answers

You should use a Python dict to hold a mapping to arrays:

import numpy as np

dict_of_arrays={}

for i in range(1,15):
    dict_of_arrays['c%i' % i]=np.array([1,2,3])

print(dict_of_arrays)

Prints:

{'c11': array([1, 2, 3]), 'c13': array([1, 2, 3]), 'c9': array([1, 2, 3]), 'c8': array([1, 2, 3]), 'c14': array([1, 2, 3]), 'c12': array([1, 2, 3]), 'c3': array([1, 2, 3]), 'c2': array([1, 2, 3]), 'c1': array([1, 2, 3]), 'c10': array([1, 2, 3]), 'c7': array([1, 2, 3]), 'c6': array([1, 2, 3]), 'c5': array([1, 2, 3]), 'c4': array([1, 2, 3])}

Then access an individual array by its key: for example, dict_of_arrays['c11'] gives the array that would hold the data from file 11c.txt.
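Putting the dict idea together with real files might look like the sketch below. It assumes files named 1c.txt, 2c.txt, ... with three numeric columns; to keep it self-contained it first writes three small stand-in files into a temporary directory, and the names data_dir and arrays are only illustrative.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in data: write small files named 1c.txt, 2c.txt, 3c.txt.
data_dir = tempfile.mkdtemp()
for i in range(1, 4):
    sample = np.full((2, 3), float(i))
    np.savetxt(os.path.join(data_dir, '%dc.txt' % i), sample)

# Load each file into the dict under the key 'c<i>'.
arrays = {}
for i in range(1, 4):
    path = os.path.join(data_dir, '%dc.txt' % i)
    arrays['c%d' % i] = np.loadtxt(path, usecols=(0, 1, 2))

print(sorted(arrays))
print(arrays['c2'])
```

Looping over arrays.items() then gives every name and array pair without any generated variable names.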

dawg
  • That does not change the basic concept that you use a dict to manage multiple data items -- don't try and make variable names by using `globals()` – dawg Jul 02 '14 at 22:16
  • would you tell me why? – Aaron Jul 02 '14 at 22:34
  • A) It is slower than using a local dict; B) It is confusing to others that read your code; C) It is sloppy; D) [It will bite you](http://www.diveintopython.net/html_processing/locals_and_globals.html) – dawg Jul 02 '14 at 22:38
  • Thank you! I wish I could mark two best answers! : ) – Aaron Jul 02 '14 at 22:48

Well, I can answer at least some of those questions.

I found I could not use the code as: ['c%s' % i] = np.loadtxt(['%sc.txt' % i]

That is because ['c%i' % i] on its own is a list containing one string, not a variable. With globals()[string] you are instead indexing (and assigning into) a dictionary, namely the one returned by globals(). I highly recommend NOT using globals()!
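A small sketch of that distinction, using made-up names (as_list, names) purely for illustration: the same square brackets build a list when they stand alone, and index a dictionary when they follow one.

```python
i = 3

# Square brackets on their own build a list holding one string.
as_list = ['c%i' % i]
print(as_list)

# globals() returns an ordinary dict, which is why subscripting it works.
print(isinstance(globals(), dict))

# The same subscript syntax on a dict you own, without touching globals().
names = {}
names['c%i' % i] = 'some data'
print(names)
```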

Do something like:

import numpy as np

mydict = {}
for i in range(1,15):
    # Files are named 1c.txt, 2c.txt, ... so the number comes first.
    mydict['c%i' % i] = np.loadtxt('%ic.txt' % i, usecols=(0,1,2))

I also notice that you are using %s where %i would be the more natural choice in your formatting. %s works here because it converts any value with str(), but %i states that your variable i is expected to be an integer.
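For what it's worth, here is a quick check of how the two specifiers behave. For an integer they give the same filename; the difference only shows up when the value is not a number. The value 7 and the string 'seven' are arbitrary examples.

```python
i = 7

with_s = '%sc.txt' % i   # %s calls str() on whatever it is given
with_i = '%ic.txt' % i   # %i requires a number
print(with_s, with_i)

# %i rejects a non-numeric value, which can catch mistakes early.
try:
    '%ic.txt' % 'seven'
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```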

Ajean
  • Probably others can expound on it better than I can, but it has to do with the namespaces and how variables are scoped. It's a general programming practice that global variables are best avoided when possible. In addition, trying to work directly with the enormous dict that is globals() will just make you work ten times harder than you have to... – Ajean Jul 03 '14 at 00:10
  • And.... I just saw the excellent comments @dawg made on his just-as-useful answer. Exactly! – Ajean Jul 03 '14 at 00:11