
Suppose you have a data file which includes several data sets separated by the string "--" in the following format:

--
<x0_val> <y0_val>
<x1_val> <y1_val>
<x2_val> <y2_val>
--
<x0_val> <y0_val>
<x1_val> <y1_val>
<x2_val> <y2_val>
...

How can you read the whole file into an array of arrays, so that you can afterwards plot all data sets in the same figure with a for loop over the outer array?

genfromtxt('data.dat', delimiter=("--"))

gives lots of

Line #1550 (got 1 columns instead of 2)
tshepang
ritter
  • See: http://stackoverflow.com/questions/3518778/how-to-read-csv-into-record-array-in-numpy – oz123 Sep 11 '12 at 10:29
  • See [that](http://stackoverflow.com/questions/12223965/how-to-parse-block-data-from-a-text-file-into-an-2d-array-in-python/12227618#12227618) – Pierre GM Sep 11 '12 at 10:30
  • [What have you tried](http://whathaveyoutried.com)? You can use split with two delimiters (first `--`, then `' '`). – Andy Hayden Sep 11 '12 at 10:31

2 Answers


I will update ...

I would first split the file into multiple pieces, which can live in memory as objects or on the filesystem as new files.

You can locate the `--` separator string with the `re` module.

Then you can use the link I posted above.
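The splitting step can be sketched as follows, keeping the pieces in memory rather than writing new files. This is only a minimal sketch, not the answerer's code: the helper name `read_blocks`, the regular expression, the sample data, and the choice of `np.loadtxt` to parse each piece are all assumptions made for illustration.

```python
import io
import re

import numpy as np


def read_blocks(text):
    """Split text on lines holding only "--" and parse each piece."""
    # Hypothetical helper: a separator is a line that contains just "--".
    pieces = re.split(r"^--[ \t]*$", text, flags=re.MULTILINE)
    blocks = []
    for piece in pieces:
        if piece.strip():  # skip the empty piece before the first separator
            # ndmin=2 keeps a block with a single row two-dimensional
            blocks.append(np.loadtxt(io.StringIO(piece), ndmin=2))
    return blocks


# Made-up sample in the format from the question, including negative values
sample = "--\n0 1\n1 2\n2 3\n--\n0 -4\n1 -5\n2 -6\n"
blocks = read_blocks(sample)
```

Because the pattern only matches a line that consists of `--`, negative numbers inside a data line should not be mistaken for separators.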

oz123

If you're 100% certain that you have no negative values in your file, you can try a quick:

np.genfromtxt(your_file, comments="-")

The `comments="-"` argument forces `genfromtxt` to ignore every character after a `-`, which of course will give weird results if you have negative values. Moreover, the result will just be all your data sets lumped together in a single array.
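To illustrate the "single lump" behaviour, here is a small sketch on made-up data without negative values. The sample string and the use of `io.StringIO` in place of a real file are assumptions for the sake of a self-contained example.

```python
import io

import numpy as np

# Made-up sample: two blocks of three rows each, no negative values
sample = "--\n0 1\n1 2\n2 3\n--\n0 4\n1 5\n2 6\n"

# Everything from "-" to the end of a line is treated as a comment,
# so the separator lines end up empty and are skipped.
data = np.genfromtxt(io.StringIO(sample), comments="-")
print(data.shape)
```

All six rows land in one two-column array, so the information about where one data set ends and the next begins is lost.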

Otherwise, the safest route is to iterate over your file and store the lines that do not match `--` in one list per block, something along these lines:

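import numpy as np

your_file = "data.dat"  # path to the data file
blocks = []
current = []
with open(your_file) as f:
    for line in f:
        if line.startswith("--"):
            # separator line: close the block collected so far
            blocks.append(np.array(current))
            current = []
        elif line.strip():
            # data line: store the x, y pair as one row of floats
            current.append([float(v) for v in line.split()])
blocks.append(np.array(current))  # the last block has no closing "--"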

You may have to get rid of the first block if empty.
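Dropping empty blocks and then looping over the rest for plotting could look roughly like this. It is a sketch under assumptions: the helper names `drop_empty` and `plot_blocks` are hypothetical, each non-empty block is assumed to be a two-column float array, and matplotlib is assumed to be installed for the plotting part.

```python
import numpy as np


def drop_empty(blocks):
    """Return only the blocks that contain at least one value."""
    return [b for b in blocks if b.size > 0]


def plot_blocks(blocks):
    """Draw every (x, y) block into the same figure."""
    import matplotlib.pyplot as plt  # assumed to be available

    for block in blocks:
        plt.plot(block[:, 0], block[:, 1])
    plt.show()


# Made-up example: an empty leading block followed by two data sets
blocks = [
    np.array([]),
    np.array([[0.0, 1.0], [1.0, 2.0]]),
    np.array([[0.0, 4.0], [1.0, 5.0]]),
]
blocks = drop_empty(blocks)
```

Calling `plot_blocks(blocks)` afterwards would draw one line per data set.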

You could also check a mmap based solution already posted.

Pierre GM
  • No, with `comments` it just combines all data sets. It doesn't create an array of arrays. – ritter Sep 11 '12 at 11:15
  • Yes, it will just make a huge array of all your data sets concatenated. The `comments="-"` forces `np.genfromtxt` to skip the lines that have a `-` in them. Once again, bad idea if you have negative values. If you want an array of arrays, construct individual lists per block, then transform each list into an array. – Pierre GM Sep 11 '12 at 11:49