I have a numpy array of shape (444,445), and I need to dump it as a csv file. One can achieve this by:

np.savetxt('outfile.txt',array, delimiter=',',fmt="%s")

I use the fmt="%s" option because the last element of each row (index 444 of the array) is NaN.
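For reference, here is a minimal sketch of what that call writes. The small 2x3 array and the in-memory buffer are stand-ins assumed for illustration, not the real data or file:

```python
import io
import numpy as np

# Hypothetical stand-in for the real (444, 445) array: each row ends in NaN.
small = np.array([[1.5, 2.5, np.nan],
                  [3.5, 4.5, np.nan]])

buf = io.StringIO()  # in-memory file used here instead of 'outfile.txt'
np.savetxt(buf, small, delimiter=',', fmt="%s")

text = buf.getvalue()
print(text)  # every line ends with the literal text 'nan'
```

So with fmt="%s" the NaN is written out as the string nan rather than being left empty.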

What I would like to accomplish is to write a csv file that is 5 columns wide, with 39,516 total lines (that is, 444 sections, one per array row, each consisting of 89 lines of 5 columns), with the NaN written as an empty element at the end of the 89th line of each section. In this way, the number of elements stays the same: 444x89x5 = 444x445, or 197,580 pieces of data.
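A quick check of that arithmetic, as a sketch. It assumes the array is split in row-major order and uses made-up stand-in values, since the real data is not shown:

```python
import numpy as np

n_rows, n_cols = 444, 445        # shape from the question
width = 5                        # desired number of csv columns

assert n_cols % width == 0       # 445 splits evenly into lines of 5
lines_per_row = n_cols // width  # 89 lines per original row
total_lines = n_rows * lines_per_row

# Stand-in data (assumption): arbitrary floats with NaN in the last column.
array = np.arange(n_rows * n_cols, dtype=float).reshape(n_rows, n_cols)
array[:, -1] = np.nan

# Row-major reshape: each original row becomes 89 consecutive lines.
flat = array.reshape(total_lines, width)

# 0-based indices of the lines whose last column is NaN.
nan_lines = np.flatnonzero(np.isnan(flat[:, -1]))
print(total_lines, flat.size, nan_lines[:2] + 1)
```

The NaN lands in the last column of every 89th line, which matches lines 89 and 178 in the example layout.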

For instance:

  1 xxxx,xxxx,xxxx,xxxx,xxxx,
  2 xxxx,xxxx,xxxx,xxxx,xxxx,
    ...
    ...
 89 xxxx,xxxx,xxxx,xxxx,
 90 xxxx,xxxx,xxxx,xxxx,xxxx,
 91 xxxx,xxxx,xxxx,xxxx,xxxx,
    ...
    ...
178 xxxx,xxxx,xxxx,xxxx,

I have added the line number to be more clear in my question. I do not want it in the actual output.

What would be an efficient and pythonic way of doing so?

For the moment, I am trying to adapt the answer to this question to my case:

Write to list to rows with specific number of columns.

muammar
  • I don't understand. Do you want 444 separate csv files...? – Rick Jun 05 '15 at 11:49
  • @RickTeachey well, in some way yes. Because 5 columns times 89 lines is 444. I tried reshaping my array to (89,5,444) but np.savetxt does not give me back what I need. – muammar Jun 05 '15 at 11:56
  • if your array is of shape 444x445, there are 197,580 pieces of data. if i understand, what you want is a csv file that is 39,516 lines long (with 5 columns), correct? and the 89th line, 178th line, etc, will end in 'nan'. is that right? – Rick Jun 05 '15 at 13:09
  • @RickTeachey that is correct. – muammar Jun 05 '15 at 13:18
  • Actually the 89th line, 178th line, etc, should not end with 'nan' but have one less element. I don't think this can be achieved with a single call to `numpy.savetxt()`. With masked arrays you can obtain something like `xxxx,xxxx,,` i.e. not a row with one less element, but a row with an empty element. – Stefano M Jun 05 '15 at 13:26
  • One thing you should do is provide an attempt at doing this yourself in the form of actual code. It doesn't have to be much, but it should be a solid "first try". You're much more likely to get good answers that way. – Rick Jun 05 '15 at 13:27
  • @StefanoM Upon reading it through again, I see that yes you're right. – Rick Jun 05 '15 at 13:28
  • @RickTeachey I think I succeeded in having a code that seems to work. Let me test it better, and I will put it here. – muammar Jun 05 '15 at 13:31
  • If it does work, make sure you post it as an answer to your own question. There is nothing wrong with that - in fact it is encouraged. – Rick Jun 05 '15 at 13:36

1 Answer

I hope I have understood what you are asking for.

# Reshape it (reshape returns a new array, so keep the result)

array_ = array_.reshape(89,444,5)

# Change its dtype to str so you can replace NaN with empty strings

array_ = array_.astype(str)

# Replace 'nan' with empty strings

array_[array_ == 'nan'] = ''
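As a sanity check, the three steps above can be sketched on a small stand-in array. The (4, 10) shape and the values are assumptions for illustration. The first axis is kept equal to the number of original rows here, so that each slice corresponds to one row:

```python
import numpy as np

# Hypothetical stand-in: 4 rows of 10 values, last value of each row is NaN.
small = np.arange(40, dtype=float).reshape(4, 10)
small[:, -1] = np.nan

# 1. Reshape: 4 slices (one per original row), each 2 lines of 5 columns.
out = small.reshape(4, 2, 5)

# 2. Convert to strings; NaN becomes the text 'nan'.
out = out.astype(str)

# 3. Replace the 'nan' strings with empty strings.
out[out == 'nan'] = ''

print(out[0])
```

Note that `reshape` and `astype` both return new arrays, so their results have to be assigned for the changes to take effect.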


# Finally, save it (see the edit below)

Edit

np.savetxt doesn't work with numpy arrays of more than 2 dimensions, so, following this answer, we can try this:

import numpy as np

# Write the array to disk
with open('test.txt', 'w') as outfile:
    # I'm writing a header here just for the sake of readability
    # Any line starting with "#" will be ignored by numpy.loadtxt
    outfile.write('# Array shape: {0}\n'.format(array_.shape))

    # Iterating through an n-dimensional array produces slices along
    # the first axis. This is equivalent to array_[i,:,:] in this case
    for data_slice in array_:

        # The array holds strings after astype(str), so a float format
        # such as '%-7.2f' would raise a TypeError. '%s' writes each
        # element as-is, and delimiter=',' gives comma-separated output.
        np.savetxt(outfile, data_slice, fmt='%s', delimiter=',')

        # Writing out a break to indicate different slices...
        outfile.write('# New slice\n')
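Putting the pieces together, here is a minimal sketch of an alternative, not the exact code above. It flattens everything into one 2-D block of 5-column lines, so a single np.savetxt call is enough and no slice markers are written. The helper name `write_in_columns`, the in-memory buffer and the small demo array are assumptions for illustration:

```python
import io
import numpy as np

def write_in_columns(array, outfile, width=5):
    """Write a 2-D float array as csv lines of `width` values; NaN becomes ''.

    Hypothetical helper for illustration, not a NumPy function.
    """
    if array.shape[1] % width != 0:
        raise ValueError("row length must be a multiple of width")
    # Row-major reshape keeps the lines of each original row together.
    lines = array.reshape(-1, width).astype(str)
    lines[lines == 'nan'] = ''
    # The array now holds strings, so '%s' is the only format that fits.
    np.savetxt(outfile, lines, fmt='%s', delimiter=',')

# Small stand-in for the (444, 445) array: 2 rows of 10, NaN at each row end.
demo = np.arange(20, dtype=float).reshape(2, 10)
demo[:, -1] = np.nan

buf = io.StringIO()  # pass an open file object instead to write to disk
write_in_columns(demo, buf)
result = buf.getvalue().splitlines()
print(result)
```

With this approach the NaN lines end in a trailing comma (an empty last element) rather than having one element fewer, which is the behaviour the question asks for.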
farhawa
  • Sorry @muammar it's `array_[array_ == 'nan'] = ''` – farhawa Jun 05 '15 at 14:21
  • There is also `reshape`. I like your solution a lot, but when you try to save the `.astype(str)` array without the `nan` you get `TypeError: float argument required, not numpy.ndarray`. – muammar Jun 05 '15 at 14:32
  • @muammar, I realized that there is a problem with saving a numpy array of shape (a,b,c). Take a look at the edit and tell me whether it works or you want to customize it – farhawa Jun 05 '15 at 14:42
  • To save in disk, the fmt has to be set to `fmt='%s'`, otherwise it fails. Thank you very much for your help. I will mark your solution as the correct answer. – muammar Jun 05 '15 at 14:57
  • it's my pleasure dude =) – farhawa Jun 05 '15 at 14:58