32

I have a NumPy ndarray to which I would like to add row/column headers.

The data is actually 7x12x12, but I can represent it like this:

  A=[[[0, 1, 2, 3, 4, 5],
      [1, 0, 3, 4, 5, 6],
      [2, 3, 0, 5, 6, 7],
      [3, 4, 5, 0, 7, 8],
      [4, 5, 6, 7, 0, 9],
      [5, 6, 7, 8, 9, 0]],


     [[0, 1, 2, 3, 4, 5],
      [1, 0, 3, 4, 5, 6],
      [2, 3, 0, 5, 6, 7],
      [3, 4, 5, 0, 7, 8],
      [4, 5, 6, 7, 0, 9],
      [5, 6, 7, 8, 9, 0]]]

where A is my 2x6x6 array.

How do I insert headers across the first row and the first column, so that each array looks like this in my CSV output file?

        A, a, b, c, d, e, f 
        a, 0, 1, 2, 3, 4, 5,
        b, 1, 0, 3, 4, 5, 6,
        c, 2, 3, 0, 5, 6, 7,
        d, 3, 4, 5, 0, 7, 8,
        e, 4, 5, 6, 7, 0, 9,
        f, 5, 6, 7, 8, 9, 0

What I have done is made the array 7x13x13 and inserted the data such that I have a row and column of zeros, but I'd much prefer strings.

I guess I could just write an Excel macro to replace the zeros with strings. The problem is that when I try to reassign those zeros to the strings I want, NumPy fails because it cannot convert a string to float.

desertnaut
emmagras
  • I'm not familiar with NumPy, but this would be very straightforward if they were just lists. Would a solution of that order be acceptable? Can you convert NumPy matrices to lists? – Paul Seeb Jun 19 '12 at 18:21
  • The way numpy matrices work, they can function as lists of lists, so you can iterate over them just fine on their own. – JAB Jun 19 '12 at 18:25

5 Answers

45

With pandas.DataFrame.to_csv you can write the columns and the index to a file:

import numpy as np
import pandas as pd

A = np.random.randint(0, 10, size=36).reshape(6, 6)
names = list('abcdef')
df = pd.DataFrame(A, index=names, columns=names)
df.to_csv('df.csv', index=True, header=True, sep=' ')

will give you the following df.csv file:

  a b c d e f 
a 1 5 5 0 4 4 
b 2 7 5 4 0 9 
c 6 5 6 9 7 0 
d 4 3 7 9 9 3 
e 8 1 5 1 9 0 
f 2 8 0 0 5 1    
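
The question's data is a 3-D stack and asks for comma-separated output, so the example above could be extended roughly as follows. This is only a sketch: the helper name `write_stack`, the block labels (`A`, `B`, ...) placed in the corner cell, and the blank line between blocks are assumptions, not something the question specifies. It also assumes at most 26 rows and columns per slice.

```python
import io
import string

import numpy as np
import pandas as pd


def write_stack(stack, out, labels=string.ascii_uppercase):
    """Write each 2-D slice of a 3-D array as a CSV block with headers.

    Hypothetical helper: one block per slice, separated by a blank line.
    """
    n_rows, n_cols = stack.shape[1], stack.shape[2]
    row_names = list(string.ascii_lowercase[:n_rows])
    col_names = list(string.ascii_lowercase[:n_cols])
    for label, mat in zip(labels, stack):
        df = pd.DataFrame(mat, index=row_names, columns=col_names)
        # index_label fills the top-left corner cell with the block's name
        df.to_csv(out, index_label=label)
        out.write("\n")


stack = np.arange(2 * 3 * 3).reshape(2, 3, 3)
buf = io.StringIO()  # swap for open("out.csv", "w", newline="") to write a file
write_stack(stack, buf)
print(buf.getvalue())
```

Because `to_csv` defaults to a comma separator, each block starts with a line such as `A,a,b,c`, which is close to the layout requested in the question.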
bmu
20

NumPy handles n-dimensional arrays fine, but many of its facilities are limited to 2-dimensional arrays. I'm not even sure how you want the output file to look.

Many people who wish for named columns overlook NumPy's recarray (structured array) capabilities. Good stuff to know, but that only "names" one dimension.
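
As a rough illustration of that limitation, here is a minimal structured-array sketch. The field names `a`, `b`, `c` and the sample values are made up for the example.

```python
import numpy as np

# Structured array: each column (field) gets a name, rows stay positional
rec = np.array([(0, 1, 2), (1, 0, 3)],
               dtype=[("a", "i8"), ("b", "i8"), ("c", "i8")])

print(rec["b"])         # select the field named 'b'
print(rec.dtype.names)  # the field names, in order
print(rec[0])           # rows can only be addressed by position
```

The fields can be looked up by name, but there is no equivalent label for the rows, which is why this does not cover the row headers the question asks for.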

For two dimensions, Pandas is very cool.

In [275]: DataFrame.from_dict({'A': [1, 2, 3], 'B': [4, 5, 6]},
   .....:                     orient='index', columns=['one', 'two', 'three'])
Out[275]: 
   one  two  three
A    1    2      3
B    4    5      6

If output is the only problem you are trying to solve here, I'd probably just stick with a few lines of hand-coded magic, as that is lighter than installing another package for one feature.
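
For reference, the hand-coded route might look roughly like this, using only the standard library `csv` module. The function name `write_labeled`, the corner label, and the a, b, c, ... header names are illustrative assumptions. It takes nested lists, so a NumPy array can be passed as `A.tolist()`.

```python
import csv
import io
import string


def write_labeled(matrices, out, corner="A"):
    """Write each 2-D matrix (list of lists) as a CSV block with headers."""
    writer = csv.writer(out, lineterminator="\n")
    for mat in matrices:
        names = string.ascii_lowercase[:len(mat[0])]  # supports up to 26 columns
        writer.writerow([corner] + list(names))       # header row
        for name, row in zip(string.ascii_lowercase, mat):
            writer.writerow([name] + list(row))       # row label, then the data
        writer.writerow([])                           # blank line between blocks


data = [[[0, 1], [1, 0]], [[0, 2], [2, 0]]]
buf = io.StringIO()  # swap for open("out.csv", "w", newline="") to write a file
write_labeled(data, buf)
print(buf.getvalue())
```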

LCC
Phil Cooper
3

I think this does the trick generically.

Input

import numpy

mats = numpy.array([[[0, 1, 2, 3, 4, 5],
    [1, 0, 3, 4, 5, 6],
    [2, 3, 0, 5, 6, 7],
    [3, 4, 5, 0, 7, 8],
    [4, 5, 6, 7, 0, 9],
    [5, 6, 7, 8, 9, 0]],

   [[0, 1, 2, 3, 4, 5],
    [1, 0, 3, 4, 5, 6],
    [2, 3, 0, 5, 6, 7],
    [3, 4, 5, 0, 7, 8],
    [4, 5, 6, 7, 0, 9],
    [5, 6, 7, 8, 9, 0]]])

Code

# Recursively builds spreadsheet-style headers: a..z, aa, ab, ...
def make_head(n):
    pre = ''
    # Integer division, so the recursion also terminates on Python 3
    if n // 26:
        pre = make_head(n // 26 - 1)

    alph = "abcdefghijklmnopqrstuvwxyz"
    pre += alph[n % 26]
    return pre

# Generator yielding exactly nitems header items for rows or columns
def gen_header(nitems):
    for n in range(nitems):
        yield make_head(n)

# Convert numpy to list
lmats = mats.tolist()

# Loop through each "matrix"
for mat in lmats:
    # Store the number of columns before the rows are modified
    ncols = len(mat[0])

    # Add a header value to the front of each row
    for row, hd in zip(mat, gen_header(len(mat))):
        row.insert(0, hd)

    # Create a "header" line for all the columns
    col_hd = list(gen_header(ncols))
    col_hd.insert(0, "A")

    # Insert the header line as the first row of the matrix
    mat.insert(0, col_hd)

# Convert back to numpy (every element becomes a string)
mats = numpy.array(lmats)

Output (value stored in mats):

array([[['A', 'a', 'b', 'c', 'd', 'e', 'f'],
        ['a', '0', '1', '2', '3', '4', '5'],
        ['b', '1', '0', '3', '4', '5', '6'],
        ['c', '2', '3', '0', '5', '6', '7'],
        ['d', '3', '4', '5', '0', '7', '8'],
        ['e', '4', '5', '6', '7', '0', '9'],
        ['f', '5', '6', '7', '8', '9', '0']],

       [['A', 'a', 'b', 'c', 'd', 'e', 'f'],
        ['a', '0', '1', '2', '3', '4', '5'],
        ['b', '1', '0', '3', '4', '5', '6'],
        ['c', '2', '3', '0', '5', '6', '7'],
        ['d', '3', '4', '5', '0', '7', '8'],
        ['e', '4', '5', '6', '7', '0', '9'],
        ['f', '5', '6', '7', '8', '9', '0']]], 
      dtype='|S4')
Paul Seeb
  • I'm getting an error `'numpy.ndarray' object has no attribute 'insert'` Any workaround suggestions? – emmagras Jun 19 '12 at 21:17
  • Work around included. I converted the numpy mats to lists did the operations and converted back. Numpy insert routines are pretty stupid or I fail to see how they are useful – Paul Seeb Jun 21 '12 at 13:27
  • Thank you. I eventually figured it out with this. – emmagras Jul 02 '12 at 16:48
2

I am not aware of any method to add headers to the matrix itself (even though I would find it useful). What I would do is create a small class that prints the object for me by overriding the __str__ method.

Something like this:

class myMat:
    def __init__(self, mat, name):
        self.mat = mat
        self.name = name
        self.head = ['a','b','c','d','e','f']
        self.sep = ','

    def __str__(self):
        s = "%s%s"%(self.name,self.sep)
        for x in self.head:
            s += "%s%s"%(x,self.sep)
        s = s[:-len(self.sep)] + '\n'

        for i in range(len(self.mat)):
            row = self.mat[i]
            s += "%s%s"%(self.head[i],self.sep)
            for x in row:
                s += "%s%s"%(str(x),self.sep)
            s += '\n'
        s = s[:-len(self.sep)-len('\n')]

        return s

Then you could just easily print them with the headers, using the following code:

print(myMat(A, 'A'))
print(myMat(B, 'B'))
Oriol Nieto
  • This looks promising. In trying to distill my question down, I confused matters, as the big matrix is not actually composed of labelled smaller matrices. I've tried to split it up and implement your suggestion but it's not working. For starters, I have a "list index out of range" at this line s += "%s%s"%(self.head[i],self.sep) How would your suggestion change given that A is the only matrix, rather than dealing with a matrix that is a compilation of matrices? – emmagras Jun 19 '12 at 21:07
  • I guess that you get an index out of range error due to different size of the matrices. Right now this code will only work with 6x6 matrices (i.e. len(['a','b','c','d','e','f'])). Just change the line that defines self.head to your matrix size (e.g. if your matrices are 3x3, the line should look like self.head=['a','b','c']). Hope this helps! – Oriol Nieto Jun 20 '12 at 14:18
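
To avoid editing `self.head` by hand for each matrix size, the labels can be derived from the matrix itself. A minimal sketch, assuming at most 26 rows and columns. `LabeledMat` is a hypothetical name, and unlike the class above it emits no trailing separators.

```python
import string


class LabeledMat:
    """Wraps a 2-D matrix and prints it with a, b, c, ... row/column headers."""

    def __init__(self, mat, name, sep=","):
        self.mat = mat
        self.name = name
        self.sep = sep

    def __str__(self):
        n_rows = len(self.mat)
        n_cols = len(self.mat[0]) if n_rows else 0
        letters = string.ascii_lowercase
        # Header line: matrix name in the corner, then one label per column
        lines = [self.sep.join([self.name] + list(letters[:n_cols]))]
        # Data lines: one label per row, then the row's values
        for label, row in zip(letters, self.mat):
            lines.append(self.sep.join([label] + [str(x) for x in row]))
        return "\n".join(lines)


print(LabeledMat([[0, 1, 2], [1, 0, 3], [2, 3, 0]], "A"))
```

Since the class only iterates over rows and values, it should accept a 2-D NumPy array as well as a list of lists.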
2

Not really sure, but you may consider having a look at Pandas.

Davide