
I have a pandas Series like this:

A   1
B   2
C   3
AB  4
AC  5
BA  4
BC  8
CA  5
CB  8

I'm looking for simple code to convert it to a matrix like this:

1 4 5
4 2 8
5 8 3

Ideally something fairly dynamic and built in, rather than many loops hard-coded for this 3x3 case.

smci
Dickster
  • Post reproducible code. – smci Jun 16 '15 at 23:10
  • Near-duplicate of [Convert pandas dataframe to numpy array, preserving index](http://stackoverflow.com/questions/13187778/pandas-dataframe-to-numpy-array-include-index) and [How to convert a pandas DataFrame subset of columns AND rows into a numpy array?](http://stackoverflow.com/questions/17682613/how-to-convert-a-pandas-dataframe-subset-of-columns-and-rows-into-a-numpy-array) – smci Jun 16 '15 at 23:30

2 Answers


You can do it by splitting each label into a (row, column) pair, building a MultiIndex from those pairs, and unstacking:

import pandas as pd

# your raw data
raw_index = 'A B C AB AC BA BC CA CB'.split()
values = [1, 2, 3, 4, 5, 4, 8, 5, 8]

# reformat index
index = [(a[0], a[-1]) for a in raw_index]
multi_index = pd.MultiIndex.from_tuples(index)

df = pd.DataFrame(values, columns=['values'], index=multi_index)
df.unstack()


Output:

  values      
       A  B  C
A      1  4  5
B      4  2  8
C      5  8  3
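
If a bare NumPy array is what's needed rather than a labelled DataFrame, the same idea can be applied to the Series itself. This is only a minimal sketch: the names `s`, `square` and `matrix` are made up for illustration, and it assumes every label is one or two characters long, as in the question.

```python
import pandas as pd

# The Series from the question
s = pd.Series([1, 2, 3, 4, 5, 4, 8, 5, 8],
              index='A B C AB AC BA BC CA CB'.split())

# Split each label into a (row, column) pair; 'A' becomes ('A', 'A')
s.index = pd.MultiIndex.from_tuples([(k[0], k[-1]) for k in s.index])

# Unstacking a Series (not a DataFrame) avoids the extra 'values' column level
square = s.unstack()

# Drop the labels to get a plain 2-D NumPy array
matrix = square.to_numpy()
print(matrix)
```

`square` still carries the A/B/C labels on both axes, so it can be kept around if the labels are useful later; `matrix` holds only the numbers.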
Jianxun Li

For a pd.DataFrame, use the .values attribute or the .to_records(...) method.

For a pd.Series, use the .unstack() method, as Jianxun Li said.

import numpy as np
import pandas as pd

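# 'val' is listed first so that to_records() yields (val, var) pairs,
# which the unpacking below relies on
d = pd.DataFrame(data = {
    'val':[1,2,3,4,5,4,8,5,8],
    'var':['A','B','C','AB','AC','BA','BC','CA','CB'] })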

# Here are some options for converting to np.matrix ...
np.matrix( d.to_records(index=False) )
# matrix([[(1, 'A'), (2, 'B'), (3, 'C'), (4, 'AB'), (5, 'AC'), (4, 'BA'),
#         (8, 'BC'), (5, 'CA'), (8, 'CB')]], 
#       dtype=[('val', '<i8'), ('var', 'O')])

# Here you can add code to rearrange it, e.g.
[(val, idx[0], idx[-1]) for val,idx in d.to_records(index=False) ]
# [(1, 'A', 'A'), (2, 'B', 'B'), (3, 'C', 'C'), (4, 'A', 'B'), (5, 'A', 'C'), (4, 'B', 'A'), (8, 'B', 'C'), (5, 'C', 'A'), (8, 'C', 'B')]

# and if you need numeric row- and col-indices:
[ (val, 'ABCDEF...'.index(idx[0]), 'ABCDEF...'.index(idx[-1]) ) for val,idx in d.to_records(index=False) ]
# [(1, 0, 0), (2, 1, 1), (3, 2, 2), (4, 0, 1), (5, 0, 2), (4, 1, 0), (8, 1, 2), (5, 2, 0), (8, 2, 1)]

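# you can sort by them (row index first, then column index):
sorted([ (val, 'ABCDEF...'.index(idx[0]), 'ABCDEF...'.index(idx[-1]) ) for val,idx in d.to_records(index=False) ], key=lambda x: x[1:3] )
# the values then read off in row-major order: 1 4 5, 4 2 8, 5 8 3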
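
The (value, row, column) triples above can also be written straight into an array instead of being sorted. A rough sketch of that step, assuming single-letter row/column names and using made-up helper names (`letters`, `pos`, `out`) that are not part of the answer:

```python
import numpy as np

labels = ['A', 'B', 'C', 'AB', 'AC', 'BA', 'BC', 'CA', 'CB']
values = [1, 2, 3, 4, 5, 4, 8, 5, 8]

# The distinct letters, sorted, define the row/column order
letters = sorted({ch for label in labels for ch in label})
pos = {ch: i for i, ch in enumerate(letters)}

# Row index from the first character, column index from the last
rows = [pos[label[0]] for label in labels]
cols = [pos[label[-1]] for label in labels]

# Integer-array assignment: out[rows[k], cols[k]] = values[k] for every k
out = np.zeros((len(letters), len(letters)), dtype=int)
out[rows, cols] = values
print(out)
```

Deriving `letters` from the data avoids hard-coding the alphabet string, and any (row, column) pair missing from the input is simply left at 0.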
smci