2

I have a Pandas dataframe with some correlations in the form of:

      A     B
D  0.78  0.49
E  0.93  0.67

Is there a fast way in Python to get a list of (column, index, value) tuples like [(A, D, 0.78), (A, E, 0.93), (B, D, 0.49), (B, E, 0.67)]?

Thanks in advance for your help.

Chris Adams
mirkojh

3 Answers

3

Use DataFrame.unstack to reshape the frame into a Series indexed by (column, index) pairs, turn that Series back into a DataFrame with reset_index, and finally convert each row to a tuple:

L = [tuple(x) for x in df.unstack().reset_index().to_numpy()]

Or:

L = list(map(tuple, df.unstack().reset_index().to_numpy()))

Another idea, thank you @Datanovice:

L = list(df.unstack().reset_index().itertuples(name=None, index=False))

print(L)
[('A', 'D', 0.78), ('A', 'E', 0.93), ('B', 'D', 0.49), ('B', 'E', 0.67)]
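
For anyone who wants to run this end to end, here is a minimal sketch. The construction of `df` is an assumption reconstructed from the table in the question, which does not show how the frame was built.

```python
import pandas as pd

# Assumed reconstruction of the question's frame (not shown in the question).
df = pd.DataFrame({"A": [0.78, 0.93], "B": [0.49, 0.67]}, index=["D", "E"])

# unstack() on a frame with a flat index returns a Series whose
# MultiIndex is (column label, index label).
s = df.unstack()

# reset_index() turns both index levels into ordinary columns;
# itertuples(index=False, name=None) then yields one plain tuple per row.
L = list(s.reset_index().itertuples(index=False, name=None))
print(L)
```

The tuples come out column by column, which matches the order asked for in the question.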

If the first two elements should be swapped, so that the index label comes first (thank you @Ch3steR):

L = list(df.reset_index().melt(id_vars='index').itertuples(name=None, index=False))
print(L)
[('D', 'A', 0.78), ('E', 'A', 0.93), ('D', 'B', 0.49), ('E', 'B', 0.67)]
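
Note that `id_vars='index'` relies on the index being unnamed, because `reset_index()` then calls the new column `index`. A sketch that does not depend on this, assuming the same frame as in the question (the name `row` is hypothetical and chosen only for illustration):

```python
import pandas as pd

# Assumed reconstruction of the question's frame.
df = pd.DataFrame({"A": [0.78, 0.93], "B": [0.49, 0.67]}, index=["D", "E"])

# Give the index an explicit (hypothetical) name so the column produced
# by reset_index() is predictable, then melt on that column.
long_df = df.rename_axis("row").reset_index().melt(id_vars="row")

# Each row of long_df is (index label, column label, value).
L = list(long_df.itertuples(index=False, name=None))
print(L)
```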
jezrael
    Another alternative using `df.melt` : `list(df.reset_index().melt(id_vars='index').itertuples(name=None,index=None))`. Nice answer +1 – Ch3steR Jul 01 '20 at 10:56
0

Try a plain nested loop over the columns and rows:

import pandas as pd
dt = {'a': [1, 2], 'b': [3, 4]}
cols = ['a', 'b']
rows = ['d', 'e']
df = pd.DataFrame(dt, index=rows)
print(df)

   a  b
d  1  3
e  2  4

result = []
for c in cols:
    for r in rows:
        # .loc[row, column] avoids chained indexing (df[c][r])
        result.append((c, r, df.loc[r, c]))
print(result)
[('a', 'd', 1), ('a', 'e', 2), ('b', 'd', 3), ('b', 'e', 4)]
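
The same loop can be written without the separate `cols` and `rows` lists by iterating over the frame itself. This is only a sketch of that variant, using the example frame from this answer:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["d", "e"])

# DataFrame.items() yields (column label, column Series) pairs, and
# Series.items() yields (index label, value) pairs within each column.
result = [(c, r, v) for c, col in df.items() for r, v in col.items()]
print(result)
```

This keeps the labels in sync with the frame, so nothing needs updating if columns or rows are added later.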

devspartan
0

Sure, it is possible. I would do it like this:

import pandas as pd
import numpy as np

# Creating example DF
tab = pd.DataFrame(data={'A': (1,2), 'B': (3,4)})
tab.index=['C', 'D']

# Values to tuples
np.array(tab.apply(lambda x: [(x.index[i], x.name,y) for i, y in enumerate(x)])).ravel()
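
If a plain list of (column, index, value) tuples is wanted, as in the question, a NumPy-only variant is also possible. This is a sketch of a different technique from the `apply` call above (it uses `np.repeat` and `np.tile` on the labels), with the same example frame:

```python
import numpy as np
import pandas as pd

tab = pd.DataFrame({"A": (1, 2), "B": (3, 4)}, index=["C", "D"])

# Repeat each column label once per row, and tile the row labels once per column.
cols = np.repeat(tab.columns.to_numpy(), len(tab.index))
rows = np.tile(tab.index.to_numpy(), len(tab.columns))

# Transpose before flattening so the values run column by column,
# matching the label order built above.
vals = tab.to_numpy().T.ravel()

# tolist() converts NumPy scalars to plain Python objects.
L = list(zip(cols.tolist(), rows.tolist(), vals.tolist()))
print(L)
```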