
I am trying to build a DataFrame from the first column of each of several other DataFrames, using a loop. All of them share the same index.

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.randint(0,100,size=(3, 2)), columns=list('AB'), index=('class1', 'class2', 'class3'))
df2 = pd.DataFrame(np.random.randint(0,100,size=(3, 2)), columns=list('CD'), index=('class1', 'class2', 'class3'))
df3 = pd.DataFrame(np.random.randint(0,100,size=(3, 2)), columns=list('EF'), index=('class1', 'class2', 'class3'))

df = pd.DataFrame( index=('class1', 'class2', 'class3')) 

for f in [df1, df2, df3]:
    first_col = f.iloc[:,0]
    df[f] = first_col.values

The expected output is a DataFrame formatted like this:

         A   C   E
class1   2  18  62
class2  46  46  11
class3  57  73  92

But this code did not work.

This question mirrors "How to add a new column to an existing DataFrame?", but the answers I tried from there (below) did not work.

df.set_index([first_col], append=True)

df.assign(f=first_col.values)

2 Answers

Your solution works if you change the names of the new columns in the output DataFrame, using each source column's name as the key instead of the DataFrame itself:

df = pd.DataFrame(index=('class1', 'class2', 'class3'))

r = [df1, df2, df3]
for f in r:
    first_col = f.iloc[:, 0]
    # Use the column's name as the key, not the DataFrame itself
    df[f.columns[0]] = first_col
print(df)
         A   C   E
class1  10  25  85
class2  45  57  48
class3  59  99  87
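
The loop above can be sketched as a self-contained script. The fixed values below are made up for illustration (they replace the random data from the question), and the name `result` is an assumption, not part of the original code:

```python
import pandas as pd

idx = ["class1", "class2", "class3"]

# Hypothetical fixed data standing in for the random frames in the question
df1 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=idx)
df2 = pd.DataFrame({"C": [7, 8, 9], "D": [10, 11, 12]}, index=idx)
df3 = pd.DataFrame({"E": [13, 14, 15], "F": [16, 17, 18]}, index=idx)

result = pd.DataFrame(index=idx)
for frame in [df1, df2, df3]:
    first_col = frame.iloc[:, 0]
    # The Series name is the label of the column it was taken from
    result[first_col.name] = first_col

print(result)
```

Assigning the Series itself aligns rows by index label, whereas `first_col.values` would be assigned positionally. With identical indexes, as here, both give the same result.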

A better approach is to use concat with a list comprehension that selects the first column of each DataFrame:

r = [df1, df2, df3]

df = pd.concat([f.iloc[:, 0] for f in r], axis=1)
print(df)
         A   C   E
class1   2  18  62
class2  46  46  11
class3  57  73  92
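
For a reproducible version of the concat approach, here is a minimal sketch with made-up fixed values. The helper name `first_columns` is hypothetical and only wraps the one-liner above:

```python
import pandas as pd


def first_columns(frames):
    """Hypothetical helper: place the first column of each frame side by side."""
    return pd.concat([f.iloc[:, 0] for f in frames], axis=1)


idx = ["class1", "class2", "class3"]

# Illustrative data, not the values from the question
df1 = pd.DataFrame({"A": [2, 46, 57], "B": [0, 0, 0]}, index=idx)
df2 = pd.DataFrame({"C": [18, 46, 73], "D": [0, 0, 0]}, index=idx)
df3 = pd.DataFrame({"E": [62, 11, 92], "F": [0, 0, 0]}, index=idx)

combined = first_columns([df1, df2, df3])
print(combined)
```

Because each selected column is a named Series, concat keeps the names A, C and E as column labels without any extra bookkeeping.
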
jezrael

You don't need a for loop or comprehension to perform merge operations.

To merge the first column of each DataFrame, you can combine pandas' concat() function with the iloc indexer, as shown in the code below:

# Merge the first columns
merged_df = pd.concat([df1.iloc[:, 0], df2.iloc[:, 0], df3.iloc[:, 0]], axis=1)
merged_df

Output:

         A   C   E
class1  36  75  29
class2  68  48  74
class3  90  42  70
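
As a side note on how concat lines the columns up: with axis=1 it matches rows by index label rather than by position. The sketch below uses two made-up Series (the names `a`, `c` and `merged` are assumptions for illustration) whose labels are the same but listed in a different order:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["class1", "class2", "class3"], name="A")
# Same labels as `a`, deliberately listed in a different order
c = pd.Series([9, 7, 8], index=["class3", "class1", "class2"], name="C")

# Rows are matched on their labels, so class1 pairs 1 with 7
merged = pd.concat([a, c], axis=1)
print(merged.loc["class1"])
```

In the question all frames share the same index, so this alignment is invisible there, but it matters once the inputs are ordered differently.
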
Dejene T.