17

How do I copy multiple columns from one DataFrame to a new DataFrame? It would also be nice to rename them at the same time.

df2['colA']=df1['col-a']  #This works

df2['colA', 'colB']=df1['col-a', 'col-b'] #Tried and Failed

Thanks

MaxU - stand with Ukraine
RMichalowski
  • Possible duplicate of [Fastest way to copy columns from one DataFrame to another using pandas?](http://stackoverflow.com/questions/21295329/fastest-way-to-copy-columns-from-one-dataframe-to-another-using-pandas) – iled May 16 '17 at 19:31

4 Answers

27

You have to use double brackets:

df2[['colA', 'colB']] = df1[['col-a', 'col-b']]
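A minimal runnable sketch of the double-bracket assignment follows. The sample data and the extra `id` column are assumptions made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({"col-a": [1, 2, 3], "col-b": ["x", "y", "z"], "col-c": [0.1, 0.2, 0.3]})

# df2 already exists and shares df1's row index.
df2 = pd.DataFrame({"id": [100, 101, 102]}, index=df1.index)

# The outer brackets index the frame, the inner list names the columns.
# Columns are matched by position: 'col-a' -> 'colA', 'col-b' -> 'colB'.
df2[["colA", "colB"]] = df1[["col-a", "col-b"]]

print(df2)
```

Rows are matched by index label here, so this form is most predictable when both frames share the same index.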
Andy Hayden
  • 1
    Does that return already a copy or would one need `df1[['col-a', 'col-b']].copy()`? – Cleb Aug 23 '17 at 23:00
  • 2
    @Cleb You don't need the copy as the view (?) of df1 is being assigned into the df2. It's not destructive, the view gets GCd afterwards. – Andy Hayden Aug 23 '17 at 23:21
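One way to check the point raised in these comments is to modify `df2` after the assignment and confirm that `df1` is untouched. This is a small sketch with made-up data, not a statement about pandas internals.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df1 = pd.DataFrame({"col-a": [1, 2, 3], "col-b": [4, 5, 6]})
df2 = pd.DataFrame(index=df1.index)

# Assign without an explicit .copy().
df2[["colA", "colB"]] = df1[["col-a", "col-b"]]

# Change a value in df2 only.
df2.loc[0, "colA"] = 99

print(df1.loc[0, "col-a"])  # the original value in df1
print(df2.loc[0, "colA"])   # the modified value in df2
```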
4

This is tried and tested with pandas 1.3.0:

df2 = df1[['col-a', 'col-b']].copy()

If you also want to rename the columns at the same time, you can write:

df2 = pd.DataFrame(columns=['colA', 'colB'])
df2[['colA', 'colB']] = df1[['col-a', 'col-b']]
s_mehrotra
3

The following also works:

# original DataFrame
df = pd.DataFrame({'a': ['hello', 'cheerio', 'hi', 'bye'], 'b': [1, 0, 1, 0]})
# new DataFrame created from 2 original cols (new cols renamed)
df_new = pd.DataFrame(columns=['greeting', 'mode'], data=df[['a','b']].values)

If you want to apply a condition when building the new DataFrame:

df_new = pd.DataFrame(columns=['farewell', 'mode'], data=df[df['b']==0][['a','b']].values)

Or, if you want to use only particular rows (selected by index label), you can use "loc":

df_new = pd.DataFrame(columns=['greetings', 'mode'], data=df.loc[2:3][['a','b']].values)

# if you need to preserve the row index, add index=... as an argument, like:
df_new = pd.DataFrame(columns=['farewell', 'mode'], data=df.loc[2:3][['a','b']].values,
                      index=df.loc[2:3].index)
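One thing to watch with the `.values` approach: when the selected columns have different dtypes, the intermediate NumPy array is of `object` dtype, and the new columns typically inherit that. The sketch below reuses the example frame from this answer and shows one possible way to recover narrower dtypes with `infer_objects()`; the variable names `values` and `df_fixed` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": ["hello", "cheerio", "hi", "bye"], "b": [1, 0, 1, 0]})

# Mixed string/integer columns collapse into a single object-dtype array.
values = df[["a", "b"]].values
print(values.dtype)

df_new = pd.DataFrame(columns=["greeting", "mode"], data=values)
print(df_new.dtypes)

# infer_objects() asks pandas to re-infer better dtypes for object columns.
df_fixed = df_new.infer_objects()
print(df_fixed.dtypes)
```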
Lukas
2

As of pandas 1.2.4, the easiest way would be:

df2[['colA', 'colB']] = df1[['col-a', 'col-b']].values

Note that this does not require .copy(): .values first converts the selected columns into a NumPy array, and the assignment then copies those values into df2, so df2 ends up independent of df1, just as if you had called .copy(). The array is assigned by position, so df1's index labels are ignored and the row counts have to match when df2 already has rows.
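The practical difference between assigning a DataFrame and assigning its `.values` shows up when the two frames have different indexes. The following sketch uses made-up data and the hypothetical names `aligned` and `positional` to compare the two forms.

```python
import pandas as pd

# Hypothetical data: df1 has index labels 10..12, the targets have 0..2.
df1 = pd.DataFrame({"col-a": [1, 2, 3], "col-b": [4, 5, 6]}, index=[10, 11, 12])

# DataFrame assignment aligns rows on index labels; none match here,
# so the new columns are filled with NaN.
aligned = pd.DataFrame({"id": ["x", "y", "z"]})
aligned[["colA", "colB"]] = df1[["col-a", "col-b"]]

# Array assignment ignores labels and fills rows by position.
positional = pd.DataFrame({"id": ["x", "y", "z"]})
positional[["colA", "colB"]] = df1[["col-a", "col-b"]].values

print(aligned)
print(positional)
```

Which behavior is preferable depends on whether the row labels of the two frames are meant to correspond.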