0

I have two dataframes

df1:

| Ref        | Call        |
|:-----------|------------:|
| E/123      | Research    |   
| E/164      | Research    |   
| E/256      | Research    |    
| E/543      | Fellow      |    
| E/213      | Fellow      |   
| E/457      | Fellow      |

df2:

| Ref        | Grade       | Person       |
|:-----------|------------:|:------------:|
| E/123      |        6    |   A.Smith    | 
| E/164      |        4    |   P.Davies   | 
| E/256      |        4    |   E.Carter   |  
| E/543      |        8    |   D.White    |  
| E/213      |        9    |   R.Daniels  |  
| E/457      |        2    |   S.Conner   |

I want to add a new column to df2, which takes the 'Call' column from df1, and then matches the call to the ref number i.e.:

| Ref        | Grade       | Person       | Call     |
|:-----------|------------:|:------------:|:---------|
| E/123      |        6    |   A.Smith    | Research |
| E/164      |        4    |   P.Davies   | Research |
| E/256      |        4    |   E.Carter   | Research | 
| E/543      |        8    |   D.White    | Fellow   |
| E/213      |        9    |   R.Daniels  | Fellow   | 
| E/457      |        2    |   S.Conner   | Fellow   |

Obviously my script is not as simple as the above example (i.e. it isnt in a nice order, it has to be matched ref to ref), so df1 is a list of ALL the grants we have and their respective call, df2 is a 'random' sample which can be big or small.

eyllanesc
  • 235,170
  • 19
  • 170
  • 241
Nicholas
  • 3,517
  • 13
  • 47
  • 86
  • Did you just do `pd.concat([df1, df2], 1)`? – cs95 Oct 25 '17 at 10:00
  • Ill try now. I didnt realise it was that simple. – Nicholas Oct 25 '17 at 10:01
  • If it works, this is a dupe. Otherwise, edit your data to account for the failing case. ;-) – cs95 Oct 25 '17 at 10:02
  • Just tried it, it doesnt seem to be doing anything? – Nicholas Oct 25 '17 at 10:06
  • Okay. Then, try: `df1.merge(df2, on='Ref')`. Seriously! – cs95 Oct 25 '17 at 10:07
  • Ill try now. Thank you – Nicholas Oct 25 '17 at 10:07
  • Sorry, the concat is working, but it is bringing through the ref number (i.e. I now have two reference numbers instead of bringing the call through)? – Nicholas Oct 25 '17 at 10:11
  • You can just call `df.drop('Ref', 1)` on one of the data frames. Alternatively, merge will drop that duplicate column automatically. – cs95 Oct 25 '17 at 10:13
  • The merge is not working unfortunately i.e. nothing happens (no errors, but doesnt add the call column), and the concat is not bringing the call column over – Nicholas Oct 25 '17 at 10:19
  • This Will Work. `pd.concat([df1.drop('Ref', 1), df2], axis=1)` – cs95 Oct 25 '17 at 10:20
  • There must be something wrong with my dataset. It still isnt copying the call into the new dataframe. I checked the headers of the df1 and it says 'Call', 'Ref', yet the 'Call' never gets added. Oh well, I will mess around and try and find a fix. Thank you ColdSpeed – Nicholas Oct 25 '17 at 10:27
  • Found my problem. I was printing it to excel which was using column values I had set up before (which did not have 'Call' in)... I am an idiot, and now I wont make this mistake again. Thank you! – Nicholas Oct 25 '17 at 10:31
  • No problem. Happy coding. – cs95 Oct 25 '17 at 10:33
  • Sorry, this question needs to be opened up again. I forgot what I was trying to do... I am not trying to merge two datasets together, I am trying to match calls against ref numbers i.e. df1 could be 10 lines long and df2 could be 20,000. If I concat or merge, it just adds 10 lines to df2 which could be 20,000 lines long. – Nicholas Oct 25 '17 at 10:37
  • Better still, delete this and repost, and make sure you edit your question to reflect this! – cs95 Oct 25 '17 at 10:38
  • Ok, thank you! I will do that now – Nicholas Oct 25 '17 at 10:38

0 Answers0