I have some data in a dataframe (df):

    Reference   Description
0   C11621/2    The findings have been used
1   B01026/1    Findings from this research
2   D01469/1    The PopChange web resource 
3   AM0156/1    The whole project was designed 
4   AM0156/1    The data set has been used 
5   AM0156/1    This project has become one 

There might be duplicates in the 'Reference' column, and if there are, I want to merge their descriptions into a single row. For example, in the dataframe above, the 3 rows below share the same Reference number:

    Reference   Description
3   AM0156/1    The whole project was designed  ...
4   AM0156/1    The data set has been used ...
5   AM0156/1    This project has become one ...

I want to turn that into:

    Reference   Description
3   AM0156/1    The whole project was designed The data set has been used This project has become one

How would one go about that?

Nicholas
  • So use `df.groupby('Reference')['Description'].apply(' '.join).reset_index()` – jezrael Jun 26 '18 at 08:38
  • Thank you Jezrael. I did search for an answer before asking a question, but I didn't find that post. Thank you very much :) – Nicholas Jun 26 '18 at 08:39
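
For anyone landing here, this is a minimal sketch of the one-liner suggested in the comment above, run against the sample data from the question. The dataframe is rebuilt by hand with trailing spaces trimmed from the descriptions, and `sort=False` is an addition (not part of the original comment) that keeps the references in order of first appearance rather than sorting them.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (trailing spaces trimmed).
df = pd.DataFrame({
    "Reference": ["C11621/2", "B01026/1", "D01469/1",
                  "AM0156/1", "AM0156/1", "AM0156/1"],
    "Description": [
        "The findings have been used",
        "Findings from this research",
        "The PopChange web resource",
        "The whole project was designed",
        "The data set has been used",
        "This project has become one",
    ],
})

# Group rows that share a Reference and join their descriptions with a space.
# sort=False is an assumption added here to preserve first-appearance order.
merged = (
    df.groupby("Reference", sort=False)["Description"]
      .apply(" ".join)
      .reset_index()
)

print(merged)
```

Note that the result gets a fresh 0-based index, so the merged `AM0156/1` row will not keep its original index label of 3. If the descriptions might contain missing values, `' '.join` would raise a `TypeError`, so those would need to be dropped or filled first.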
