I'm using pandas for a thesis assignment and got stuck on the following:

My data is shown below. The same Full_Name appears in multiple rows, each with a single author_ID in the second column.

   Full_Name           author_ID
   SVANTE ARRHENIUS      5C5007F5
   SVANTE ARRHENIUS      76E05190

I'm trying to reshape the data so that I have one row per author, with all corresponding author_IDs in the second column, like this:

   Full_Name           author_ID
   SVANTE ARRHENIUS      [5C5007F5, 76E05190]

Sorry if this is a very basic question. I've been stuck on it for a while and can't figure it out :(

Jelle Eitjes

1 Answer

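Let's say you have a DataFrame created as below. Grouping by 'Full_Name' and applying list to 'Author_ID' collects every ID for a name into a single list:

     import pandas as pd

     # Sample data: 'Ravi' appears twice with different IDs
     DF_obj = pd.DataFrame([['Ravi', 1234], ['Ragh', 12345], ['Ravi', 14567]])
     DF_obj.columns = ['Full_Name', 'Author_ID']

     # One entry per Full_Name, holding a list of its Author_IDs
     group_by = DF_obj.groupby('Full_Name')['Author_ID'].apply(list)
     group_by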

     Out[]
        Full_Name
        Ragh          [12345]
        Ravi    [1234, 14567]
        Name: Author_ID, dtype: object
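
The result above is a Series indexed by Full_Name. If you want a two-column DataFrame like the one in the question, a minimal sketch (assuming your columns are named exactly 'Full_Name' and 'author_ID', and using only the two rows you posted; the names df and result are just for illustration) could be:

```python
import pandas as pd

# The two rows shown in the question
df = pd.DataFrame({
    "Full_Name": ["SVANTE ARRHENIUS", "SVANTE ARRHENIUS"],
    "author_ID": ["5C5007F5", "76E05190"],
})

# Group by name, gather the IDs into a list,
# then move Full_Name from the index back into a regular column
result = df.groupby("Full_Name")["author_ID"].apply(list).reset_index()
print(result)
```

Here reset_index() only turns the grouped Series back into a DataFrame; the grouping step itself is the same as above.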
Scott Boston