I have a source DataFrame. For each value in the Name column, I need to loop through all of the Comments values grouped under that name and append the result as a new column in the DataFrame. Writing the result to a new DataFrame is also fine.

Input Data :

      Name            Comments
0     N-1               Good
1     N-2               bad
2     N-3               ugly
3     N-1         very very good
4     N-3          what is this
5     N-4            pathetic
6     N-1         needs improvement
7     N-2         this is not right
8     Ano-5      It is average

[9 rows x 2 columns]

For example: for all Comments values belonging to Name N-1, run a loop and add the output as a new column alongside the two existing ones (Name, Comments).

I tried the following and was able to group by Name, but I cannot iterate over all the Comments values in each group in order to append the output:

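gp = CommentsData.groupby('Name')  # group on the Name column shown above

# Iterating a groupby yields (group key, sub-DataFrame) pairs
for name, Data1 in gp:
    # Data1 holds every row for the current name
    #print(Data1)
    pass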

Inside the loop, the data for each group looks like this:

    Name        Comments
0   N-1           good
3   N-1     very very good
6   N-1      needs improvement


1   N-2           bad
7   N-2      this is not right

I am unable to access the values in the second column. Using df.iloc[i], I can only access the first element, not all of them (the number of elements varies from one Name to another).
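For reference, here is a minimal sketch of one way to reach every Comments value in each group. It rebuilds the sample data from the question; the dictionary name `comments_by_name` is only an illustration and is not part of the original code.

```python
import pandas as pd

# Sample data copied from the question
CommentsData = pd.DataFrame({
    "Name": ["N-1", "N-2", "N-3", "N-1", "N-3", "N-4", "N-1", "N-2", "Ano-5"],
    "Comments": ["Good", "bad", "ugly", "very very good", "what is this",
                 "pathetic", "needs improvement", "this is not right",
                 "It is average"],
})

# comments_by_name is a hypothetical helper dict used only for illustration
comments_by_name = {}
for name, group in CommentsData.groupby("Name"):
    # group is a DataFrame with only the rows for this Name;
    # selecting the column gives all of its comments, however many there are
    comments_by_name[name] = group["Comments"].tolist()

print(comments_by_name["N-1"])
```

Selecting the column by label (`group["Comments"]`) avoids positional indexing with `iloc`, so it does not matter how many rows each Name has.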

Now I want to use the values in Comments and add the output as an additional column in the DataFrame (it can be a new DataFrame).

Expected Output :

     Name            Comments                Result
0     N-1               Good                   A
1     N-2               bad                    B 
2     N-3               ugly                   C 
3     N-1         very very good               A
4     N-3          what is this                B
5     N-4            pathetic                  C 
6     N-1         needs improvement            C
7     N-2         this is not right            B
8     Ano-5         It is average              B

[9 rows x 3 columns]
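The question does not say how the Result values are computed, so the sketch below uses a made-up rule purely as a placeholder: a comment is labelled "A" if its word count is at least the average for its Name, otherwise "B". The function `label_group` and that rule are assumptions. The part that carries over is `groupby(...).transform(...)`, which runs a function on each group's comments and writes the results back in the original row order.

```python
import pandas as pd

# Sample data copied from the question
CommentsData = pd.DataFrame({
    "Name": ["N-1", "N-2", "N-3", "N-1", "N-3", "N-4", "N-1", "N-2", "Ano-5"],
    "Comments": ["Good", "bad", "ugly", "very very good", "what is this",
                 "pathetic", "needs improvement", "this is not right",
                 "It is average"],
})

def label_group(comments: pd.Series) -> pd.Series:
    """Hypothetical per-group rule; replace with the real processing."""
    word_counts = comments.str.split().str.len()
    # Compare each comment with the average length inside its own group
    is_long = word_counts >= word_counts.mean()
    return is_long.map({True: "A", False: "B"})

# transform calls label_group once per Name and aligns the output
# with the original index, so it can be assigned as a new column
CommentsData["Result"] = (
    CommentsData.groupby("Name")["Comments"].transform(label_group)
)
print(CommentsData)
```

Because `transform` returns one value per input row, no manual loop or index bookkeeping is needed to attach the new column.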
languitar
Shankar Pandey
  • Are you looking for `apply()` ? – Jan Oct 28 '16 at 11:46
  • @Jan - Thanks. Yes, I was looking for something like this. When I tried apply() : `Data2 = Data1.apply(str,axis=1)` `print(Data2)` I am getting weird output in the following format : `2 Name N-1\nComments ...` `16 Name N-1\nComments...` – Shankar Pandey Oct 28 '16 at 12:00
  • Check pivot, in this answer: http://stackoverflow.com/questions/22798934/pandas-long-to-wide-reshape#35087831 - You just want to make a long table wide. – kabanus Oct 28 '16 at 13:15

1 Answer

You can use apply and reset_index to spread each Name's comments across columns:

df.groupby('Name').Comments.apply(pd.DataFrame.reset_index, drop=True).unstack()
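As a cross-check, the same long-to-wide reshape can be sketched with `cumcount` and `pivot`. This is an alternative formulation, not the answer's own code, and the column name `position` is an assumption introduced for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Name": ["N-1", "N-2", "N-3", "N-1", "N-3", "N-4", "N-1", "N-2", "Ano-5"],
    "Comments": ["Good", "bad", "ugly", "very very good", "what is this",
                 "pathetic", "needs improvement", "this is not right",
                 "It is average"],
})

# Number the comments within each Name: 0 for the first, 1 for the second, ...
position = df.groupby("Name").cumcount()

# One row per Name, one column per comment position;
# Names with fewer comments get NaN in the unused columns
wide = (
    df.assign(position=position)
      .pivot(index="Name", columns="position", values="Comments")
)
print(wide)
```

Each row of `wide` then holds all comments for one Name, which makes them easy to process together.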

piRSquared
  • What I've ended up with is a flat list like this: `[array([u'N-1'], dtype=object), u'Good', 'A', array([u'N-2'], dtype=object), u'Bad', 'B']` Each row of data is split into 3 elements (N-1, Good, A), and the next row follows immediately, which makes it difficult to separate row-1 data from row-2 data (N-2, Bad, B). I'm now stuck on how to read 3 elements at a time and use them. – Shankar Pandey Oct 29 '16 at 12:00
  • I've solved it, thanks. I looped through the elements and then joined the related ones. Thanks for your responses. – Shankar Pandey Nov 01 '16 at 21:11
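The asker's final loop is not shown. The following is only a sketch of one way to read a flat list three elements at a time, assuming the names have already been unwrapped from their NumPy arrays into plain strings; the variable names `flat`, `rows` and `result` are hypothetical.

```python
import pandas as pd

# Flat list in the shape described in the comment above
# (assumption: names are plain strings rather than NumPy arrays)
flat = ["N-1", "Good", "A", "N-2", "Bad", "B"]

# Step through the list in strides of 3 and slice out one row each time
rows = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

# Rebuild a DataFrame with one row per (Name, Comments, Result) triple
result = pd.DataFrame(rows, columns=["Name", "Comments", "Result"])
print(result)
```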