I have a dataframe of the following structure:
mydf:

       Entry         Address ShortOrdDesc
    0    988  Fake Address 1   SC_M_W_3_1
    1    989  Fake Address 2   SC_M_W_3_3
    2    992  Fake Address 3        nan_2
    3    992                   SC_M_G_1_1
    4    992                   SC_M_O_1_1
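To make this reproducible, here is a minimal sketch that rebuilds the sample frame. It assumes the blank Address cells in rows 3 and 4 are NaN rather than empty strings; the listing does not show which, so treat that as an assumption.

```python
import numpy as np
import pandas as pd

# Sample data copied from the listing above.
# Assumption: the blank Address cells are missing values (NaN), not "".
mydf = pd.DataFrame({
    "Entry": [988, 989, 992, 992, 992],
    "Address": ["Fake Address 1", "Fake Address 2", "Fake Address 3",
                np.nan, np.nan],
    "ShortOrdDesc": ["SC_M_W_3_1", "SC_M_W_3_3", "nan_2",
                     "SC_M_G_1_1", "SC_M_O_1_1"],
})

# Column dtypes: Entry is integer, the other two are string-like columns.
print(mydf.dtypes)
```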
I need to combine the rows that share the same Entry. Within such a group, only the first row has an Address. For each Entry I need to concatenate the ShortOrdDesc values and the Address values. I found a very useful link on this:
Pandas groupby: How to get a union of strings
Working from this, I developed the following function:

    import pandas as pd

    def f(x):
        return pd.Series(dict(A = x['Entry'].sum(),
                              B = x['Address'].sum(),
                              C = "%s" % '; '.join(x['ShortOrdDesc'])))
It is applied using:

    myobj = ordersToprint.groupby('Entry').apply(f)

This raises the error:

    TypeError: must be str, not int
Looking at my data, I don't see what the issue is: as far as I can tell, running .sum() on the integer values of 'Entry' should work.
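One way to narrow this down is to run each of the three aggregations on its own for a single group and see which one raises. The sketch below does that on a stand-in frame; the helper name try_step is hypothetical, and the blank addresses are assumed to be NaN. I have not listed the output because it may depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Stand-in for the real data; blank Address cells are assumed to be NaN.
df = pd.DataFrame({
    "Entry": [988, 989, 992, 992, 992],
    "Address": ["Fake Address 1", "Fake Address 2", "Fake Address 3",
                np.nan, np.nan],
    "ShortOrdDesc": ["SC_M_W_3_1", "SC_M_W_3_3", "nan_2",
                     "SC_M_G_1_1", "SC_M_O_1_1"],
})

def try_step(label, func):
    """Run one aggregation in isolation; return (label, error message or None)."""
    try:
        func()
        return (label, None)
    except Exception as exc:  # report any failure instead of stopping
        return (label, f"{type(exc).__name__}: {exc}")

# The only group with more than one row, and the only one with missing addresses.
group = df[df["Entry"] == 992]

results = [
    try_step("Entry.sum()", lambda: group["Entry"].sum()),
    try_step("Address.sum()", lambda: group["Address"].sum()),
    try_step("'; '.join(ShortOrdDesc)", lambda: "; ".join(group["ShortOrdDesc"])),
]

for label, err in results:
    print(label, "->", "ok" if err is None else err)
```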
What is the error in my code or my approach?
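For concreteness, this is roughly the result I am after, sketched with groupby(...).agg instead of my function f. Note the swap: it takes the first non-missing Address per Entry rather than summing the column, and it again assumes the blank addresses are NaN. The name combined is only illustrative.

```python
import numpy as np
import pandas as pd

# Stand-in for the real data; blank Address cells are assumed to be NaN.
df = pd.DataFrame({
    "Entry": [988, 989, 992, 992, 992],
    "Address": ["Fake Address 1", "Fake Address 2", "Fake Address 3",
                np.nan, np.nan],
    "ShortOrdDesc": ["SC_M_W_3_1", "SC_M_W_3_3", "nan_2",
                     "SC_M_G_1_1", "SC_M_O_1_1"],
})

# One row per Entry: keep the first non-missing Address and join the
# ShortOrdDesc strings with "; ".
combined = (
    df.groupby("Entry", as_index=False)
      .agg({"Address": "first",
            "ShortOrdDesc": lambda s: "; ".join(s)})
)

print(combined)
```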