Although the question may not be very clear, I still think posting an answer is better than deleting it.
As the results above show, when transform was applied to the whole GroupBy object, the function was applied to each group's whole Series and the resulting value was duplicated across that group's rows. When I applied it to an individual Series or group, it ran the function on each single element, i.e. like the apply function of a Series.
After searching through the documentation and looking at the output of a custom function (shown below), this is my understanding.
The GroupBy transform passes each group object directly to the function and then checks the output: either it matches the length of the passed object, or it is a scalar, in which case the output is expanded to that length.
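To make the GroupBy behaviour concrete, here is a minimal sketch. The DataFrame, the column names `key` and `val`, and the helper `join_group` are made up for illustration and are not from the question. The function records the type it receives and returns one scalar per group, which transform then expands to the group's length.

```python
import pandas as pd

# Hypothetical sample data; "key" and "val" are illustrative names.
df = pd.DataFrame({"key": ["x", "x", "y"], "val": ["a", "b", "c"]})

seen_types = []

def join_group(val):
    # Record what GroupBy.transform hands to the function.
    seen_types.append(type(val).__name__)
    # Return a single scalar string for the whole group.
    return ",".join(val.tolist())

result = df.groupby("key")["val"].transform(join_group)

print(seen_types)       # every entry is "Series": one call per group
print(result.tolist())  # the scalar is repeated for each row of its group
```

The result has the same length and index as the original column, even though the function returned only one value per group.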
Series transform, on the other hand, first tries to call the function element by element through apply, and only if that fails does it call the function on the whole object.
This is what I gathered from reading the source code. You can also see it in the output below: I created a function and called it through both transforms.
def func(val):
    print(type(val))
    return ','.join(val.tolist())
# For series transforms
<class 'str'>
<class 'str'>
# For groupby transforms
<class 'pandas.core.series.Series'>
Now, if I modify the function so that it works only on a whole Series object and not on individual strings, observe how the Series transform behaves:
# Modified function (works only on a Series, not on individual strings)
def func(val):
    print(type(val))
    return val.str.split().str[0]
# For Series transforms
<class 'str'>
<class 'pandas.core.series.Series'>
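For anyone who wants to reproduce the fallback, here is a small self-contained sketch. The sample Series and the name `first_word` are my own assumptions, and the fallback is an internal detail I saw in the source, so it may differ between pandas versions. Instead of printing, the function records each type it receives.

```python
import pandas as pd

# Hypothetical sample values for illustration.
s = pd.Series(["foo bar", "baz qux"])

seen_types = []

def first_word(val):
    # Record what Series.transform hands to the function on each call.
    seen_types.append(type(val).__name__)
    # The .str accessor exists only on a Series; a plain str raises AttributeError.
    return val.str.split().str[0]

result = s.transform(first_word)

print(seen_types)       # a str attempt first, then the whole Series
print(result.tolist())  # first word of each element
```

The first recorded type is the failed element-wise attempt on a single string; the last one is the retry on the whole Series, which is the call that produces the result.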