I'm using the dfply package in Python, which mimics the dplyr package in R.
This is the simple code I'm trying to run. I already have a dataset, 'data', loaded in my environment, and I just want to group it by the CO_SPORTELLO variable.
import dfply as dp
data['CO_SPORTELLO']=data['CO_SPORTELLO'].apply(lambda x: str(x))
data=(data >>
dp.group_by(X.CO_SPORTELLO))
The error I keep getting is: NameError: name 'X' is not defined
From the package documentation:
The DataFrame as it is passed through the piping operations is represented by the symbol X. It records the actions you want to take (represented by the Intention class), but does not evaluate them until the appropriate time. Operations on the DataFrame are deferred. Selecting two of the columns, for example, can be done using the symbolic X DataFrame during the piping operations.
diamonds >> select(X.carat, X.cut) >> head(3)
   carat      cut
0   0.23    Ideal
1   0.21  Premium
2   0.23     Good
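To make the "deferred" idea in the quoted documentation concrete, here is a minimal sketch of a symbolic placeholder that records an attribute access and only evaluates it later. The `Deferred` class and its `evaluate` method are hypothetical names invented for this illustration. This is not dfply's actual `Intention` implementation, just a toy with the same general shape.

```python
from types import SimpleNamespace


class Deferred:
    """Toy stand-in for a symbolic placeholder (hypothetical, not dfply's Intention)."""

    def __init__(self, func=lambda obj: obj):
        # func describes how to turn a concrete object into a result later on.
        self._func = func

    def __getattr__(self, name):
        # Only called for attributes that are not found normally.
        if name.startswith("_"):
            raise AttributeError(name)
        # Record the attribute access instead of performing it now.
        return Deferred(lambda obj: getattr(self._func(obj), name))

    def evaluate(self, obj):
        # Apply the recorded actions to a concrete object.
        return self._func(obj)


# The symbol is an ordinary Python name that must exist before it is used.
X = Deferred()

expr = X.carat                                # nothing is looked up yet
row = SimpleNamespace(carat=0.23, cut="Ideal")  # stand-in for real data
result = expr.evaluate(row)                   # the lookup happens here
print(result)
```

In this sketch `X` is just a module-level object, so it has to be defined (or imported) in the current namespace before it is referenced. Referring to a name that was never bound is what makes Python raise `NameError`.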