dfply - Python - X name is undefined

Question

I'm using the dfply package in Python, which mimics R's dplyr. This is the simple code I'm trying to run. I have a dataset 'data' already loaded in my environment, and I just want to group by the CO_SPORTELLO variable.

    import dfply as dp
    data['CO_SPORTELLO']=data['CO_SPORTELLO'].apply(lambda x: str(x))
    data=(data >> 
          dp.group_by(X.CO_SPORTELLO))

The error I keep getting is: NameError: name 'X' is not defined.

From the package documentation:

The DataFrame as it is passed through the piping operations is represented by the symbol X. It records the actions you want to take (represented by the Intention class), but does not evaluate them until the appropriate time. Operations on the DataFrame are deferred. Selecting two of the columns, for example, can be done using the symbolic X DataFrame during the piping operations.

    diamonds >> select(X.carat, X.cut) >> head(3)

       carat      cut
    0   0.23    Ideal
    1   0.21  Premium
    2   0.23     Good
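
To make the "deferred" idea in the documentation excerpt concrete, here is a toy sketch of a symbolic object that records an attribute access and only evaluates it later. The `Deferred` class and the plain-dict table are hypothetical stand-ins for illustration; this is not dfply's actual Intention implementation.

```python
class Deferred:
    """Toy stand-in for a symbolic table reference (hypothetical, not dfply's code)."""

    def __init__(self, func):
        # func maps a concrete table to a value once evaluation is requested.
        self._func = func

    def evaluate(self, table):
        # Only here is any real data touched.
        return self._func(table)

    def __getattr__(self, name):
        # Called only for attributes that do not exist on the object.
        if name.startswith("_"):
            raise AttributeError(name)
        # Record "look up column `name`" without evaluating anything yet.
        return Deferred(lambda table: self._func(table)[name])


# The symbol must be bound to a name before it can be used,
# which is exactly what is missing when `X` raises NameError.
X = Deferred(lambda table: table)

table = {"carat": [0.23, 0.21, 0.23], "cut": ["Ideal", "Premium", "Good"]}

expr = X.carat                  # still a Deferred; no data involved so far
result = expr.evaluate(table)   # the lookup happens now
```

The point of the sketch is that `X` is an ordinary Python object: an expression such as `X.carat` can only be built once the name `X` is bound in the current namespace.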

well, you aren't defining `X` – Aidenhjj, Mar 26 '18 at 08:34

score 1 · Accepted Answer · answered Mar 26 '18 at 08:41

You need to run `from dfply import *`; that defines `X` in your namespace.

Alternatively, keep `import dfply as dp` and replace `X` with `dp.X` in your code.
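
The underlying rule is plain Python name binding, so it can be sketched with a standard-library module in place of dfply (an analogy only: `math` plays the role of `dfply`, `m` the role of `dp`, and `sqrt` the role of `X`). An aliased import binds just the alias, so names inside the module must be qualified or imported explicitly.

```python
import math as m   # binds only `m`, just as `import dfply as dp` binds only `dp`

# A bare name from inside the module is not available yet.
try:
    sqrt(4.0)
    bare_name_worked = True
except NameError:
    bare_name_worked = False

# Option 1: qualify the name with the alias (the analogue of dp.X).
qualified = m.sqrt(4.0)

# Option 2: import the name itself (the analogue of importing X from dfply).
from math import sqrt
unqualified = sqrt(4.0)
```

Applied to the question, that would mean writing `dp.group_by(dp.X.CO_SPORTELLO)`, or importing the names first and writing `group_by(X.CO_SPORTELLO)`.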

Aidenhjj


score 0 · Answer 2 · answered Mar 13 '20 at 07:09

There is no need to write 'dp.group_by'; plain 'group_by' works once the names have been imported with `from dfply import *`.

navin kharade

