I have a class inside a .py file that contains two functions.

new_column generates a new column based on values of existing columns (Col1 and Col2).

transform_df applies the previous function to a dataframe, via a lambda expression.

The end result looks like this:

  Col1 Col2  col3
0    a    b    ab
1    a    b    ab
2    a    c  None
3    a    b    ab

How do I use these functions in a new file/notebook?

# kept within file df_functions.py
import pandas as pd

class Functions:

    def __init__(self, path):
        self.path = path # path to .csv

    # function to create new column, based on Col1 and Col2 values
    def new_column(self, row):
        if row['Col1'] == 'a' and row['Col2'] == 'b':
            return 'ab'

    # apply previously defined function via lambda expression
    # note: the `path` argument is accepted but unused; the file is read from self.path
    def transform_df(self, path):
        df = pd.read_csv(self.path)

        # apply function 'new_column' to df
        df['col3'] = df.apply(lambda row: self.new_column(row), axis=1)

        # other potential functions applications here

        return df

I have tried the following:

from df_functions import Functions

df_path = '../datafile.csv'
FunctionsObject = Functions(path=df_path)

new_df = FunctionsObject.transform_df(path=df_path)

However, this raises:

NameError: ("name 'new_column' is not defined", 'occurred at index 0')
Thomas
  • `FunctionsObject.transform_df()` must have caused the error, since `transform_df()` requires a `path` parameter. This shows that your code has a few missing statements, without which we cannot determine what went wrong. – Ja8zyjits Mar 20 '19 at 09:48
  • What is the point of having the two functions be part of a class? You are passing the path to `transform_df` anyway. – FlyingTeller Mar 20 '19 at 09:49
  • Additionally, there is an inconsistency in your class name. Is it called `Functions` or is it called `FunnelFunctions`? – FlyingTeller Mar 20 '19 at 09:50
  • I suggest adding pandas tag to the question, because it is specific to the library – ikamen Mar 20 '19 at 09:55
  • Correcting the points @FlyingTeller mentioned, your code works as expected here. I did not get a NameError. – AlCorreia Mar 20 '19 at 09:58
  • Please provide a minimal example that allows us to reproduce your error. There are currently several inconsistencies in your code. For example, you import ``Functions`` but use ``FunnelFunctions``, and you define ``transform_df`` as taking a ``path`` parameter but do not supply one. – MisterMiyagi Mar 20 '19 at 09:58
  • In what kind of environment are you running your code? What python version? A ``NameError`` should only contain the "name <> is not defined" part, not a tuple with the "occurred at index 0" part attached. – MisterMiyagi Mar 20 '19 at 10:13
  • @AlCorreia I corrected the inconsistencies too, however the example only began working when I restarted the kernel. Seems that I needed to restart the kernel after making changes to the class in the .py file. – Thomas Mar 20 '19 at 16:14
  • @ThomasBloomfield, that makes sense. If you want to keep your modules up to date in your notebook, you can use the [autoreload magic](https://stackoverflow.com/a/5399339/8107620). – AlCorreia Jun 14 '19 at 09:52

1 Answer

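This appears to have been caused by editing the .py file containing the class after it had already been imported: a running kernel keeps using the version of the module it imported first. After making changes, I restarted the kernel and the imported functions worked as expected.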
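For anyone who would rather not restart the kernel, here is a minimal sketch of the same stale-import behaviour and of picking up an edit with `importlib.reload`. The module name `demo_functions` and its `label` function are hypothetical stand-ins for `df_functions.py`, written to a temporary directory only so that the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip writing .pyc files so the reload below always re-reads the source.
sys.dont_write_bytecode = True

# Hypothetical throwaway module, standing in for df_functions.py.
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "demo_functions.py"
module_file.write_text("def label():\n    return 'old'\n")

# Make the temporary directory importable.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import demo_functions

first = demo_functions.label()

# Simulate editing the .py file while the session is still running.
module_file.write_text("def label():\n    return 'new version'\n")

# A second import statement does not re-read the file: the module is
# already cached in sys.modules, so the old code is still used.
import demo_functions

stale = demo_functions.label()

# Explicitly reloading re-executes the module's source.
importlib.reload(demo_functions)
fresh = demo_functions.label()

print(first, stale, fresh)
```

Note that objects created before the reload (such as `FunctionsObject` in the question) still refer to the old class, so they need to be recreated afterwards. In a notebook, running `%load_ext autoreload` followed by `%autoreload 2`, as suggested in the comments, should have a similar effect automatically.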

Thomas