def unique_unit_split(df):
    df_unit_list = df_master.loc[df_master['type'] == 'unit']
    df_unit_list = df_unit_list.key.tolist()

    for i in range(len(df_unit_list)):
        df_unit_list[i] = int(df_unit_list[i])

    split_1 = df_units.units.str.split('[","]',expand=True).stack()
    df_units_update = df_units.join(pd.Series(index=split_1.index.droplevel(1), data=split_1.values, name='unit_split'))
    df_units_final = df_units_update[df_units_update['unit_split'].isin(df_unit_list)]

    return(df) 

Updated script (still not working):

df_unit_list = []
split_1 = pd.DataFrame()
df_units_update = pd.DataFrame()
df_units_final = pd.DataFrame()

def unique_unit_split(df):
    df_unit_list = df_master.loc[df_master['type'] == 'unit']
    df_unit_list = df_unit_list.key.tolist()

    for i in range(len(df_unit_list)):
        df_unit_list[i] = int(df_unit_list[i])

    split_1 = df_units.units.str.split('[","]',expand=True).stack()
    df_units_update = df_units.join(pd.Series(index=split_1.index.droplevel(1), data=split_1.values, name='unit_split'))
    df_units_final = df_units_update[df_units_update['unit_split'].isin(df_unit_list)]

    return(df)

The function above originally worked when the two steps were split into separate functions: everything up to and including the for loop was in one function, and everything from split_1 down was in another. Now that I have tried to condense them into one function, I am getting a NameError (image attached). Does anyone know how I can resolve this and ensure my final df (df_units_final) is defined?

For more insight on this function: I have a df with comma-separated values in one column. I needed to split that column, drop the [] characters, and keep only the rows with the numbers I need, which are defined in the list df_unit_list. [image: NameError Details]

  • The line of code where the error is coming from is not part of the code you posted. While merging the methods you probably missed the reference to `df_units_final`, so you need to figure out how that variable fits into the whole flow of things. Also, the method in your post doesn't seem to do anything: it just returns `df` (its argument) without ever touching it. It creates a new local called `df_units_final`, but that has nothing to do with any `df_units_final` outside the function – rdas Feb 05 '20 at 22:04
  • `df_units_final` isn't defined outside the function, so you can't pass it in as an argument to the function – G. Anderson Feb 05 '20 at 22:04
  • Does this answer your question? [Short description of the scoping rules?](https://stackoverflow.com/questions/291978/short-description-of-the-scoping-rules) – G. Anderson Feb 05 '20 at 22:04
  • I just tried creating blank dataframes but it's still not working: `df_unit_list = []`, `split_1 = pd.DataFrame()`, `df_units_update = pd.DataFrame()`, `df_units_final = pd.DataFrame()` – Amanda Wishnie Feb 05 '20 at 22:08
  • @rdas - I was trying to create a new dataframe through the function as an output but should I not do that and use generalized terms instead? – Amanda Wishnie Feb 05 '20 at 22:11
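
To illustrate the scoping point raised in the comments, here is a minimal sketch. The helper name `build_filtered` and the sample data are made up for the example and are not the asker's real frames. A name assigned inside a function is local to it, so the usual pattern is to return the new frame and bind it at the call site:

```python
import pandas as pd

def build_filtered(df, wanted):
    # `result` is local: it exists only while this function runs
    result = df[df["unit_split"].isin(wanted)]
    # Returning it is how the caller gets access to the new frame
    return result

# Hypothetical stand-in for the real data
df_units = pd.DataFrame({"unit_split": ["1", "2", "3"]})

# The assignment here, not anything inside the function,
# is what defines df_units_final at module level
df_units_final = build_filtered(df_units, ["1", "3"])
print(df_units_final)
```

Pre-creating an empty `df_units_final` at module level does not help, because assigning to that name inside the function creates a separate local variable rather than updating the outer one.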

1 Answer


The issue was the one stated above (df_units_final was never defined outside the function), and in addition my for loop was forcing the list values to int when the values in the other df were actually strings.
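
As a rough illustration of the type mismatch (the sample values and the names `int_keys` and `str_keys` are invented for this sketch), `isin` compares values as they are, so a list of ints matches nothing in a column of strings:

```python
import pandas as pd

# Hypothetical data: values produced by str.split are strings
df_units = pd.DataFrame({"unit_split": ["101", "102", "103"]})

int_keys = [101, 103]                  # what the int() loop produced
str_keys = [str(k) for k in int_keys]  # same keys kept as strings

# "101" != 101, so filtering with the int list keeps no rows
no_match = df_units[df_units["unit_split"].isin(int_keys)]

# With matching types the expected rows are kept
match = df_units[df_units["unit_split"].isin(str_keys)]

print(len(no_match), len(match))
```

Either side can be converted, as long as the list and the column end up holding the same type.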

Working Code