
I'm using JupyterLab and am trying to save a function as a module in my current working directory (cwd), so that I can import and call it from a separate file.

After reading other similar posts, I saved it as outliers.py and placed it in my cwd.

This is the function:

```python
import numpy as np
import pandas

def remove_pps_outliers(df):
    """Keep rows whose price_per_sqft lies within one standard deviation of the mean for their location."""
    df_out = pandas.DataFrame()  # accumulates the filtered rows
    for key, subdf in df.groupby('location'):  # one sub-DataFrame per location
        m = np.mean(subdf.price_per_sqft)  # mean for this location
        st = np.std(subdf.price_per_sqft)  # standard deviation for this location

        # Keep rows in the interval (m - st, m + st], i.e. within one standard
        # deviation of the mean (think of a normal distribution curve).
        reduced_df = subdf[(subdf.price_per_sqft > (m - st)) & (subdf.price_per_sqft <= (m + st))]

        # Append this location's filtered rows to the running result.
        df_out = pandas.concat([df_out, reduced_df], ignore_index=True)
    return df_out
```

This is how I called it:

from outliers import *


df7 = remove_pps_outliers(df6)
df7.shape

And this is the error I'm getting:

~\outliers.py in remove_pps_outliers(df)
      1 import pandas
----> 2 
      3 def remove_pps_outliers(df):
      4     df_out = pandas.DataFrame()  #taking new dataframe as output
      5     for key, subdf in df.groupby('location'): # grouping by location

NameError: name 'pd' is not defined

Help?
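
On the NameError itself: a name used inside a function is looked up in the globals of the module where that function was defined, not in the notebook that calls it. So an alias such as `pd` has to be defined by an import inside the module file. A minimal sketch of that behaviour, assuming two throwaway modules written to a temporary directory (`demo_scope_bad` and `demo_scope_good` are hypothetical names, and the standard-library `math` stands in for pandas so the snippet runs anywhere):

```python
import importlib
import math  # imported by the caller only; the modules below never see it
import sys
import tempfile
from pathlib import Path

# Temporary directory that plays the role of the cwd holding the module files.
module_dir = Path(tempfile.mkdtemp())
sys.path.insert(0, str(module_dir))

# Hypothetical module that uses `math` without importing it.
(module_dir / "demo_scope_bad.py").write_text(
    "def root(x):\n"
    "    return math.sqrt(x)\n"
)
# Same function, but the module imports what it uses.
(module_dir / "demo_scope_good.py").write_text(
    "import math\n"
    "\n"
    "def root(x):\n"
    "    return math.sqrt(x)\n"
)
importlib.invalidate_caches()  # the files were created at runtime

from demo_scope_bad import root as bad_root
from demo_scope_good import root as good_root

try:
    bad_root(9.0)
    error_message = None
except NameError as exc:
    # The caller's `import math` does not help the module.
    error_message = str(exc)

good_result = good_root(9.0)
print(error_message)
print(good_result)
```

The same rule applies to any alias: if a module's code refers to `pd` or `np`, the module itself needs the matching `import ... as ...` line.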

Dimmak
  • That code did not produce that error. Show us the whole traceback. It IS traditional to do `import pandas as pd`, but it's not absolutely required, as long as you don't call it `pd`. – Tim Roberts May 12 '21 at 18:42
  • `from outliers import *`, then `df7 = remove_pps_outliers(df6)` and `df7.shape` – Dimmak May 12 '21 at 18:44
  • I did try using just `import pandas` and changing every "pd" to "pandas", but it didn't work – Dimmak May 12 '21 at 18:47
  • If you have more information, please edit your question. Code can't be formatted in comments. – Tim Roberts May 12 '21 at 18:47
  • OK, I placed the traceback into the question. Let me know if you need more information; sorry, I'm new to Stack Overflow – Dimmak May 12 '21 at 18:57
  • The "traceback" means the entire error message, showing what was called from where. As I said, the code you are showing us COULD NOT have produced the error you are showing. – Tim Roberts May 12 '21 at 19:02
  • I guess that the problem is not with the code. I know that beginners often assume that if they edit the function code in a file and then re-run the cell with the import, the updated function will be loaded. It won't. You will have to restart the kernel or use the autoreload IPython magic to see it updated. – krassowski May 12 '21 at 19:05
  • For autoreload reference: https://stackoverflow.com/questions/5364050/reloading-submodules-in-ipython – krassowski May 12 '21 at 19:06
  • @krassowski I tried restarting the kernel, and I also tried saving the file in the Spyder IDE in case JupyterLab has issues with it. I'm not sure how to get the traceback, but I think the issue is that the module I'm trying to call is itself calling pandas? – Dimmak May 12 '21 at 20:08
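
The stale-import behaviour krassowski describes above can be reproduced outside a notebook. The sketch below is only an illustration under stated assumptions: `demo_reload` is a hypothetical throwaway module written to a temporary directory, not the asker's outliers.py. It shows that repeating an `import` statement returns the copy cached in `sys.modules`, while `importlib.reload` re-executes the file from disk.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

# Temporary directory that plays the role of the notebook's cwd.
module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "demo_reload.py"  # hypothetical module name
sys.path.insert(0, str(module_dir))

# First version of the module on disk.
module_path.write_text("def version():\n    return 'old'\n")
importlib.invalidate_caches()  # the file was created at runtime

import demo_reload
first = demo_reload.version()

# Edit the file on disk, as one would in an editor.
module_path.write_text("def version():\n    return 'new, edited on disk'\n")

# Importing again does nothing: the module is served from sys.modules.
import demo_reload
stale = demo_reload.version()

# reload() re-executes the source file and updates the module object in place.
importlib.reload(demo_reload)
fresh = demo_reload.version()

print(first, stale, fresh, sep=" | ")
```

In a notebook, running `%load_ext autoreload` followed by `%autoreload 2` asks IPython to do this reloading automatically before each cell. Note that names already copied into the notebook by `from module import *` still point at the old objects after a manual `importlib.reload`, so the `from ... import` line has to be run again. A stale in-memory module would also be consistent with a traceback whose displayed source does not match the error, since the source lines shown are read from the file on disk rather than from the code that is actually running.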

0 Answers