I am working in a Jupyter notebook on an AWS SageMaker instance. For convenience, I wrote a .py file with a couple of functions, defined as follows:
# Function to gather the percent of accts in each label/feature combo
def compute_pct_accts(data, label_cnt):
    """
    data is the output from aggregate_count
    label_cnt gives the breakdown of data for each target value
    """
    label_data_combined = pd.merge(data, label_cnt, how='inner', left_on='label', right_on='label')
    label_data_combined['Act_percent'] = np.round((label_data_combined['ACT_CNT'] / label_data_combined['Total_Cnt']) * 100, 2)
    return label_data_combined
# Function to perform aggregation for the target and feature columns
def aggregate_count(df, var, target):
    """
    df is the dataframe
    var is the feature name
    target is the label variable (0 or 1)
    """
    label_var_cnt = df.groupby([var, target], observed=True)['ID'].count()
    label_var_cnt = label_var_cnt.reset_index()
    label_var_cnt.rename(columns={'ID': 'ACT_CNT'}, inplace=True)
    return label_var_cnt
Both functions are stored in a file called file1.py. To use them in my notebook, I typed:
from file1 import *
import pandas as pd
This did import both functions. But when I tried to run one of them:
compute_pct_accts(GIACT_Match_label_cnt, label_cnt)
I get a NameError:
pd not found
Please note that I have imported pandas as pd in my Jupyter notebook. I am aware of the option
%run -i compute_pct_accts_new.py
but that forces me to write a new Python file for that one function. My question is: can we have a single Python file with all the functions defined in it, so that we can import them all at once and use them interactively in the notebook? Any help is appreciated.
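To make the behavior concrete, here is a minimal sketch that seems to reproduce what I am seeing, without needing pandas. It is only an illustration under some assumptions: the standard-library `math` module stands in for pandas, and the module names `helpers_no_import` and `helpers_with_import` are hypothetical files written to a temporary directory, not my real file1.py.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

import math  # imported in the caller only, like `import pandas as pd` in the notebook


def write_module(directory, name, source):
    """Write a small helper module (hypothetical, for illustration) to disk."""
    path = Path(directory) / f"{name}.py"
    path.write_text(textwrap.dedent(source))
    return path


# Put the temporary directory on the import path so the modules can be imported.
tmp_dir = tempfile.mkdtemp()
sys.path.insert(0, tmp_dir)

# Variant 1: the module uses `math` but never imports it (like my file1.py and `pd`).
write_module(tmp_dir, "helpers_no_import", """
    def root(x):
        return math.sqrt(x)
""")

# Variant 2: same function, but the module imports what it uses at the top.
write_module(tmp_dir, "helpers_with_import", """
    import math

    def root(x):
        return math.sqrt(x)
""")

importlib.invalidate_caches()
from helpers_no_import import root as root_broken
from helpers_with_import import root as root_ok

# The caller has `math` imported, yet the first variant still fails,
# because the function looks names up in its own module's globals.
try:
    root_broken(9.0)
    error_message = None
except NameError as exc:
    error_message = str(exc)

result = root_ok(9.0)

print("without import in module:", error_message)
print("with import in module:", result)
```

If this reflects what happens with file1.py, then `from file1 import *` does bring the functions into the notebook, but they still look up `pd` and `np` in file1's own namespace rather than the notebook's, so the imports in the notebook would not be visible to them.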