I have written a function in Python:
import pandas as pd
from pandas import DataFrame, read_csv
import numpy as np
import os

def textFileReader(text_file_location, delimiter, header, columns_needed, *args):
    """Read a delimited text file into a DataFrame.

    When the file has a header and all columns are required, the first branch is executed.
    When the file has a header and selected columns are required, the second branch is executed.
    When the file has no header, the last branch is executed.
    """
    if header == "yes" and columns_needed == "all":
        file_name = pd.read_csv(text_file_location, sep=delimiter, header=0)
        col_names = [x.lower() for x in list(file_name.columns)]
        file_name.columns = col_names
    elif header == "yes" and columns_needed == "columns":
        file_name = pd.read_csv(text_file_location, sep=delimiter, header=0)
        file_name = file_name[columns]  # the NameError is raised here
        col_names = [x.lower() for x in list(file_name.columns)]
        file_name.columns = col_names
    else:
        # no header row: let pandas number the columns
        file_name = pd.read_csv(text_file_location, sep=delimiter, header=None)
    # drop rows that are entirely NA, then drop duplicates
    nas_dropped = file_name.dropna(axis=0, how='all')
    duplicates_removed = nas_dropped.drop_duplicates()
    # basic view of the data
    print(duplicates_removed.shape)
    print(duplicates_removed.dtypes)
    print(duplicates_removed.head())
    return duplicates_removed
I am using this function in another program, abc.py:
import pandas as pd
from pandas import DataFrame, read_csv
import numpy as np
import os
from textFileReader import textFileReader
path = os.path.join('/home','ubuntu','Nov_2014','amgen_content_recommendation_demo_data.tsv')
textFileReader.columns = ['Phy_Id','Specialty','Age']
demo_data = textFileReader(path,"\t","yes","columns")
I am getting this error:
NameError: global name 'columns' is not defined
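The same error seems to be reproducible without pandas. Below is a minimal sketch, where reader is a hypothetical stand-in for textFileReader: it assigns an attribute on the function object, as my program does, and then calls the function. As far as I can tell, the attribute is stored on the function but is not visible as a plain name inside its body. I wrote it for Python 3, where the message reads slightly differently (without the word "global").

```python
def reader():
    # `columns` is looked up as a local name, then as a global/builtin name.
    # The attribute set on the function object below is never consulted.
    return columns


# Same pattern as `textFileReader.columns = [...]` in abc.py.
reader.columns = ['Phy_Id', 'Specialty', 'Age']

try:
    reader()
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)    # a NameError message mentioning 'columns'
print(reader.columns)   # the attribute itself is still there
```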
I am using the value of columns in the elif branch of my function. What I am trying to do here is change the values assigned to columns, based on requirements, each time I read a text file. In another iteration, I may assign some other column names. Is there any way to do this?
Though the error I am getting is the same as in many previous questions, what I am trying to do here is different. I have used a local variable "columns" in my function, and I then assign a list to it in my main program, where the function is called.
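One alternative I am considering, as a rough sketch rather than a confirmed fix, is to pass the list as an explicit argument so that each call can choose its own columns. The function name read_selected, its columns parameter, and the in-memory sample data are all hypothetical and only stand in for my real function and file. This assumes Python 3 and a file with a header row.

```python
import io

import pandas as pd


def read_selected(source, delimiter, columns=None):
    """Read a delimited file with a header; keep only `columns` if given."""
    frame = pd.read_csv(source, sep=delimiter, header=0)
    if columns is not None:
        # Select the requested columns, in the order they were requested.
        frame = frame[columns]
    # Lower-case the column names, as the original function does.
    return frame.rename(columns=str.lower)


# Hypothetical tab-separated data standing in for the real .tsv file.
sample = io.StringIO(
    "Phy_Id\tSpecialty\tAge\tCity\n"
    "1\tCardio\t40\tX\n"
    "2\tOnco\t35\tY\n"
)

demo_data = read_selected(sample, "\t", columns=['Phy_Id', 'Specialty', 'Age'])
print(demo_data.columns.tolist())  # ['phy_id', 'specialty', 'age']
```

With this shape, another iteration would just pass a different list in the columns argument instead of assigning to an attribute of the function.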