11

I'm trying to replicate the code that is provided here: https://github.com/IdoZehori/Credit-Score/blob/master/Credit%20score.ipynb

The function given below fails with an error. Can someone help me resolve it?

def replaceOutlier(data, method = outlierVote, replace='median'):
    '''replace: median (auto)
                'minUpper' which is the upper bound of the outlier detection'''
    vote = outlierVote(data)
    x = pd.DataFrame(zip(data, vote), columns=['annual_income', 'outlier'])
    if replace == 'median':
        replace = x.annual_income.median()
    elif replace == 'minUpper':
        replace = min([val for (val, vote) in list(zip(data, vote)) if vote == True])
        if replace < data.mean():
            return 'There are outliers lower than the sample mean'
    debtNew = []
    for i in range(x.shape[0]):
        if x.iloc[i][1] == True:
            debtNew.append(replace)
        else:
            debtNew.append(x.iloc[i][0])

    return debtNew

Function Call:

incomeNew = replaceOutlier(df.annual_income, replace='minUpper')

Error:

x = pd.DataFrame(zip(data, vote), columns=['annual_income', 'outlier'])
TypeError: data argument can't be an iterator

PS: I understand this has been asked before, but I tried the suggested techniques and the error still remains.

user4943236

5 Answers

19

zip returns an iterator in Python 3, and this DataFrame constructor does not accept an iterator directly. Pass the result as a list, i.e.:

x = pd.DataFrame(list(zip(data, vote)), columns=['annual_income', 'outlier'])

Edit (from bayethierno's answer):
Since pandas 0.24.0, there is no need to build a list from the zip anymore; the following statement is valid:

x = pd.DataFrame(zip(data, vote), columns=['annual_income', 'outlier'])
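As a quick check, here is a minimal sketch of the list(zip(...)) form. The income values and outlier flags are made up for illustration; in the question they would come from df.annual_income and outlierVote.

```python
import pandas as pd

# Made-up sample values (assumption): stand-ins for df.annual_income
# and for the boolean flags returned by outlierVote.
data = [42000, 58000, 1250000, 61000]
vote = [False, False, True, False]

# Materialising the zip as a list of tuples avoids handing pandas an iterator.
x = pd.DataFrame(list(zip(data, vote)), columns=['annual_income', 'outlier'])
print(x)

# Alternative that sidesteps zip entirely: one dict entry per column.
y = pd.DataFrame({'annual_income': data, 'outlier': vote})
print(y)
```

Both forms build the same two-column frame; the dict form may read more clearly when the columns already exist as separate sequences.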
PRMoureu
2

This actually works in pandas version 0.24.2 without having to wrap zip in list.

bayethierno
1

Write it like this:

coef = DataFrame(list(zip(x.columns,np.transpose(log_model.coef_))))
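To make that line runnable on its own, here is a small sketch with stand-ins. The feature names and the coefficient array are hypothetical; the array is only assumed to have the (1, n_features) shape that a fitted binary model's coef_ would have.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins (assumption): x.columns and log_model.coef_
# from the line above are replaced by hand-written values.
feature_names = ["age", "annual_income", "debt"]
coef_ = np.array([[0.5, -1.2, 2.0]])  # shape (1, 3)

# np.transpose turns (1, 3) into (3, 1), so zip pairs each feature name
# with a one-element array holding its coefficient.
pairs = list(zip(feature_names, np.transpose(coef_)))
coef = pd.DataFrame(pairs, columns=["feature", "coefficient"])
print(coef)
```

Each cell in the coefficient column holds a one-element array; flattening coef_ first would give plain floats instead.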
Nimeshka Srimal
0

This happens because of a data type issue: the DataFrame constructor received an iterator. Convert it to a list first, then build the DataFrame from that list. For example, pd.DataFrame(list(data)) should work.
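To see why the type matters, this small sketch (plain Python, made-up values) shows that zip yields a single-pass iterator rather than a list:

```python
# Made-up sample values for illustration.
data = [10, 20, 30]
vote = [False, True, False]

pairs = zip(data, vote)

# In Python 3 this is a lazy zip object, not a list.
print(type(pairs).__name__)

# The first pass consumes the iterator and collects every pair.
first_pass = list(pairs)

# A second pass finds nothing left, because iterators cannot be rewound.
second_pass = list(pairs)

print(first_pass)
print(second_pass)
```

Calling list() once up front gives a concrete sequence that can be inspected and reused as often as needed.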

0

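zip(list1, list2) works in my Jupyter Notebook, but I find that list(zip(list1, list2)) is required when running the same code with the default Python interpreter. Given the answers above, the difference most likely comes from the pandas version installed in each environment rather than from Jupyter itself.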
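One way to compare two environments is to print the pandas version in each. This is only a sketch: the 0.24 threshold is taken from the other answers here, and the variable names are made up.

```python
import pandas as pd

# Show which pandas this environment actually imports.
print(pd.__version__)

# Compare only the major and minor parts of the version string.
major, minor = (int(part) for part in pd.__version__.split(".")[:2])

# Assumption from the answers in this thread: 0.24.0 and later
# accept a bare zip object as DataFrame data.
accepts_iterators = (major, minor) >= (0, 24)
print("bare zip accepted:", accepts_iterators)
```

Run it once in the notebook and once in the script environment; if the results differ, that would explain the behaviour.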

Tomerikoo
code-freeze