0

I tried to use pandas.DataFrame.replace() to fill empty strings in a frame via the code below.

main_datasets = [train_data, test_data, alt_data1, alt_data2]

for data in main_datasets:
    for x in data:
        if type(x) == 'int':
            x.fillna(x.median(), inplace = True)
        elif type(x) == 'float':
            x.fillna(x.median(), inplace = True)
        else:
            x.replace(to_replace = None, value = 'N/A', inplace = True)    

However, I continue to receive the following exception, despite using characters or numbers for the keyword value and removing the keyword name:

TypeError: replace() takes no keyword arguments

Why is this error raised by "value = 'N/A'"?

Here is an ad hoc sample of some fields (separated by commas):

12123423, 0, M, Y, 120432.5

12654423, 1, F, N, 80432.5

12123423, 0, M, Y, 120432.5

12123423, 0, M, Y, 120432.5

Louis
  • Can you post a small sample of your dataframe with a desired output? See: [mcve] – user3483203 Jun 13 '18 at 17:44
  • The error message you're receiving is because you're passing keyword arguments to the `replace()` method--what a surprise! /s You need to pass positional arguments. Try `replace(None, 'N/A', True)` :) – natn2323 Jun 13 '18 at 17:53

2 Answers

3

You are trying to use the pandas.DataFrame.replace() method, but you are actually calling Python's built-in str.replace() method.

I suspect every branch would fail, but you hit the failure in the else branch (neither int nor float) first. If x is an int, then it is not a DataFrame and wouldn't have the .fillna method. Similarly, strings have Python's replace method, but it has a different function signature.

df.replace() vs string.replace()

str.replace(old, new[, count])

vs

DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad', axis=None)
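To see which replace you are actually calling, here is a minimal sketch. The two-column frame is a made-up stand-in for your data, not your real columns. Iterating over a DataFrame yields its column labels, which are plain strings, so x.replace resolves to str.replace:

```python
import pandas as pd

# Hypothetical stand-in for one of the frames in main_datasets.
df = pd.DataFrame({"id": [12123423, 12654423], "gender": ["M", "F"]})

# Iterating over a DataFrame yields column labels, not columns or cells.
labels = [x for x in df]
label_types = [type(x).__name__ for x in df]

# type(x) returns a class, so comparing it to the string 'int' is never true.
int_check = type(5) == 'int'

# The label is a str, so this is str.replace, which rejects these keywords.
try:
    labels[0].replace(to_replace=None, value="N/A")
    raised = False
except TypeError:
    raised = True

print(labels, label_types, int_check, raised)
```

Because both type checks are always false, every iteration falls through to the else branch, which is where the TypeError comes from.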

I also suspect you meant to check Series.dtype rather than type().
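One way the loop could look when it works on columns and their dtypes is sketched below. This is an assumption about what you want, not a drop-in fix: fill_missing and the sample frame are hypothetical names and data, and treating both empty strings and missing values as 'N/A' is my guess at the intent.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

def fill_missing(df):
    """Fill numeric columns with their median and other columns with 'N/A'."""
    for col in df.columns:
        if is_numeric_dtype(df[col]):
            # df[col] is a Series, so the pandas fillna/median methods exist.
            df[col] = df[col].fillna(df[col].median())
        else:
            # Series.replace accepts the replacement value, unlike the
            # keyword call on a plain str.
            df[col] = df[col].fillna("N/A").replace("", "N/A")
    return df

# Hypothetical sample with one missing number, one empty string, one None.
sample = pd.DataFrame({
    "income": [120432.5, np.nan, 80432.5],
    "gender": ["M", "", None],
})
fill_missing(sample)
print(sample)
```

To apply it to your data you would call fill_missing on each frame in main_datasets.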

Zev
  • Yes, I tried to use dtype but I kept receiving an error using the fields in the dataframe. If I'm understanding correctly, there is another replace() method in the namespace. – Louis Jun 13 '18 at 18:22
  • More similar to different methods on different classes. When you use dot notation, that's accessing a different namespace. – Zev Jun 13 '18 at 20:10
  • It makes sense. I figured the different methods were in completely different modules. However, I still don't understand why x as an int wouldn't be a DataFrame able to access the .fillna() method. Are you saying pd.DataFrame objects don't recognize numerical inputs as 'int' dtypes? – Louis Jun 13 '18 at 20:20
  • A series (or column in a df) of ints is different than an actual int. If you are down to the actual int like `5`, it no longer has the methods from pandas – Zev Jun 13 '18 at 20:24
  • Try `print(x, type(x))` and see if you get a primitive like `int` or `str`. If so, a primitive doesn't have access to the methods of its container. You are probably acting on items when you want to act on columns. – Zev Jun 13 '18 at 20:30
0

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.replace.html

As mentioned in the comment just do

x.replace(None, 'N/A', True)
Dan
  • I'm not sure what that link has to do with your answer. Also, if posting an answer from the comments, generally, you'd want to make it a community wiki. – Zev Jun 13 '18 at 18:07
  • @Zev Yea my bad, not a good post, I would've commented but I can't yet cuz I don't have enough points. I think your answer above is right though. – Dan Jun 13 '18 at 18:17