
I am trying to load a bunch of CSVs into a database and would like to get rid of any rows in these tables that contain the value "-". I'm trying to do the same thing as in the following link, but using an iterable instead of a predetermined column, since I don't know in advance which tables and columns will have these values:

Deleting DataFrame row in Pandas based on column value

My code:

dfs = {}

for doc in fList:
    i = "{}\\{}".format(path, doc)

    df = pd.read_csv(i)

    for col in df.columns:
        df = df[df.col != "-"]

This returns the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-291-43edac7a4ed7> in <module>()
      8     #print dfs
      9     for col in df:
---> 10         df = df[df.col != "-"]

C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   2968             if name in self._info_axis:
   2969                 return self[name]
-> 2970             return object.__getattribute__(self, name)
   2971 
   2972     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'col'

It seems that I cannot use the iterable in the loop. It would defeat the purpose of writing the script if I had to open each file and change the values by hand. Is there any way to loop through the tables and delete rows with the bad values?

geoJshaun

1 Answer


You cannot dynamically access a column of df through a variable with attribute syntax; that raises an AttributeError, because the `.` looks for an attribute of df literally named col, not an attribute named by the value stored in col. There's a difference.

If you wanted to, you'd need the `__getitem__` accessor: `df[col]` (a minimal fix along those lines is sketched below). However, you should avoid loopy solutions where you can. Here are a couple of alternatives.
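For completeness, here is your loop with bracket indexing instead of attribute access (a sketch, assuming fList and path are defined as in your snippet):

import pandas as pd

dfs = {}
for doc in fList:
    df = pd.read_csv("{}\\{}".format(path, doc))
    for col in df.columns:
        df = df[df[col] != "-"]   # bracket indexing, not df.col
    dfs[doc] = df

That said, the filtering itself doesn't need a column loop at all.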

Option 1
For your case, eq + any should suffice: flag the rows in which any value equals '-', then negate to keep the rest.

df = df[~df.astype(str).eq('-').any(1)]                # `astype` conversion

Or,

df = df[~df.select_dtypes(['object']).eq('-').any(1)]  # `select_dtypes`, thanks MaxU!
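As a quick illustration on a toy frame (hypothetical data):

import pandas as pd

df = pd.DataFrame({'a': ['1', '-', '3'], 'b': ['x', 'y', '-']})
df = df[~df.astype(str).eq('-').any(1)]
print(df)   # only the first row survives; the other two each contain '-'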

Option 2
Another option is to pass the na_values argument to read_csv, so that these values are converted to NaN as your data is read in; you can then drop them.

df = pd.read_csv('file.csv', na_values=['-'])

And now, call dropna on your data -

df.dropna(inplace=True)
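Plugged into your loop, that looks something like this (a sketch under the same fList/path assumptions as above):

dfs = {}
for doc in fList:
    df = pd.read_csv("{}\\{}".format(path, doc), na_values=['-'])
    df.dropna(inplace=True)   # drops every row that contained '-' (or any other missing value)
    dfs[doc] = df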
cs95
  • @COLDSPEED tried your solution and got this: TypeError: Could not compare ['-'] with block values – geoJshaun Jan 15 '18 at 22:07
  • @ShaunO `df[~df.astype(str).eq('-').any(1)]` it seems like you don't have all string columns. – cs95 Jan 15 '18 at 22:09
  • @COLDSPEED that did it! Weird, I was sure they all were strings. Thank you so much! – geoJshaun Jan 15 '18 at 22:12
  • @ShaunO No problem, feel free to vote on, and accept the answer if it was helpful. I'd appreciate it ;-) – cs95 Jan 15 '18 at 22:13
  • @COLDSPEED Done! – geoJshaun Jan 15 '18 at 22:15
  • a little improvement: `df = df[~df.select_dtypes(['object']).eq('-').any(1)]` – MaxU - stand with Ukraine Jan 15 '18 at 22:21
  • @COLDSPEED Sorry. I'm actually getting some strange results with this. I want to delete rows with "-" in an effort to clean the data. All the columns need to be numeric to perform stats on them, and the "-" (the census data placeholder for NaN) is throwing it all off. When I run the .eq solution it changed one entire column to "-" and then randomly changed other column values in another table to "-". – geoJshaun Jan 15 '18 at 23:13
  • @ShaunO Wow, you should've said so. In that case, when calling `read_csv`, add a `na_values=['-']` argument, so those values are made null. Then, just call `dropna`. – cs95 Jan 15 '18 at 23:14
  • @COLDSPEED yes, a thousand apologies – geoJshaun Jan 16 '18 at 00:14