How to drop rows from a pandas dataframe where any column contains a symbol I don't want

Question

I have a CSV file encoded in ANSI which I'm processing with Python pandas on a non-ANSI machine. The resulting dataframe ('df1') has some garbage in it.

Expirydate      food     color
20150713        banana   yellow
20150714        steak    brown
???             ???(g?0) ???

I am trying to remove the 'garbage' line using this:

df1[df1.Expirydate.str.contains("?")==False]

but am getting this error:

sre_constants.error: nothing to repeat

Can anybody help? It would be most appreciated!

a non-ansi machine? Python can read ansi, just load the csv data with `pandas.read_csv('filename', encoding='ansi')` or use python3, which solves all encoding problems automagically — firelynx, Jul 14 '15 at 08:05
I just tried that but got the following error `unknown encoding: ansi`. Also read [here](http://stackoverflow.com/questions/22279413/python-convert-encodinglookuperror-unknown-encoding-ansi) that there is no ansi encoding in standard encodings. :-( — qts, Jul 14 '15 at 08:16
You're right. But according to http://stackoverflow.com/questions/700187/unicode-utf-ascii-ansi-format-differences you are probably looking for an existing encoding, as ansi can mean many things, but your data is definitely saved with a certain encoding. The link suggests to try 'cp1252' — firelynx, Jul 14 '15 at 09:18
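A minimal sketch of the suggestion in the comments above, assuming the file really is cp1252 (that is a guess; "ANSI" on Windows can mean a different code page, and Python has no codec literally named `ansi`). The in-memory bytes and the sample row are hypothetical stand-ins for the real file:

```python
import io
import pandas as pd

# Hypothetical bytes standing in for the CSV on disk, encoded as cp1252.
raw = "Expirydate,food,color\n20150713,café,yellow\n".encode("cp1252")

# Passing the matching encoding lets pandas decode non-ASCII text correctly.
# dtype=str keeps Expirydate as text so the .str accessor works on it later.
df = pd.read_csv(io.BytesIO(raw), encoding="cp1252", dtype=str)
```

If the decoded text still looks wrong, the file was probably saved in another code page and a different `encoding` value would be needed.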

score 2 · Accepted Answer · answered Jul 14 '15 at 07:59

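By default, str.contains treats its pattern as a regular expression, and a bare ? is a quantifier with nothing before it to repeat, hence the error. To match a literal ? in the content, escape it (a raw string keeps the backslash intact):

df1[~df1.Expirydate.str.contains(r'\?')]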
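As an alternative sketch, `str.contains` also accepts `regex=False`, which does a plain substring test and avoids escaping altogether. The example below extends the check to every column, as the title asks. The sample frame is rebuilt from the question's table, and it assumes the garbage really consists of literal `?` characters, which may not hold for the original file:

```python
import pandas as pd

# Sample frame modeled on the question; the last row stands in for the garbage line.
df1 = pd.DataFrame({
    "Expirydate": ["20150713", "20150714", "???"],
    "food": ["banana", "steak", "???(g?0)"],
    "color": ["yellow", "brown", "???"],
})

# regex=False makes str.contains do a literal substring test, so "?" needs no escaping.
# astype(str) lets the .str accessor work even on non-string columns.
has_qmark = df1.apply(
    lambda col: col.astype(str).str.contains("?", regex=False)
)

# Keep only the rows where no column contains "?".
clean = df1[~has_qmark.any(axis=1)]
```

Using `~mask` instead of `mask == False` is the usual pandas idiom for negating a boolean mask.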

YS-L

upvoted this answer :-) however it seems that it is still not able to pick up the 'garbage' line. I'm not sure whether it's because it can't handle regular expressions, even though escaped. Eventually I got around it by doing this `df1[~df1.Expirydate.str.contains('2015')]`. – qts Jul 14 '15 at 09:07
