I'm trying to filter the columns of a DataFrame with a Unicode regex. The code needs to be compatible with both Python 2 and Python 3.
df.filter(regex=u'证券代码')
This raises an error in Python 2:
  File "D:\Applications\Anaconda2\lib\site-packages\pandas\core\generic.py", line 2469, in filter
    axis=axis_name)
  File "D:\Applications\Anaconda2\lib\site-packages\pandas\core\generic.py", line 1838, in select
    np.asarray([bool(crit(label)) for label in axis_values])]
  File "D:\Applications\Anaconda2\lib\site-packages\pandas\core\generic.py", line 2468, in <lambda>
    return self.select(lambda x: matcher.search(str(x)) is not None,
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
To narrow it down, I wrote a unit test:
import unittest

class StrTest(unittest.TestCase):
    def test_str(self):
        # Raises the same UnicodeEncodeError under Python 2.
        str(u'证券代码')
It reports the same error.
What causes this error, and how can I filter a DataFrame with a Unicode regex?
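One workaround I'm considering is to skip df.filter and match the column labels myself, so that str() is never called on them. This is only a minimal sketch, not a confirmed fix: the sample frame, its column names, and the names text_type, pattern, and mask are made up for illustration, and it assumes every label can be converted to text without a decode error.

```python
# -*- coding: utf-8 -*-
import re

import pandas as pd

# Hypothetical sample data; the column names are for illustration only.
df = pd.DataFrame([[1, 2, 3]], columns=[u'证券代码', u'证券简称', u'price'])

# unicode on Python 2, str on Python 3.
text_type = type(u'')

pattern = re.compile(u'证券代码', re.UNICODE)

# Search each label as text instead of passing it through str(),
# which is the call that fails in the traceback under Python 2.
mask = [pattern.search(text_type(label)) is not None for label in df.columns]

# Select the matching columns with a boolean mask.
filtered = df.loc[:, mask]
print(list(filtered.columns))
```

This keeps the regex semantics of filter(regex=...), which also uses search rather than a full match, but I don't know whether it is the recommended approach.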