
I have a list with a single string that contains non-ascii characters. My goal is to get rid of the non-ascii characters and convert the list to a string.

Every time I try to strip out the non-ascii characters, I get this error: 'list' object has no attribute 'read'

I've tried most of these and I still get this error every time. I'm not sure what I am doing wrong, any help would be appreciated.

imns
    show some code, we aren't psychic. – Winston Ewert Nov 30 '10 at 01:56
    not even code really. Just your input and expected output would suffice. I can tell you right now though that you're passing something a list when you should be passing it a file. Unless you're actually trying to call `read` on a list... – aaronasterling Nov 30 '10 at 01:59
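To illustrate the diagnosis in the comment above, here is a minimal sketch of how this error can arise. The question does not show the failing code, so `read_all` is a hypothetical stand-in for whatever function expects a file-like object, and the sample string is made up.

```python
import io


def read_all(source):
    """Hypothetical helper that expects a file-like object."""
    return source.read()


thelist = ["caf\u00e9"]  # assumed input: a list holding one non-ASCII string

try:
    read_all(thelist)  # a list has no .read() method
except AttributeError as exc:
    message = str(exc)
    print(message)

# Passing a file-like object (or simply the string itself) avoids the error.
text = read_all(io.StringIO(thelist[0]))
print(text)
```

If this is what is happening, the fix is to pass `thelist[0]` (the string) to string functions rather than handing the list to something that reads from a file.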

3 Answers


Python 3:

thelist[0].encode('ascii','ignore').decode()

This works for Python 2.x:

import string
filter(lambda c:c in string.printable, thelist[0])
Kabie
  • Why do you mark your first bit as Python 3? It works fine in Python 2.6. – Chris Morgan Nov 30 '10 at 03:04
  • @kabie: `printable` is a **subset** of `ascii`; you are throwing away more data than the OP intends. – John Machin Nov 30 '10 at 03:23
  • @Chris Morgan: It throws an error on my Python 2.6. @John Machin: You are right, it should be something like `filter(curses.ascii.isascii, thelist[0])`. – Kabie Nov 30 '10 at 22:08
  • @Kabie: `u'f\x81oo\xf1bar'.encode('ascii','ignore').decode() == u'foobar'` – Chris Morgan Nov 30 '10 at 22:20
  • @Chris Morgan: Because you are using a unicode string, which is the default string in py3. 'f\x81oo\xf1bar'.encode('ascii','ignore').decode() will not work. – Kabie Nov 30 '10 at 22:58
  • @Kabie: very well then, if you're assuming you're starting with a `str`, `str(unicode(thelist[0], 'ascii', 'ignore'))` will do. – Chris Morgan Nov 30 '10 at 23:19
  • @Chris Morgan: which is not working on py3. But good enough for py2.x. – Kabie Nov 30 '10 at 23:47
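For Python 3 readers, the following sketch compares the approaches discussed in this answer and its comments. The sample string is invented for illustration; the point is that filtering on `string.printable` also drops ASCII control characters, while the encode/decode round trip and a code-point test keep everything below 128.

```python
import string

# Assumed sample input: one string with accented letters, a tab and a bell character.
thelist = ["caf\u00e9\tna\u00efve \x07bell"]
text = thelist[0]

# Encode to ASCII, silently dropping anything that cannot be encoded, then decode back to str.
ascii_only = text.encode("ascii", "ignore").decode("ascii")

# Equivalent filter on code points: keep characters in the ASCII range 0..127.
by_code_point = "".join(c for c in text if ord(c) < 128)

# string.printable is a subset of ASCII, so this also removes control characters such as \x07.
printable_only = "".join(c for c in text if c in string.printable)

print(repr(ascii_only))
print(repr(by_code_point))
print(repr(printable_only))
```

Which variant is appropriate depends on whether control characters should survive; the question only asks for non-ASCII characters to be removed.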
result = ''.join([s.encode('ascii','ignore') for s in mylist])
Hugh Bothwell
  • This is converting `unicode` to `str` (py2) or `str` to `bytes` (py3) ... probably not what the OP expects. – John Machin Nov 30 '10 at 03:27
  • @Hugh, your answer inspired this: `u''.join(mylist).encode('ascii', 'ignore')`. Not addressing @John here, just thinking about the idea of `u''.join()`. – kevpie Nov 30 '10 at 20:05
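As a rough Python 3 sketch of the join-then-encode idea from the comments: `mylist` below holds made-up sample data, and the final `decode` is an addition to get a `str` back, since `encode` alone returns `bytes` in Python 3.

```python
# Assumed sample input: a list of strings containing non-ASCII characters.
mylist = ["na\u00efve ", "caf\u00e9"]

# Join first, so the list becomes a single str.
joined = "".join(mylist)

# encode() alone yields bytes in Python 3.
as_bytes = joined.encode("ascii", "ignore")

# Decoding turns the ASCII-only bytes back into a str.
result = as_bytes.decode("ascii")

print(type(as_bytes).__name__, type(result).__name__)
print(result)
```

Joining before encoding means the conversion runs once on the whole string instead of once per list element.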

For this you need to activate the virtualenv.

That way it worked!

Shadeer