extra characters on python list

Question

I'm having an issue with:

slots = rows[i].find_elements_by_tag_name('td')
prodFolder = slots[0].text
prodType = slots[2].text
prodId = slots[1].text
values = [prodFolder, prodId, prodType]
print values

When I print values, I get an extra character at the front of each item in the list:
[u'active_e', u'1193', u'Active E']
This is probably the result of .text providing some extra data that I do not want. Is there an elegant way to solve this (that is, without using brute force to strip the extra u's)?

See http://stackoverflow.com/questions/34986329/converting-list-of-strings-with-u-to-a-list-of-normal-strings — PM 2Ring, Jun 06 '16 at 13:11
It's not an extra character. It's part of the representation of a `unicode` string. — Rahul K P, Jun 06 '16 at 13:11
`values = [str(i.text) for i in slots]`, then `print values`. Change your code like this. — Rahul K P, Jun 06 '16 at 13:12
@RahulKP this will raise a `UnicodeEncodeError` if there are any non-ascii characters in there. Use `i.text.encode('utf-8')` instead. — user2390182, Jun 06 '16 at 13:27

user2390182 · Accepted Answer · 2016-06-06T13:21:37.207

The 'u' in u'active_e' just indicates that this is a unicode object, and not a bytestring. You can use encode to convert it:

> u = u'active_e'
> s = u.encode('utf-8')

> u
u'active_e'
> s
'active_e'

# But:
> print(u)
active_e
> print(s)
active_e

> type(u)
<type 'unicode'>
> type(s)
<type 'str'>
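
Since the prefix only shows up in the repr, and printing a whole list shows the repr of each item, one option is to leave the strings as they are and format the output yourself. A minimal sketch, assuming the three values from the question are hard-coded rather than read from Selenium:

```python
# -*- coding: utf-8 -*-

# Hard-coded stand-ins for the slots[n].text values in the question.
values = [u'active_e', u'1193', u'Active E']

# Printing the list shows repr() of each item
# (on Python 2 that includes the u prefix).
print(values)

# Joining the items prints the text itself, with no prefix.
line = u', '.join(values)
print(line)  # active_e, 1193, Active E
```

On Python 3 the list repr has no u prefix at all, because every str is already Unicode.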

But in most contexts, unicode objects are just as fine as bytestrings. For pure ASCII strings, even u == s will be True:

> u == s
True

# careful with non-ascii chars:
> u = u'äöüß'
> s = u.encode('utf-8')
> u == s
False

> len(u)
4
> len(s)
8  # ä,ö,ü,ß have two-byte representations
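
The comments on the question mention that `str()` can raise a `UnicodeEncodeError`. On Python 2 that happens because `str()` on a unicode object encodes with the ASCII codec by default. The sketch below reproduces the same failure with an explicit `encode('ascii')` call, so it behaves the same on Python 2 and 3, and then shows the UTF-8 alternative. The variable names are illustrative only:

```python
# -*- coding: utf-8 -*-

text = u'äöüß'

# Encoding to ASCII fails for characters outside the ASCII range.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent these characters, using two bytes for each one here.
encoded = text.encode('utf-8')

print(ascii_ok)       # False
print(len(text))      # 4
print(len(encoded))   # 8
```

Decoding with `encoded.decode('utf-8')` gives back the original text, so the conversion loses nothing.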
