0

I am using a Scrapy spider and my own item pipeline:

    value['Title'] = item['Title'][0] if ('Title' in item) else ''
    value['Name'] = item['Name'][0] if ('Name' in item) else ''
    value['Description'] = item['Description'][0] if ('Description' in item) else ''

When I do this, the values I get are prefixed with u.

Example: when I pass the value to the output and print it, I see:

value['Title'] = u'hospital'

What is wrong with my code, why am I getting the u, and how do I remove it?

Can anyone help me?

Thanks,

backtrack

2 Answers

2

The u means that the string is represented as unicode. You can remove the u by passing the string to str, as in str(u'test'), but you can treat it as a normal string for most purposes. For example:

>>> u'test' == 'test'
True

If you have characters that cannot be represented in plain ASCII, you should keep the unicode string. Calling str on non-ASCII characters raises an exception:

>>> test=u'বাংলা'
>>> test
u'\u09ac\u09be\u0982\u09b2\u09be'
>>> str(test)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)
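
If the text does contain non-ASCII characters, choosing an encoding explicitly avoids the error. A minimal sketch, assuming UTF-8 is acceptable for your output (the variable names are only illustrative):

```python
# Encode explicitly instead of relying on str(), which falls back to ASCII in Python 2.
text = u'\u09ac\u09be\u0982\u09b2\u09be'  # the same string as above, written with escapes

encoded = text.encode('utf-8')     # byte string, suitable for writing to a file
decoded = encoded.decode('utf-8')  # back to the original unicode text

print(len(text))        # 5 characters
print(len(encoded))     # 15 bytes, since each of these characters takes 3 bytes in UTF-8
print(decoded == text)  # True, the round trip loses nothing
```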

The u is not part of the string; it is just a way to indicate the type of the string.

>>> type('test')
<type 'str'>
>>> type(u'test')
<type 'unicode'>
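
One way to convince yourself of this is to look at the string's length and contents rather than at how the interpreter displays it. A small sketch (title is just an illustrative name); it should behave the same in Python 3, where all strings are unicode and the prefix is no longer shown:

```python
title = u'hospital'

print(title)                # hospital (printing shows the text, not the prefix)
print(len(title))           # 8, the u is not counted as a character
print(title[0])             # h, the first character is not u
print(title == 'hospital')  # True
```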

See the following question for more details:

What does the 'u' symbol mean in front of string values?

toftis
  • Will str(u'test') remove all the unicode characters or only the "u"? I ask because I may get Unicode from web pages. – backtrack Mar 06 '15 at 13:50
  • See my edit. The u is not part of the actual string. It is just a way to specify what type the string has. – toftis Mar 06 '15 at 14:08
1

To remove the u prefix, you can encode the string as ASCII like this: value['Title'].encode("ascii").
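
Be aware that a plain encode("ascii") raises UnicodeEncodeError when the text contains non-ASCII characters, which scraped pages often do. A hedged sketch of the error-handling modes that encode accepts (the sample value is made up for illustration):

```python
title = u'caf\xe9 hospital'  # hypothetical scraped value with one non-ASCII character

try:
    title.encode('ascii')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False  # the default 'strict' mode rejects the non-ASCII character

dropped = title.encode('ascii', 'ignore')    # silently drops characters that do not fit
replaced = title.encode('ascii', 'replace')  # substitutes '?' for them

print(strict_ok)
print(dropped)
print(replaced)
```

Both 'ignore' and 'replace' lose information, so they only make sense when the output really must be ASCII.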

ForceBru