
I am new to Python and I am getting some weird output in my Python console when displaying results on the screen.

>>> macbeth_sentsence = gutenberg.sents('shakespeare-macbeth.txt');
>>> macbeth_sentsence
[[u'[', u'The', u'Tragedie', u'of', u'Macbeth', u'by', u'William', u'Shakespeare', u'1603', u']'], [u'Actus', u'Primus', u'.'], ...]

I am not expecting the extra `u` characters in my output.

Does anyone know how to suppress them? Does this have anything to do with default Python settings?


UPDATE (from below): for people who didn't understand what I want to resolve.

When the same command was executed on my Windows system in VBox, I got something like this:

>>> macbeth_sentsence = gutenberg.sents('shakespeare-macbeth.txt');
>>> macbeth_sentsence
[['[', 'The', 'Tragedie', 'of', 'Macbeth', 'by', 'William', 'Shakespeare', '1603', ']'], ['Actus', 'Primus', '.'], ...]

I want the same result on my Mac. What adjustment do I have to make to my defaults to get the result I got on Windows?

PS: The answers to "removing `u` character in python output", "Python ascii utf unicode", and "Python string prints as [u'String']" are not what I am looking for.

kishoredbn

2 Answers


It just indicates a Unicode string. See "Easiest way to remove unicode representations from a string in python 3?"
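
To illustrate the point, here is a minimal sketch showing that the `u` belongs to the console's `repr()` echo and not to the data itself. The `sentence` list is a hypothetical stand-in for one sentence from `gutenberg.sents()`, so the snippet runs without NLTK. The prefix only shows up on Python 2; on Python 3 the same echo has no prefix, which may be why the two machines differ, though the thread does not confirm which versions were installed.

```python
import sys

# Hypothetical sample standing in for one sentence from gutenberg.sents()
sentence = [u'Actus', u'Primus', u'.']

# The interactive console echoes values using repr(); on Python 2 the
# repr of a unicode string carries the u prefix, on Python 3 it does not.
echoed = repr(sentence)
has_prefix = echoed.startswith("[u'")

# Joining (or printing) the strings themselves never shows the prefix.
joined = u' '.join(sentence)
print(joined)

# Report the interpreter's major version next to what the echo looks like.
print(sys.version_info[0], has_prefix)
```

Running this on both machines is a quick way to check whether they use the same major Python version.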

Gadi

It's a Unicode string. You can remove the prefix by converting the strings to ASCII:

macbeth_sentsence = [[i.encode() for i in j] for j in macbeth_sentsence]
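
The same idea can be wrapped in a small helper so the conversion does not have to be retyped. This is only a sketch: `strip_unicode_prefix` is a hypothetical name, not part of NLTK, and the ASCII default is an assumption that will raise `UnicodeEncodeError` on Python 2 if a word contains non-ASCII characters. On Python 3 the helper skips encoding entirely, because encoding there would produce `bytes` objects that display with a `b` prefix instead.

```python
import sys

def strip_unicode_prefix(sents, encoding='ascii'):
    """Return a plain nested list whose console echo shows no u prefix.

    Hypothetical helper for illustration; `sents` is any nested
    sequence of strings, such as the result of gutenberg.sents().
    """
    if sys.version_info[0] >= 3:
        # Python 3 strings already echo without a prefix, so just
        # materialize the nested sequence into ordinary lists.
        return [list(sent) for sent in sents]
    # Python 2: encode each unicode string to a byte string.
    return [[word.encode(encoding) for word in sent] for sent in sents]

# Small stand-in for the corpus data
sample = [[u'Actus', u'Primus', u'.']]
print(strip_unicode_prefix(sample))
```

Note that this produces a converted copy; it does not change what `gutenberg.sents()` returns on later calls.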
Malik Brahimi
  • Revoked the downvote, but that didn't answer my question: how do I make this change permanent? – kishoredbn Feb 22 '15 at 20:41
  • It is permanent. Try printing `new` and you'll see that there is no preceding u. – Malik Brahimi Feb 22 '15 at 20:42
  • I think the OP means how to do it in the original list – Padraic Cunningham Feb 22 '15 at 20:43
  • @MalikBrahimi By permanent I meant that the next time I write something like {>>> macbeth_sentsence = gutenberg.sents('shakespeare-macbeth.txt'); >>> macbeth_sentsence}, I shouldn't have to encode macbeth_sentsence; it should come without the `u` by default. – kishoredbn Feb 22 '15 at 20:45
  • Is this good now ikis? I used a list comprehension to iterate through the entire nested list and encoded every string to ASCII. – Malik Brahimi Feb 22 '15 at 20:55