In short, I would like to know why I am seeing a u in front of my keys and values.

I am rendering a form. It has a checkbox for each label and one text field for the IP address. I am creating a dictionary whose keys are the labels, hardcoded in list_key, and whose values are taken from the form input (list_value). The dictionary is created, but some of its keys and values are preceded by u. Here is sample output for the dictionary:

{u'1': {'broadcast': u'on', 'arp': '', 'webserver': '', 'ipaddr': u'', 'dns': ''}}

Can someone please explain what I am doing wrong? I do not get this when I simulate a similar method in PyScripter. Any suggestions to improve the code are welcome. Thank you.

#!/usr/bin/env python

import webapp2
import itertools
import cgi

form ="""
    <form method="post">
    FIREWALL 
    <br><br>
    <select name="profiles">
        <option value="1">profile 1</option>
        <option value="2">profile 2</option>
        <option value="3">profile 3</option>
    </select>
    <br><br>
    Check the box to implement the particular policy
    <br><br>

    <label> Allow Broadcast
        <input type="checkbox" name="broadcast">
    </label>
    <br><br>

    <label> Allow ARP
        <input type="checkbox" name="arp">
    </label><br><br>

    <label> Allow Web traffic from external address to internal webserver
        <input type="checkbox" name="webserver">
    </label><br><br>

    <label> Allow DNS
        <input type="checkbox" name="dns">
    </label><br><br>

    <label> Block particular Internet Protocol  address
        <input type="text" name="ipaddr">
    </label><br><br>

    <input type="submit">   
    </form>
"""
dictionarymain={}

class MainHandler(webapp2.RequestHandler):  
    def get(self):
        self.response.out.write(form)

    def post(self):
        # get the parameters from the form 
        profile = self.request.get('profiles')

        broadcast = self.request.get('broadcast')
        arp = self.request.get('arp')
        webserver = self.request.get('webserver')
        dns =self.request.get('dns')
        ipaddr = self.request.get('ipaddr')


        # Create a dictionary for the above parameters
        list_value =[ broadcast , arp , webserver , dns, ipaddr ]
        list_key =['broadcast' , 'arp' , 'webserver' , 'dns' , 'ipaddr' ]

        #self.response.headers['Content-Type'] ='text/plain'
        #self.response.out.write(profile)

        # map the two lists to a dictionary using zip
        adict = dict(zip(list_key,list_value))
        self.response.headers['Content-Type'] ='text/plain'
        self.response.out.write(adict)

        if profile not in dictionarymain:
            dictionarymain[profile]= {}
        dictionarymain[profile]= adict

        #self.response.headers['Content-Type'] ='text/plain'
        #self.response.out.write(dictionarymain)

        def escape_html(s):
            return cgi.escape(s, quote =True)



app = webapp2.WSGIApplication([('/', MainHandler)],
                              debug=True)
jdi
user1488987
  • Is your actual question "Why am I seeing a `u` in front of my keys and values"? – jdi Jul 01 '12 at 03:30
  • And you don't show anywhere that you are getting an error in the first place. – jdi Jul 01 '12 at 03:43
  • That's because they're unicode strings: http://stackoverflow.com/questions/599625/python-string-prints-as-ustring – user Jul 01 '12 at 03:43

2 Answers

The 'u' in front of the string values means the string is a Unicode string. Unicode is a way to represent more characters than normal ASCII can manage. The fact that you're seeing the u means you're on Python 2 - strings are Unicode by default on Python 3, but on Python 2, the u in front distinguishes Unicode strings. The rest of this answer will focus on Python 2.
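
To tie this back to the dictionary in the question, here is a minimal sketch (the dictionary contents and variable names are illustrative, not the asker's actual data). It assumes the point is only that the u belongs to the repr of a value, not to the value itself: printing a single value, or serializing with the standard json module, shows the text without the prefix. The snippet runs on Python 2 and Python 3; on Python 3 the repr carries no u in the first place.

```python
import json

# Values shaped like the form data in the question (illustrative only).
adict = {'broadcast': u'on', 'arp': u'', 'ipaddr': u''}

# The prefix is part of repr(); the string itself is just the two characters.
value = adict['broadcast']
print(value)
print(len(value))

# json.dumps renders the dict without any u prefix on either Python version.
as_json = json.dumps(adict, sort_keys=True)
print(as_json)
```

If the goal is simply readable output in the response, writing the JSON string instead of the dict itself is one option; the data stored in the dict is the same either way.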

You can create a Unicode string multiple ways:

>>> u'foo'
u'foo'
>>> unicode('foo') # Python 2 only
u'foo'

But the real reason Unicode strings exist is to represent something like this (Russian, roughly "Check out the documentation"):

>>> val = u'Ознакомьтесь с документацией'
>>> val
u'\u041e\u0437\u043d\u0430\u043a\u043e\u043c\u044c\u0442\u0435\u0441\u044c \u0441 \u0434\u043e\u043a\u0443\u043c\u0435\u043d\u0442\u0430\u0446\u0438\u0435\u0439'
>>> print val
Ознакомьтесь с документацией

For the most part, Unicode and non-Unicode strings are interoperable on Python 2.

There are other symbols you will see, such as the "raw" symbol r for telling a string not to interpret backslashes. This is extremely useful for writing regular expressions.

>>> 'foo\"'
'foo"'
>>> r'foo\"'
'foo\\"'
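
As a rough illustration of why raw strings help with regular expressions, the sketch below writes the same pattern twice, once with doubled backslashes and once with the r prefix. The pattern and sample text are made up for the example.

```python
import re

# Without the r prefix, each backslash meant for the regex engine must be doubled.
plain = '\\d+\\.\\d+'
raw = r'\d+\.\d+'

# Both spellings produce the same string object contents.
same = (plain == raw)

# Use the pattern to pull a dotted number out of some text.
match = re.search(raw, 'version 3.11 released')
found = match.group(0)
print(same, found)
```

The raw form is only a different way of typing the literal; the resulting string is an ordinary string.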

Unicode and non-Unicode strings can be equal on Python 2:

>>> bird1 = unicode('unladen swallow')
>>> bird2 = 'unladen swallow'
>>> bird1 == bird2
True

but not on Python 3:

>>> x = u'asdf' # Python 3
>>> y = b'asdf' # b indicates bytestring
>>> x == y
False
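
On Python 3, the usual way to compare text with bytes is to convert one side explicitly. A minimal sketch, assuming the bytes are ASCII-encoded text:

```python
# Python 3: text (str) and bytes do not compare equal, even with matching content.
text = 'asdf'
data = b'asdf'
mismatch = (text == data)
print(mismatch)

# Decoding the bytes, or encoding the text, makes the comparison meaningful.
decoded_equal = (text == data.decode('ascii'))
encoded_equal = (text.encode('ascii') == data)
print(decoded_equal, encoded_equal)
```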
user2357112
jdi
  • Thank you ..just to make it clear, what i understand that i will not get error operating on dictionary with string represented as unicode. – user1488987 Jul 01 '12 at 05:23
  • @user1488987: Correct. You can have unicode in your dict – jdi Jul 01 '12 at 05:30
  • is raw string the same thing as byte string? – Omar Khazamov Feb 23 '23 at 16:20
  • @OmarKhazamov no a raw string means python will not interpret special characters. You don't have to escape "\n" as "\\n" for example. It's better to use a raw string when writing a regular expression, for instance. Bytes are entirely different. – jdi Feb 24 '23 at 18:11
  • @jdi what's the use case of non-unicode strings? (i.e why someone would want to use them if unicode are available in both python2 and python3) – Omar Khazamov Feb 24 '23 at 20:58
  • @OmarKhazamov bytes are not encoded for visual representation. There are many binary formats that are meant to be read and written and transferred around. But they do not have a representation that can be usefully encoded into unicode – jdi Feb 25 '23 at 23:34
This is a feature, not a bug.

See http://docs.python.org/howto/unicode.html, specifically the 'unicode type' section.

Sean W