There are really only two types of string (or string-like object) in Python.
The first is 'Unicode' strings, which are a sequence of characters.
The second is bytes (or 'bytestrings'), which are a sequence of bytes.
A Unicode string is a series of characters (code points) from the Unicode standard: letters, digits, punctuation, symbols and so on.
A bytestring is a series of integers between 0 and 255 that are usually rendered as text using some assumed encoding such as ASCII or UTF-8 (which is a specification for encoding Unicode characters in a byte stream).
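A minimal sketch of the difference in Python 3, assuming UTF-8 as the encoding (the variable names are only for illustration):

```python
# A str is a sequence of Unicode characters (code points).
text = "na\u00efve"  # the word "naive" spelled with U+00EF (i with diaeresis)

# Encoding turns the characters into bytes; UTF-8 is an assumption here.
data = text.encode("utf-8")

num_chars = len(text)   # counts characters
num_bytes = len(data)   # counts bytes; U+00EF takes two bytes in UTF-8
first_byte = data[0]    # indexing bytes gives an integer between 0 and 255

# Decoding with the same encoding restores the original text.
round_trip = data.decode("utf-8")
```

The character count and the byte count differ as soon as the text leaves the ASCII range, which is why the encoding has to be known to turn bytes back into text.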
In Python 2, the default "my string" is a bytestring.
The prefix 'u' indicates a 'Unicode' string, e.g. u"my string".
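In Python 3, 'Unicode' strings became the default, and thus "my string" is equivalent to u"my string" (the 'u' prefix was not accepted in Python 3.0 to 3.2; it was restored in Python 3.3).
To get the old Python 2 bytestrings, you use the prefix 'b', e.g. b"my string". This prefix works in every version of Python 3, and is also accepted in Python 2.6 and 2.7.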
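As a quick check under Python 3.3 or later, the three spellings can be compared directly (the variable names are illustrative only):

```python
plain = "my string"
prefixed = u"my string"    # the 'u' prefix is accepted but changes nothing
raw_bytes = b"my string"   # the 'b' prefix produces a bytes object

# Both unprefixed and u-prefixed literals are str, and they compare equal.
same_type = type(plain) is type(prefixed) is str
are_equal = plain == prefixed

# The b-prefixed literal is a different type; decoding it gives the same text.
bytes_type = type(raw_bytes) is bytes
decoded_matches = raw_bytes.decode("ascii") == plain
```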
There are two further prefixes, but they do not affect the type of string object, just the way the literal is interpreted.
The first is 'raw' strings, which do not interpret escape sequences such as \n or \t. For example, the raw string r"my_string\n" contains a literal backslash followed by the character 'n', while "my_string\n" ends with a newline character.
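The effect of the 'r' prefix can be seen by comparing lengths; this is only a small sketch with illustrative variable names:

```python
cooked = "my_string\n"   # \n is interpreted as a single newline character
raw = r"my_string\n"     # backslash and 'n' are kept as two characters

cooked_len = len(cooked)
raw_len = len(raw)

# The raw literal is the same as escaping the backslash by hand.
raw_equals_escaped = raw == "my_string\\n"

# Both are ordinary str objects; the prefix only affects how the literal is read.
same_type = type(cooked) is type(raw)
```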
The second was introduced in Python 3.6: formatted strings ('f-strings') with the prefix 'f'. In these, curly braces enclose expressions to be evaluated. For example, the string in:
my_object = 'avocado'
f"my {0.5 + 1.0, my_object} string"
will be evaluated to "my (1.5, 'avocado') string" (where the comma created a tuple, which is why 'avocado' appears with its quotes). The expressions are evaluated as soon as the line runs; the result is an ordinary string, and there is nothing special about it afterwards.
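A short sketch, assuming Python 3.6 or later, showing that an f-string is evaluated once and then behaves like any other string (the names `result` and `single` are illustrative):

```python
my_object = 'avocado'

# The comma inside the braces builds a tuple, so the tuple's text form is inserted.
result = f"my {0.5 + 1.0, my_object} string"

# A single expression inserts just that value's text form.
single = f"my {my_object} string"

# Rebinding the variable afterwards does not change the strings already built.
my_object = 'banana'
result_type = type(result)
```

After this runs, `result` and `single` still contain 'avocado', because the braces were evaluated when each literal was executed.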
And finally, you can use the multiline string notation:
"""this is my
multiline
string"""
with any of the prefixes above ('u', 'b', 'r' or 'f') as you wish.
In Python 2, if you have used no prefix or only an 'r' prefix, it is a bytestring, and if you have used a 'u' prefix it is a Unicode string.
In Python 3, if you have used no prefix, or 'u', 'r', 'f', or 'r' together with 'f', it is a Unicode string. If you have used a 'b' prefix (alone or together with 'r') it is a bytestring. 'b' cannot be combined with 'u' or 'f', and Python 3 does not accept 'u' combined with any other prefix.
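The Python 3 rules can be checked with a small sketch like the following, assuming Python 3.6 or later. The helper `is_valid_literal` is a hypothetical name introduced here; it simply asks the built-in `compile` whether a piece of source text is accepted.

```python
# Each valid prefix maps to the type of object the literal produces.
literals = {
    "none": "a\tb",
    "u": u"a\tb",
    "r": r"a\tb",
    "f": f"a\tb",
    "rf": rf"a\tb",
    "b": b"a\tb",
    "rb": rb"a\tb",
}
type_names = {prefix: type(value).__name__ for prefix, value in literals.items()}


def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression (hypothetical helper)."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# Combinations that Python 3 rejects at compile time.
rejected = [src for src in ('ub"x"', 'bf"x"', 'ur"x"') if not is_valid_literal(src)]
```

Here `type_names` reports `str` for the first five entries and `bytes` for the last two, and all three sources in `rejected` fail to compile.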