82

I was playing with Python and realized we don't need the '+' operator to concatenate static strings. But it fails when the strings are stored in variables.

For example:

string1 = 'Hello'   'World'  #1 works fine
string2 = 'Hello' + 'World'  #2 also works fine

string3 = 'Hello'
string4 = 'World'
string5 = string3   string4  #3 causes syntax error
string6 = string3 + string4  #4 works fine

Now I have two questions:

  1. Why does statement 3 not work while statement 1 does?
  2. Is there any technical difference such as calculation speed etc. between statement 1 and 2?
ibrahim
  • 4
    This is similar to C/C++, where `"hello " "world"` is automatically concatenated – phuclv Jun 04 '14 at 04:48
  • 1
    There was a proposal to remove this behavior (PEP 3126), but it was rejected, so the behavior is by design: http://legacy.python.org/dev/peps/pep-3126/ – Barmar Jun 02 '17 at 01:34
  • 2
    Pylint has a warning for some cases where this language feature is error-prone now: `implicit-str-concat-in-sequence`. Available since Pylint 2.2: http://pylint.pycqa.org/en/stable/whatsnew/2.2.html – Duijf Apr 23 '19 at 20:03

6 Answers

72

From the docs:

Multiple adjacent string literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same as their concatenation. Thus, "hello" 'world' is equivalent to "helloworld".


Statement 3 doesn't work because:

The ‘+’ operator must be used to concatenate string expressions at run time.

Notice that the title of the subheader in the docs is "string literal concatenation" too. This only works for string literals, not other objects.
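The literal-only rule can be checked without running any assignment. The following is a minimal sketch, assuming CPython 3.8 or later (where string literals parse to `ast.Constant`); the file name `<example>` and the variable names are only illustrative.

```python
import ast

# Adjacent literals, even with different quote styles, parse as ONE constant.
tree = ast.parse("s = 'Hello' \"World\"")
merged = tree.body[0].value
print(repr(merged.value))

# Adjacent names are not literals, so the same layout is rejected at compile time.
source = "a = 'Hello'\nb = 'World'\nc = a b\n"
try:
    compile(source, "<example>", "exec")
    error_name = None
except SyntaxError as exc:
    error_name = type(exc).__name__
print(error_name)
```

The error is raised by `compile`, before any line runs, which is why statement 3 fails as a syntax error rather than at run time.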


There's probably no difference. If there is, it's probably extremely tiny and nothing that anyone should worry about.


Also, understand that there can be dangers to this:

>>> def foo(bar, baz=None):
...     return bar
... 
>>> foo("bob"
... "bill")
'bobbill'

This is a perfect example of where Errors should never pass silently. What if I wanted "bill" to be the argument baz? I have forgotten a comma, but no error is raised. Instead, concatenation has taken place.
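The same slip is easy to make in a multi-line list of strings. Here is a small sketch with made-up column names (they are not from the question) showing how one missing comma silently changes the data:

```python
# Four names were intended, but the comma after "age" is missing.
columns = [
    "name",
    "age"
    "city",
    "country",
]

print(len(columns))
print(columns)

# A cheap guard: compare against the count you expected to write.
expected_count = 4
if len(columns) != expected_count:
    print("column count mismatch: check for a missing comma")
```

The list ends up with three items, with "age" and "city" merged into "agecity", and no exception is raised.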

TerryA
  • 8
    So "Errors should never pass silently" in this case means that the grammar should be defined such that if a comma is removed from any valid program, the resulting program is either invalid or else has the same meaning as the original program? I mean syntactically significant comma, ofc, the rule shouldn't apply to removing a comma from inside a string literal ;-) – Steve Jessop Sep 17 '13 at 08:39
  • 1
    @SteveJessop Hmm, I guess it doesn't suit this example, does it? Because technically there is no error, right? – TerryA Sep 17 '13 at 08:41
  • 1
    Well, as a language designer if you think "leaving out a comma" is a common typo, then it's reasonable to decide that you will not create situations where leaving out a comma changes a valid program to another valid program with different meaning. Your example code is an error if the programmer intended a comma and not if the programmer didn't, the issue is that the language is defined such that the interpreter can't tell which. There are other cases in Python where that happens, for example `(0) != (0,)`, and each one has its reasons that GvR considered more important. – Steve Jessop Sep 17 '13 at 08:43
  • @SteveJessop Ah, I get you. Should I change my answer to something else? Perhaps just remove the sentence? – TerryA Sep 17 '13 at 08:47
  • 3
    I think your answer is interesting with the sentence. I suppose that someone who takes it as a criticism of the design of Python might come back harshly on it, and/or consider it subjective. But it's true that this can be a gotcha, so Python users need to be aware of it and recognise the symptoms when it happens. The same applies in C and C++. – Steve Jessop Sep 17 '13 at 08:49
  • 4
    Another example applying to both Python and C (even though the comma doesn't mean the same thing): `1 - 1` and `1 , - 1` are different. Basically, context-sensitive operators lead to such examples, and juxtaposition in effect turns token boundaries into context-sensitive operators. String concatenation is a particularly easy mistake to make since it's pretty common to introduce linebreaks into comma-separated lists of strings. If you wrote `foo(1, -1)` then you'd have in effect the same lack of ability of the language to tell when you leave off the trailing comma. – Steve Jessop Sep 17 '13 at 08:54
  • 3
    @SteveJessop Wow, that's amazing. I've attempted to learn C++ before, but I never got into it because I love python (especially its syntax) so much, but you have definitely taught me a lot here. Thank you! – TerryA Sep 17 '13 at 09:00
  • 3
    This is a terrible "feature" of python. My program just silently failed because I missed a comma when typing out the long list of names for a pandas header. – sheridp Nov 08 '16 at 23:13
  • Interestingly in 2.7 and 3.6 it appears that "delimited by whitespace" is not needed as both `"a""b"` and `"a"'b'` work fine to give `"ab"`. It's interesting GvR chose to copy this behaviour from C etc. as it's only a single `+` character to make it explicit. – AJP May 06 '17 at 09:13
  • It's a single character, but it would require the parser to make semantic decisions (e.g., `"a" + "b" == "ab"`, but `"a" - "b"` is a runtime error). Juxtaposition can be handled easily by the parser itself, by merging two tokens lexically. – chepner Feb 05 '20 at 17:38
6

This is implicit string literal concatenation. It only happens with string literals, not variables or other expressions that evaluate to strings. There used to be a (tiny) performance difference, but these days, the peephole optimizer should render the forms essentially equivalent.

user2357112
4

To answer your second question: There is no difference at all (at least with the implementation I use). Disassembling both statements, they are rendered as LOAD_CONST STORE_FAST. They are equivalent.
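One way to check this yourself is sketched below, assuming CPython; other implementations may not fold constants the same way, and exact opcode names vary between versions. The file name `<example>` is arbitrary.

```python
import dis

# Compile each form as a tiny module.
adjacent = compile("s = 'Hello' 'World'", "<example>", "exec")
plus = compile("s = 'Hello' + 'World'", "<example>", "exec")

# On CPython both code objects already carry the joined constant.
print("HelloWorld" in adjacent.co_consts)
print("HelloWorld" in plus.co_consts)

# Print the bytecode for a side-by-side look.
dis.dis(adjacent)
dis.dis(plus)
```

At module level the store shows up as STORE_NAME rather than STORE_FAST; inside a function it would be STORE_FAST, as described above.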

Hyperboreus
4

You can also use %s formatting instead of the + sign.

>>> string2 = "%s %s" %('Hello', 'World')
>>> string2
'Hello World'

Another method is .format:

>>> string2 = "{0} {1}".format("Hello", "World")
>>> string2
'Hello World'
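Unlike adjacent literals, these formatting approaches also work when the strings are held in variables. Here is a short sketch using the variable names from the question; the f-string and `str.join` lines are additions not mentioned in this answer (f-strings need Python 3.6+).

```python
string3 = 'Hello'
string4 = 'World'

percent_style = "%s %s" % (string3, string4)       # printf-style formatting
format_style = "{0} {1}".format(string3, string4)  # str.format
fstring_style = f"{string3} {string4}"             # f-string (Python 3.6+)
join_style = " ".join([string3, string4])          # str.join

print(percent_style, format_style, fstring_style, join_style, sep="\n")
```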
SuperNova
1

Statement 3 doesn't work because concatenating two string expressions to create a new string requires the '+' operator.

Statement 1, by contrast, uses adjacent string literals separated by whitespace (possibly with different quoting conventions), which are allowed and mean the same as their concatenation. Statements 2 and 4 use the '+' operator explicitly.

Also, there won't be any significant or noticeable time difference between those two operations.

%%timeit -n 1
s1='ab'
s2='ba'
print(s1+s2)

Output: The slowest run took 17.08 times longer than the fastest. This could mean that an intermediate result is being cached. 57.8 µs ± 92.5 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit -n 1
s3='ab' 'ba'
print(s3)

Output: The slowest run took 4.86 times longer than the fastest. This could mean that an intermediate result is being cached. 25.7 µs ± 21 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
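The measurements above run each cell once and include the cost of `print`, so the spread is larger than the values being compared. A less noisy comparison might look like the sketch below, which uses the standard `timeit` module outside of IPython; the loop count is an arbitrary choice and results will differ between machines and Python versions.

```python
import timeit

runs = 100_000  # arbitrary; raise it for steadier numbers

# Run-time concatenation of two variables with '+'.
t_plus = timeit.timeit("s1 + s2", setup="s1 = 'ab'; s2 = 'ba'", number=runs)

# Adjacent literals, which are merged into one constant before the code runs.
t_adjacent = timeit.timeit("'ab' 'ba'", number=runs)

print(f"s1 + s2   : {t_plus:.6f} s for {runs} runs")
print(f"'ab' 'ba' : {t_adjacent:.6f} s for {runs} runs")
```

Either way, the per-operation cost is far too small to matter in ordinary code.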

shray
1

Why does statement 3 not work while statement 1 does?

Because in the first statement we are assigning adjacent string literals to a variable. Python merges adjacent string literals into a single constant before the assignment happens, so the assignment goes through. "Hello" and "World" are both string literals, so the statement works.

If we do the following, we will get SyntaxError

string1 = "Hello" 1

The reason is that the two adjacent constants are of different kinds (a string and an integer), so they cannot be merged, and Python rejects the line as a syntax error.

Statement 3 tries to assign a variable from two variables placed side by side. This produces a SyntaxError because Python does not know how to combine two adjacent names without an operator.

Is there any technical difference such as calculation speed etc. between statement 1 and 2?

Yes, but the difference is one of readability rather than anything technical. Readability matters a great deal in Python. To an untrained eye, "hello" "world" might look as if the compiler would insert a space between the strings, which is not the case.

However,

"hello" + "world"

is explicit and conventional. Explicit is nearly always better than implicit.
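The no-space behaviour is easy to confirm with a brief sketch. The multi-line `message` below is an illustrative addition showing a case where adjacent literals are commonly used on purpose, with the space written explicitly inside the first literal.

```python
implicit = "hello" "world"
explicit = "hello" + "world"

# No space is inserted in either form.
print(implicit == explicit)
print(implicit)

# Splitting a long literal across lines: the space must be written by hand.
message = ("long text "
           "continues here")
print(message)
```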

thiruvenkadam