Why is string's startswith slower than in?

Question

Surprisingly, I find startswith is slower than in:

In [10]: s="ABCD"*10

In [11]: %timeit s.startswith("XYZ")
1000000 loops, best of 3: 307 ns per loop

In [12]: %timeit "XYZ" in s
10000000 loops, best of 3: 81.7 ns per loop

As we all know, the in operation needs to search the whole string and startswith just needs to check the first few characters, so startswith should be more efficient.

When s is big enough, startswith is faster:

In [13]: s="ABCD"*200

In [14]: %timeit s.startswith("XYZ")
1000000 loops, best of 3: 306 ns per loop

In [15]: %timeit "XYZ" in s
1000000 loops, best of 3: 666 ns per loop

So it seems that calling startswith has some overhead which makes it slower when the string is small.

And than I tried to figure out what's the overhead of the startswith call.

First, I used an f variable to reduce the cost of the dot operation - as mentioned in this answer - here we can see startswith is still slower:

In [16]: f=s.startswith

In [17]: %timeit f("XYZ")
1000000 loops, best of 3: 270 ns per loop

Further, I tested the cost of an empty function call:

In [18]: def func(a): pass

In [19]: %timeit func("XYZ")
10000000 loops, best of 3: 106 ns per loop

Regardless of the cost of the dot operation and function call, the time of startswith is about (270-106)=164ns, but the in operation takes only 81.7ns. It seems there are still some overheads for startswith, what's that?

Add the test result between startswith and __contains__ as suggested by poke and lvc:

In [28]: %timeit s.startswith("XYZ")
1000000 loops, best of 3: 314 ns per loop

In [29]: %timeit s.__contains__("XYZ")
1000000 loops, best of 3: 192 ns per loop

To get a more similar result, you can do `s.__contains__("XYZ")` as this will take the same route as `s.startswith("XYZ")` then (using the `in` operator will short-cut the member access). However, `startswith` is still slower for me then. — poke, Aug 10 '15 at 10:40
I think the remainder of the performance difference is due to `__contains__` being fully typed *in C*, while `startswith` does actual argument parsing and stuff (you can also pass a tuple). — poke, Aug 10 '15 at 10:43
What version of Python are you using? On 3.4.3, I get `s.startswith("XYZ")` report 153ns, and `s.__contains__("XYZ")` reports 169ns. As @poke says, using `in` will use completely different lookup rules than the method call - it can be looked up directly from a function pointer at the C level, while the method lookup does two dictionary searches and *then* has to do a Python-level function call. Timing those things separately can give you *some* idea of the difference, but isn't necessarily exact. On your numbers, subtracting both those overheads makes the time for `startswith` *negative*! — lvc, Aug 10 '15 at 10:48
I went even further and checked `%timeit "XYZ" == s[0:3]` which gives me `10000000 loops, best of 3: 94 ns per loop` while `%timeit "XYZ" in s` `10000000 loops, best of 3: 59.2 ns per loop`. Tested with python 3.4.3. (it seems that in my case, slicing gives "some" overhead, as `%timeit "XYZ" in s[0:3]` results in `10000000 loops, best of 3: 101 ns per loop`) — Marandil, Aug 10 '15 at 10:52
@lvc I'm using CPython 2.7.5. I did a test between `startswith` and `__contains__` as poke and you suggested, `startswith` is still slower, the same the result as poke said. — WKPlus, Aug 10 '15 at 11:47
Actually `in` doesn't need to search the whole string if it finds the argument early on. I'd expect these two calls to perform basically the same here. — Lightness Races in Orbit, Aug 10 '15 at 18:09
@LightnessRacesinOrbit While that’s true, `s` is a repetition of the string `"ABCD"`, so the whole string has to be searched in order to come to the conclusion that `"XYZ"` is not contained within. — poke, Aug 10 '15 at 21:23

poke · Accepted Answer · 2015-08-10T11:01:38.330

As already mentioned in the comments, if you use s.__contains__("XYZ") you get a result that is more similar to s.startswith("XYZ") because it needs to take the same route: Member lookup on the string object, followed by a function call. This is usually somewhat expensive (not enough that you should worry about of course). On the other hand, when you do "XYZ" in s, the parser interprets the operator and can short-cut the member access to the __contains__ (or rather the implementation behind it, because __contains__ itself is just one way to access the implementation).

You can get an idea about this by looking at the bytecode:

>>> dis.dis('"XYZ" in s')
  1           0 LOAD_CONST               0 ('XYZ')
              3 LOAD_NAME                0 (s)
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE
>>> dis.dis('s.__contains__("XYZ")')
  1           0 LOAD_NAME                0 (s)
              3 LOAD_ATTR                1 (__contains__)
              6 LOAD_CONST               0 ('XYZ')
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 RETURN_VALUE

So comparing s.__contains__("XYZ") with s.startswith("XYZ") will produce a more similar result, however for your example string s, the startswith will still be slower.

To get to that, you could check the implementation of both. Interesting to see for the contains implementation is that it is statically typed, and just assumes that the argument is a unicode object itself. So this is quite efficient.

The startswith implementation however is a “dynamic” Python method which requires the implementation to actually parse the arguments. startswith also supports a tuple as an argument, which makes the whole start-up of the method a bit slower: (shortened by me, with my comments):

static PyObject * unicode_startswith(PyObject *self, PyObject *args)
{
    // argument parsing
    PyObject *subobj;
    PyObject *substring;
    Py_ssize_t start = 0;
    Py_ssize_t end = PY_SSIZE_T_MAX;
    int result;
    if (!stringlib_parse_args_finds("startswith", args, &subobj, &start, &end))
        return NULL;

    // tuple handling
    if (PyTuple_Check(subobj)) {}

    // unicode conversion
    substring = PyUnicode_FromObject(subobj);
    if (substring == NULL) {}

    // actual implementation
    result = tailmatch(self, substring, start, end, -1);
    Py_DECREF(substring);
    if (result == -1)
        return NULL;
    return PyBool_FromLong(result);
}

This is likely a big reason why startswith is slower for strings for which a contains is fast because of its simplicity.

@mounaim Apparently, Python’s Mercurial server just went down for a few seconds. It’s working again now. — poke, Aug 10 '15 at 10:59

score 1 · Answer 2 · edited Aug 10 '15 at 18:10

1

This is likely because str.startswith() does more than str.__contains__(), and also because I believe str.__contains__ operates fully in C, whereas str.startswith() has to interact with Python types. Its signature is str.startswith(prefix[, start[, end]]), where prefix can be a tuple of strings to try.

edited Aug 10 '15 at 18:10

Lightness Races in Orbit

378,754
76
643
1,055

answered Aug 10 '15 at 10:49

Cyphase

11,502
2
31
32

1

"Does more" is vague and probably not true. Finding a substring efficiently is not a trivial problem. [List of algorithms.](https://en.wikipedia.org/wiki/String_searching_algorithm#Single_pattern_algorithms). In contrast, finding a prefix is quite easy and in general faster. – Paul Draper Aug 10 '15 at 18:36
You're right @PaulDraper; I was up late when I wrote that :P. And thanks for the edit @LightnessRacesinOrbit :). – Cyphase Aug 10 '15 at 20:17

Why is string's startswith slower than in?

2 Answers2

Linked

Related