3

Since Python does slice-by-copy, slicing strings can be very costly.

I have a recursive algorithm that operates on strings. Specifically, if a function is passed a string `a`, it calls itself on `a[1:]`. The hangup is that the strings are so long that the slice-by-copy mechanism becomes a very costly way to remove the first character.
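For illustration, a toy function with the same shape (not my actual algorithm); every call pays an O(n) copy for `a[1:]`, so the whole thing ends up quadratic:

```python
def count_vowels(a):
    # Same recursion pattern: peel off the first character, recurse on the rest.
    if not a:
        return 0
    return (a[0] in "aeiou") + count_vowels(a[1:])  # a[1:] copies the entire tail

print(count_vowels("recursion on slices"))  # 7
```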

Is there a way to get around this, or do I need to rewrite the algorithm entirely?

stoksc
  • 324
  • 3
  • 12
  • You're totally right, I just misunderstood how memoryview worked and thought it was the same. Closing the question. – stoksc Mar 06 '18 at 02:22
  • Also, `a[:1]` is slicing out the first character (if any), which is incredibly cheap. Did you mean `a[1:]` (which would slice all *but* the first character)? – ShadowRanger Mar 06 '18 at 02:25
  • Yeah, I did. Thanks for the catch. – stoksc Mar 06 '18 at 02:27
  • @excaza: That assumes the OP is using Python 2 (or using Python 3 and the algorithm operates on `bytes` rather than `str`). `memoryview`s don't work on Py2 `unicode`/Py3 `str`. – ShadowRanger Mar 06 '18 at 02:27
  • If you showed us your algorithm, maybe we could show a better way that doesn't need such slicing... – Stefan Pochmann Mar 06 '18 at 02:37
  • Why not try a list of chars? List slices don't create copies. My concern would be overhead at that point. – sudo Mar 06 '18 at 02:43
  • 1
    @sudo: List slices create new `list`s; sure, the len 1 `str` objects wouldn't be copied, but the pointers to them would be, and the pointers are 4-8 bytes a piece, vs. 1-4 bytes a piece for each character in a string. The big-O cost of a `list` slice is identical to that of a `str` slice. For Python built-in types, about the only types with `O(1)` slicing are `memoryview` and (on Py3) `range`. `numpy` adds a whole slew of view-like sequences, but it's not a built-in package. – ShadowRanger Mar 06 '18 at 02:50
  • Hmm yeah, not sure where I read that list slices don't create copies, but I seem to be wrong. Sorry. – sudo Mar 06 '18 at 03:04
  • @sudo What do you mean with list of "chars"? – Stefan Pochmann Mar 06 '18 at 03:04
  • @StefanPochmann I meant a list of characters. In Python, that would be a list of `str`. But that seems to create a copy, which makes sense. I don't understand why `str` slices create copies despite `str` being immutable, though, but there's probably an answer for that somewhere else. – sudo Mar 06 '18 at 03:06
  • @sudo I'd say that answer is in the accepted answer under the question the OP linked to. Btw I already assumed you meant list of (single-character) *strings* and did some tests... slicing a long string with [1:] over and over until the end was almost exactly 100 times faster than with your list idea. – Stefan Pochmann Mar 06 '18 at 03:08
  • @StefanPochmann I'm sure you could but it's not really about the algorithm, it's about the slicing. I can easily rewrite the algorithm without it; however, it will be at the cost of clarity. – stoksc Mar 06 '18 at 03:13
  • @lieblos Meh, I'm not convinced it'll be at the cost of clarity. Actually I doubt it. – Stefan Pochmann Mar 06 '18 at 03:18
  • @StefanPochmann Why's that? – stoksc Mar 06 '18 at 03:21
  • @lieblos Long strings and repeatedly removing the first character rather sounds like iteration, which could quite possibly be done with an iterator instead. Also, lots of experience. I even happen to have done that to solve some problem *today*. – Stefan Pochmann Mar 06 '18 at 03:24
  • @StefanPochmann It's just a variation on a longest common subsequence algorithm. In my opinion, it just writes better with recursion; that said, I'm partial to Scheme/Clojure so it may just be my opinion. I'll keep you in mind if I have a question in the future. I've finished this for now, though. – stoksc Mar 06 '18 at 03:34
  • @lieblos Huh? "better with recursion"? Did I say anything about not using recursion? – Stefan Pochmann Mar 06 '18 at 03:37
  • @lieblos We're even talking about the exact same problem, [aren't we?](https://leetcode.com/lieblos/) – Stefan Pochmann Mar 06 '18 at 03:45
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/166286/discussion-between-lieblos-and-stefan-pochmann). – stoksc Mar 06 '18 at 03:46
  • 1
    @sudo: `str` slices copy instead of creating views partially to keep the implementation simpler (the raw data can be allocated with the object header in a single block, with no need to have the data allocated separately with separate reference counts, and avoiding the need to store an offset into the data), and partially to avoid keepalive effects. If someone does something like `smallstr = mystr[1:11]` where `mystr` is 1 GB long, it would be ridiculous to keep `mystr` alive forever just because `smallstr` was looking at 10 characters of it. – ShadowRanger Mar 06 '18 at 04:55
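To make the keep-alive trade-off described in the last comment concrete, a quick sketch (sizes are indicative of CPython on a 64-bit build; `view.obj` is the buffer a `memoryview` keeps referenced):

```python
import sys

data = b"x" * (10 ** 6)            # ~1 MB bytes object

copy = data[1:11]                  # bytes/str slicing copies: only ~10 bytes are kept
view = memoryview(data)[1:11]      # a view keeps the whole 1 MB buffer alive

print(sys.getsizeof(copy))         # small (a few dozen bytes)
print(view.obj is data)            # True: the tiny slice still pins the full buffer
```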

1 Answer

7

The only way to get around this in general is to make your algorithm use bytes-like types, either Py2 `str` or Py3 `bytes`; views of Py2 `unicode`/Py3 `str` are not supported. I provided details on how to do this in my answer to a related question, but the short version is: if you can assume bytes-like arguments (or convert to them), wrapping the argument in a `memoryview` and slicing that is a reasonable solution. Once converted to a `memoryview`, slicing produces new `memoryview`s with O(1) cost (in both time and memory), rather than the O(n) time/memory cost of text slicing.
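For example, a minimal sketch with a toy recursive function standing in for yours (assumes the data is ASCII, or can otherwise be encoded to bytes up front):

```python
def suffix_count(s):
    # Toy stand-in for a recursive algorithm that recurses on s[1:].
    if len(s) == 0:
        return 0
    return 1 + suffix_count(s[1:])

data = ("ab" * 400).encode("ascii")    # memoryview needs bytes-like data, not str
print(suffix_count(memoryview(data)))  # 800; each s[1:] is an O(1) view, no copy
print(suffix_count(data))              # 800 too, but every slice copies the tail
```

If you need actual text back at the end, call `bytes(view).decode(...)` (or `view.tobytes()`) once, rather than per slice.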

ShadowRanger
  • 143,180
  • 12
  • 188
  • 271
  • n.b. `bytes` won't work unless the string is in a fixed-length encoding like ASCII. For example, utf8 won't work, but you could easily convert to utf32 to get around this. – sudo Mar 06 '18 at 02:40
  • 1
    @sudo: Yeah, my other answer covers that issue. It's not quite as easy as just "convert to `utf-32`", though; UTF-32 is fixed length, but you either have to give up `bytes`-like behavior (e.g. on Py3, casting the `memoryview` to a four-byte format) or keep it `bytes`-like but manually adjust for the larger character width each time (slicing off the first four bytes rather than the first character each time). It can be a pain. – ShadowRanger Mar 06 '18 at 02:48
  • Thanks, solid answer. I feel like it's really easy to get caught up with how easy operations are in Python and forget about their efficiency entirely. Curious, how did you learn all these big-O guarantees for Python operations? – stoksc Mar 06 '18 at 03:19
  • 1
    @lieblos: The language spec doesn't provide guarantees in most cases (the Python tutorial mentions that slicing a `list` shallow copies, but otherwise doesn't get into it). Stuff like `str` slices being copies vs. views isn't a language guarantee; at one point they considered making `str` slices views, but decided against it largely because of the implicit, non-intuitive keep-alive effect (slicing two characters out of a 1 GB `str` causing the 1 GB `str` to live until the slice is cleaned up). Mostly I just read the Python bug tracker, the PEPs, and the What's New pages for each release. – ShadowRanger Mar 06 '18 at 05:04
  • Mostly, they've just adopted a general rule that for Python built-ins, slices of sequences copy unless there is an explicit opt-in to view semantics (with the associated keep-alive effect) as in the case of `memoryview`. Only third party packages violate this rule (e.g. `numpy`), and they're doing so for high-performance computing purposes, not a design goal for the base language. – ShadowRanger Mar 06 '18 at 05:06
  • @ShadowRanger I'd never thought about the keep-alive effect - the behavior makes a lot of sense with that in mind. I never look at the PEPs. Maybe I'll start to get in the habit. Thanks for the help. – stoksc Mar 06 '18 at 05:11
  • I think the big-O for things like this depends on your Python interpreter, hence why the language spec won't mention it. Maybe look for CPython specifically if that's what you're using, like here: https://wiki.python.org/moin/TimeComplexity – sudo Mar 06 '18 at 18:33
  • 1
    @sudo: To an extent, yes. The requirements imposed on the various built-in types mean that you only have a few practical options, all of which share the same big-O performance. Some of the big-Os are documented sort of by accident (e.g. `list` having `O(n)` memory movement costs for `pop`/`insert` on the left is documented as a detail of `collections.deque`, which exists to alleviate that problem, as is `list`'s `O(1)` random access, which `collections.deque` lacks). In practice, `list` is always like C++ `vector`, `set`/`dict` are hash based, etc. New solutions aren't invented often. – ShadowRanger Mar 06 '18 at 21:30
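As a rough sketch of the UTF-32 workaround mentioned in the comments above (an illustration, not the linked answer's exact code): casting the `memoryview` to a four-byte item format lets you slice by character again.

```python
text = "héllo wörld"                    # non-ASCII, so UTF-8 would use varying byte widths
data = text.encode("utf-32-le")         # fixed four bytes per code point, no BOM

view = memoryview(data).cast("I")       # "I" is a 4-byte unsigned int on common platforms
rest = view[1:]                         # drops one *character*: O(1), no copy

print(bytes(rest).decode("utf-32-le"))  # éllo wörld
```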