Compute len(str(x)) and len(repr(x)) efficiently

Question

Is there a way to compute the result of len(str(x)) and/or len(repr(x)) without first computing str(x) or repr(x)? How?

I may wish to conditionally print an object based on its printed length, so I don't need str(x) or repr(x) if I'm not going to print it out.

# How might we redefine lenstr so that str(x) need not be computed?
def lenstr(x):
    return len(str(x))

def maybe_str(x, maxlen=80, subst='*'):
    return subst * maxlen if lenstr(x) > maxlen else str(x)

maybe_str(tuple(range(9)))
# '(0, 1, 2, 3, 4, 5, 6, 7, 8)'
maybe_str(tuple(range(99)))
# '********************************************************************************'

maybe_str(tuple(map(lambda k: tuple(range(k)), range(4))))
# '((), (0,), (0, 1), (0, 1, 2))'
maybe_str(tuple(map(lambda k: tuple(range(k)), range(42))))
# '********************************************************************************'

If you've defined an object/class that has some "length", whatever that length may be, then instead of computing the length each time you want to know it, store a variable within the object/class that is updated whenever the length changes over the object's lifetime. You can then reference this variable in your conditional statements. — Bdyce, Aug 26 '20 at 19:49
@Bdyce That solution would require every object to have such a method, and would not work for the built-in objects, like `tuple`. — Ana Nimbus, Aug 26 '20 at 23:21
There's no good way to do this in general, and indeed, even with built-in objects that have a well-defined length, the length of the objects they refer to would have to be recursively interrogated, so can you narrow this down somewhat? — juanpa.arrivillaga, Aug 27 '20 at 00:03

score 1 · Answer 1 · answered Aug 26 '20 at 23:45

In general? No. For specific known implementations of __str__ and __repr__, you could keep some information about what the length would be without actually computing the strings. The trivial case would be if you know __str__ = lambda self: "", then you also know that len(str(x)) is 0 without "computing" it.
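As a minimal sketch of the "known implementation" case, assume a hypothetical class `Repeated` (not part of the question) whose printed length follows directly from its fields. The helper name `str_len` is also an assumption made for illustration.

```python
class Repeated:
    """Hypothetical example: renders as `char` repeated `count` times."""

    def __init__(self, char, count):
        self.char = char
        self.count = count

    def __str__(self):
        # Builds the full string; the cost grows with count.
        return self.char * self.count

    def str_len(self):
        # Same value as len(str(self)), derived from the fields alone,
        # so no string is ever built.
        return len(self.char) * self.count


r = Repeated("ab", 1000)
print(r.str_len() == len(str(r)))
```

This only works because the author of the class knows exactly what `__str__` produces; nothing comparable exists for arbitrary objects.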

If you are attempting to avoid an expensive computation and __str__ and __repr__ are getting called repeatedly, you could set up a memoized function, such that the actual implementations are replaced by a function that calls the implementation once, stores the value, then returns that stored value for all subsequent calls.
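One way the memoization idea might look, assuming the object is effectively immutable so that a cached string never goes stale. The class name `CachedStr`, the `_build_str` helper, and the `calls` counter are hypothetical and exist only to make the caching visible.

```python
class CachedStr:
    """Hypothetical wrapper whose str() is computed once, then reused."""

    def __init__(self, items):
        self._items = tuple(items)  # assumed never to change after creation
        self._str_cache = None
        self.calls = 0              # counts how often the real work runs

    def _build_str(self):
        # Stand-in for an expensive formatting routine.
        self.calls += 1
        return ", ".join(map(str, self._items))

    def __str__(self):
        # Compute on first use, return the stored value afterwards.
        if self._str_cache is None:
            self._str_cache = self._build_str()
        return self._str_cache


c = CachedStr(range(5))
str(c)
str(c)
print(c.calls)
```

The first call still pays the full cost; the saving comes only from repeated calls on the same object.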

You could also make a function that implements the logic of __str__ and __repr__ with an argument like maxlen that aborts the computation if the buffer for the eventual output exceeds that length. Then define the real magic functions on your objects as calling your custom aborting function with a value for "don't abort". In the places where you only want output under a particular length, use the custom function directly to exit early.
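A rough sketch of such an aborting function, limited to plain tuples and lists and falling back to `repr()` for everything else. The names `chunks` and `bounded_str` are assumptions, and the sketch does not guard against self-referential containers, which the built-in `repr` does handle.

```python
def chunks(x):
    """Lazily yield the pieces that make up str(x) for plain tuples/lists."""
    if type(x) is tuple:
        yield "("
        for i, item in enumerate(x):
            if i:
                yield ", "
            yield from chunks(item)
        if len(x) == 1:
            yield ","  # one-element tuples print as (0,)
        yield ")"
    elif type(x) is list:
        yield "["
        for i, item in enumerate(x):
            if i:
                yield ", "
            yield from chunks(item)
        yield "]"
    else:
        # Leaves are rendered in full; containers show elements via repr().
        yield repr(x)


def bounded_str(x, maxlen=None):
    """Return the rendered text, or None once it would exceed maxlen.

    maxlen=None means "don't abort".
    """
    parts = []
    total = 0
    for piece in chunks(x):
        total += len(piece)
        if maxlen is not None and total > maxlen:
            return None  # stop early; the remaining elements are never visited
        parts.append(piece)
    return "".join(parts)


print(bounded_str(tuple(range(9)), maxlen=80))
print(bounded_str(tuple(range(99)), maxlen=80))
```

With this approach, `maybe_str` from the question could call `bounded_str(x, maxlen)` and substitute the asterisks whenever it gets `None` back.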

Doing this for all objects is possible, though inadvisable. Just like you can replace builtins, so can other code. Violating the contract of builtins is an excellent way to produce subtle and hard-to-track-down bugs. Save yourself the trouble, and don't do it.

RE "In general? No.": Python must (I assume) do partial computation of the eventual `str` under the hood, though. In order to create the `str` for a `tuple`, it must traverse the tuple's elements, computing the element's `str`'s, right? You are saying that the intermediate magic is not available in the Python programmer's interface? — Ana Nimbus, Aug 27 '20 at 00:13
RE "memoized": In my immediate application, I have a stack of objects that may get printed out many times, so a stack that memoizes the corresponding `str`s is not a bad idea. — Ana Nimbus, Aug 27 '20 at 00:15
The Python implementation of `str()` calls the "magic method" `__str__()` on any given object. The actual implementation of the magic method will vary depending on the author of the code. Some will do things like allocate a buffer and fill it up, while many implementations do some combination of string format and concatenation operations directly. For `tuple` specifically, the default implementation is in C: https://github.com/python/cpython/blob/master/Objects/tupleobject.c#L304 — wmorrell, Aug 27 '20 at 06:16

score 1 · Answer 2 · answered Aug 26 '20 at 23:56

You are talking about altering the functionality of native Python objects. There are ways to do this, but it is strongly discouraged. If your idea is simply to save a small amount of computation when getting the "length" property of an object, I would look into wrapping these objects in your own custom classes and using those classes instead of the native Python objects. These classes are where you would implement the logic to keep track of the length during use, so as not to have to compute it each time.

This is how Python, like most other object-oriented programming languages, is meant to be used. Get creative: define your own data structures, using inheritance where it helps, to gain control over the computations you are describing.
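To illustrate the wrapper idea, here is a small sketch of an append-only container that keeps a running count of its printed length. The class `TrackedList` and its `str_len` method are hypothetical, and the sketch assumes the stored items are immutable, so that their `repr()` never changes after insertion.

```python
class TrackedList:
    """Hypothetical append-only wrapper that tracks len(str(...)) as it grows."""

    def __init__(self):
        self._items = []
        self._body_len = 0  # characters between the brackets

    def append(self, item):
        if self._items:
            self._body_len += 2  # the ", " separator
        # Lists display their elements with repr(), so that is what we measure.
        self._body_len += len(repr(item))
        self._items.append(item)

    def str_len(self):
        # Add 2 for the surrounding "[" and "]".
        return self._body_len + 2

    def __str__(self):
        return str(self._items)


t = TrackedList()
for value in (1, "ab", 3.5):
    t.append(value)
print(t.str_len() == len(str(t)))
```

Note that each `append` still computes `repr(item)` once, so the work is moved to insertion time rather than avoided.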

Bdyce


RE: "You are talking about altering the functionality of native python objects": not me. not intentionally, anyway. For my work, I much prefer a functional style, with the occasional class for very special cases. I would much rather define one `lenstr` function to handle any object (or at least all the built-ins, _if that were possible_ ) than to define `lenstr` methods for every object. – Ana Nimbus Aug 27 '20 at 00:23
RE: "Get creative": I will take that to mean something along the lines of @wmorrell's memoization suggestion, which will work for the special case I have in mind. – Ana Nimbus Aug 27 '20 at 00:25
