
Python 3.8 introduced self-documenting expressions in f-strings. Where one would previously write this:

>>> x = 10.583005244
>>> print(f"x={x}")
x=10.583005244

One can now do this, with less repetition:

>>> x = 10.583005244
>>> print(f"{x=}")
x=10.583005244
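For reference, here is a minimal sketch of how the `=` specifier combines with whitespace, conversions, and format specs, as I understand the Python 3.8 behaviour. The variable names other than `x` are only for illustration.

```python
x = 10.583005244

plain = f"{x=}"          # name, "=", then repr(x)
spaced = f"{x = }"       # whitespace around "=" is preserved
as_str = f"{x=!s}"       # use str() instead of the default repr()
rounded = f"{x=:.2f}"    # a format spec switches to format(x, ".2f")

for line in (plain, spaced, as_str, rounded):
    print(line)
```

The fact that whitespace next to the `=` is copied into the output verbatim is what the question below hinges on.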

This works very well for objects whose string representation fits on one line. But consider the following scenario:

>>> import numpy as np
>>> some_fairly_long_named_arr = np.random.rand(4,2)
>>> print(f"{some_fairly_long_named_arr=}")
some_fairly_long_named_arr=array([[0.05281443, 0.06559171],
       [0.13017109, 0.69505908],
       [0.60807431, 0.58159127],
       [0.92113252, 0.4950851 ]])

Here, the first row of the array is not aligned with the rows below it, which is (arguably) undesirable. I would prefer the output of the following:

>>> print(f"some_fairly_long_named_arr=\n{some_fairly_long_named_arr!r}")
some_fairly_long_named_arr=
array([[0.05281443, 0.06559171],
       [0.13017109, 0.69505908],
       [0.60807431, 0.58159127],
       [0.92113252, 0.4950851 ]])

Here, the first row of the output is aligned as well, but the variable name has to be written twice in the print call, which defeats the purpose of the self-documenting syntax.

The example uses a NumPy array, but it could just as well have been a pandas DataFrame or any other object with a multi-line representation.

Hence, my question is: can a newline character be inserted after the = sign in a self-documenting f-string?

I tried to add it like this, but it does not work:

>>> print(f"{some_fairly_long_named_arr=\n}")
SyntaxError: f-string expression part cannot include a backslash

I read the docs on the format specification mini-language, but most of the options there only apply to simple data types such as integers, and I could not achieve what I wanted with the ones that do apply.

  • Try checking the relevant PEP or issue. If it's not there, the answer is probably not. Then check the CPython source. If it's not there, the answer is definitely not. – Mateen Ulhaq May 09 '20 at 10:19
  • The [relevant issue](https://bugs.python.org/issue36817) does not seem to contain my use case. I will check the [CPython source](https://github.com/python/cpython/commit/9a4135e939bc223f592045a38e0f927ba170da32) at a more convenient time. – Enigma Machine May 10 '20 at 07:54

2 Answers


I wouldn't recommend this at all, but to show that it is possible:

import numpy as np

_old_array2string = np.core.arrayprint._array2string

def _array2_nice_string(*args, **kwargs):
    non_nice_string = _old_array2string(*args, **kwargs)
    dimension_strings = non_nice_string.split("\n")
    if len(dimension_strings) > 1:
        dimension_string = dimension_strings[1]
        dimension_indent = len(dimension_string) - len(dimension_string.lstrip())
        return "\n" + " " * dimension_indent + non_nice_string
    return non_nice_string

np.core.arrayprint._array2string = _array2_nice_string

Outputs for:

some_fairly_long_named_arr = np.random.rand(2, 2)
print(f"{some_fairly_long_named_arr=}")
some_fairly_long_named_arr=array(
       [[0.95900608, 0.79367873],
       [0.58616975, 0.17757661]])

and

some_fairly_long_named_arr = np.random.rand(1, 2)
print(f"{some_fairly_long_named_arr=}")

some_fairly_long_named_arr=array([[0.62492772, 0.80453153]])

I made it so that if the array's representation fits on a single line (for example, when the first dimension is 1), it is kept on the same line as the name.

There is also a public function, np.array2string, that I tried to reassign, but I never got that working. If someone finds a way to patch that public function instead of this internal one, I imagine it would make this solution a lot cleaner.

  • My god, this is far more complex than I thought it would be :) However, I would prefer a solution that works for many data types in general: numpy arrays, dataframes, etc. I will edit the question. – Enigma Machine May 09 '20 at 11:32
  • That indentation is not correct, so the output is not properly aligned. I think you actually want something more like `dimension_indent = len('array(')`. – wjandrea May 24 '23 at 18:49

Multi-line solution

I figured out a way to accomplish what I wanted, after reading through the CPython source:

import numpy as np
some_fairly_long_named_arr = np.random.rand(4, 2)
print(f"""{some_fairly_long_named_arr =
}""")

Which produces:

some_fairly_long_named_arr = 
array([[0.23560777, 0.96297907],
       [0.18882751, 0.40712246],
       [0.61351814, 0.1981144 ],
       [0.27115495, 0.72303859]])

I would prefer a solution that fits on a single line, but this seems to be the only way for now. Perhaps another way will be implemented in a later Python version.

Note, however, that the continuation line must not be indented for this method to work, even when the surrounding code is indented:

    # ...some code with indentation...
    print(f"""{some_fairly_long_named_arr =
}""")
    # ...more code with indentation...

Otherwise, the whitespace before the closing brace is copied into the output, and the alignment of the first row is broken again.

I tried using inspect.cleandoc and textwrap.dedent to alleviate this, but could not manage to fix the indentation issue. But perhaps this is the subject of another question.
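One possible workaround, sketched here under the assumption that only the first line of the value picks up the unwanted indentation: split the result at the first newline and strip the leading whitespace from what follows. The helper `strip_continuation_indent` and the stand-in class `Multi` are hypothetical names made up for this illustration. My guess is that textwrap.dedent does not help because the first line of the result has no indentation at all, so the lines share no common leading whitespace.

```python
class Multi:
    # Hypothetical stand-in for an object whose repr spans several lines.
    def __repr__(self):
        return "Multi(1,\n      2)"


def strip_continuation_indent(text):
    # Keep everything up to the first newline, then drop only the
    # whitespace that the source indentation put in front of the repr.
    head, sep, tail = text.partition("\n")
    return head + sep + tail.lstrip(" \t")


def demo():
    value = Multi()
    # The closing brace is indented to match the surrounding code.
    return strip_continuation_indent(f"""{value =
    }""")


formatted = demo()
print(formatted)
```

This keeps the source code indented normally, at the cost of wrapping every such f-string in a function call.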

Single-line solution

I found this after reading this article:

f_str_nl = lambda obj: f"{chr(10) + str(obj)}"  # add \n directly; chr(10) avoids a backslash inside the f-string
# f_str_nl = lambda obj: f"{os.linesep + str(obj)}"  # add \r\n on Windows (requires import os)

print(f"{f_str_nl(some_fairly_long_named_arr) = !s}")

which outputs:

f_str_nl(some_fairly_long_named_arr) = 
[[0.26616956 0.59973262]
 [0.86601261 0.10119292]
 [0.94125617 0.9318651 ]
 [0.10401072 0.66893025]]

The only caveat is that the printed name of the object is wrapped in a call to the custom lambda function, f_str_nl.
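Another single-line option, offered as a sketch rather than a definitive answer, is to post-process the formatted string and put the newline after the first = sign. This assumes the expression inside the braces contains no = of its own (it would break for something like `a==b`). The helper `newline_after_equals` and the stand-in class `Grid` are hypothetical names used only for this example.

```python
class Grid:
    """Hypothetical stand-in for any object with a multi-line repr."""

    def __init__(self, rows):
        self.rows = rows

    def __repr__(self):
        # Continuation rows are indented to line up under the first row.
        body = ",\n      ".join(repr(row) for row in self.rows)
        return f"Grid([{body}])"


def newline_after_equals(text):
    # Replace only the first "=", which is the one emitted by the "="
    # specifier as long as the expression itself contains no "=".
    return text.replace("=", "=\n", 1)


some_fairly_long_named_grid = Grid([[1, 2], [3, 4]])
result = newline_after_equals(f"{some_fairly_long_named_grid=}")
print(result)
```

Here the variable name is written once and appears unchanged in the output; inline, the same idea reads `print(f"{some_fairly_long_named_grid=}".replace("=", "=\n", 1))`.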

I also found that a similar question was already asked here.
