5

This is somewhat of a broad topic, but I will try to pare it to some specific questions.

I was thinking about a certain ~meta~ property in Python where the console representation of many basic datatypes are equivalent to the code used to construct those objects:

l = [1,2,3]
d = {'a':1,'b':2,'c':3}
s = {1,2,3}
t = (1,2,3)
g = "123"

###

>>> l
[1, 2, 3]
>>> d
{'a': 1, 'b': 2, 'c': 3}
>>> s
{1, 2, 3}
>>> t
(1, 2, 3)
>>> g
'123'

So for any of these objects, I could copy the console output into the code to create those structures or assign them to variables.

This doesn't apply to some objects, like functions:

def foo():
    pass

f = foo
L = [1,2,3, foo]

###

>>> f
<function foo at 0x00000235950347B8>
>>> L
[1, 2, 3, <function foo at 0x00000235950347B8>]

While the list l above had this property, the list L here does not; but this seems to be only b/c L contains an element which doesn't hold this property. So it seems to me that generally, list has this property in some way.

This applies to some objects in non-standard libraries as well:

import numpy as np
a = np.array([1,2,3])

import pandas as pd
dr = pd.date_range('01-01-2020','01-02-2020', freq='3H')

###

>>> a
array([1, 2, 3])
>>> dr
DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 03:00:00',
               '2020-01-01 06:00:00', '2020-01-01 09:00:00',
               '2020-01-01 12:00:00', '2020-01-01 15:00:00',
               '2020-01-01 18:00:00', '2020-01-01 21:00:00',
               '2020-01-02 00:00:00'],
              dtype='datetime64[ns]', freq='3H')

For the numpy array, the console output matches the code used, provided you have array in the namespace. For the pandas.date_range, it's a little bit different because the console output can construct the same object produced created by dr = pd.date_range('01-01-2020','01-02-2020', freq='3H'), but with different code.

Interesting Example

A DataFrame doesn't hold this property, however using the to_dict() method converts it into a structure which does hold this property:

import pandas as pd
df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6]})

###

>>> df
   A  B
0  1  4
1  2  5
2  3  6
>>> df.to_dict()
{'A': {0: 1, 1: 2, 2: 3}, 'B': {0: 4, 1: 5, 2: 6}}
>>> pd.DataFrame.from_dict({'A': {0: 1, 1: 2, 2: 3}, 'B': {0: 4, 1: 5, 2: 6}})
   A  B
0  1  4
1  2  5
2  3  6

An example scenario where this is useful is.....posting on SO! B/c you can convert your DataFrame into a data structure where the text representation can be used to construct that data structure. So if you share the to_dict() version of your DataFrame with someone, they are getting Python-syntaxed code which can be used to recreate the structure. I have found this to be advantageous over pd.read_clipboard() in some situations.


My questions based on the above:

Mainly:

  • Is there a name for this "property" (given it is a real intentional "property" of objects in Python?)

Additionally (these are less concretely answerable, I recognize, and can remove if off-topic):

  • Is it something unique to Python or does it hold true in other languages?
  • Are there other basic Python structures for which this property holds or doesn't hold?

I apologize if this is common knowledge to people, or if I am making a mountain out of a molehill here!

Tom
  • 8,310
  • 2
  • 16
  • 36
  • 3
    What the console representation of an object is, depends on the way its `__repr__()` method is written. So I think most of us would at least understand if you talked about this "property" as the `repr` of the object. The method has to return a string but the string's contents are up to the author, so it's impossible to say in general whether the `repr` of an object is the same as the code needed to create it. In some cases (such as functions) the code might be too long to be useful. In others (such as recursive structures) there might be no reasonable linear representation. – BoarGules Jul 05 '20 at 19:00
  • @BoarGules any reason you posted this as a comment, rather than as an answer (which I think it is) ;-) ? – Asmus Jul 06 '20 at 12:12
  • @BoarGules Thanks for the explanation! That certainly makes the "property" I am alluding to less vague (the `repr` of the object is Python code for the object). I got worked up by this DataFrames example, but as you indicate it seems like one-to-one representations just make the most sense for simple structures (and what else would they be?). If you do answer, I will accept. – Tom Jul 06 '20 at 14:19

2 Answers2

1

What the console representation of an object is, depends on the way its __repr__() method is written. So I think most of us would at least understand if you talked about this "property" as the repr of the object. The method has to return a string but the string's contents are up to the author, so it's impossible to say in general whether the repr of an object is the same as the code needed to create it. In some cases (such as functions) the code might be too long to be useful. In others (such as recursive structures) there might be no reasonable linear representation.

Reposted as an answer instead of a comment in response to suggestions by participants in the comment thread.

BoarGules
  • 16,440
  • 2
  • 27
  • 44
0

I happened to come across some information related to this (a year and a half later). An interesting passage in this article asserts that:

The default objective of __repr__ is to have a string representation of the object from which object can be formed again using Python’s eval such that below holds true: object = eval(repr(object))

object == eval(repr(object)) (!!!)

In Python code this neatly states the concept I was grasping at. An object can be constructed from its console representation.

From Googling that Python phrase, I came across more Stack resources, including the canonical post on the difference between __str__ and __repr__. But particularly relevant here was this answer which highlights how this concept is discussed in the Python documentation for __repr__. Particularly, there is the recommendation that:

If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment).

Furthermore, the __repr__ of an object is meant to be clear and unambiguous such that if you inspect the object, you know exactly what is. And having object == eval(repr(object)) is one way of achieving this.

So in regard to my initial questions:

  • Is there a name for this "property"? Not really, but object == eval(repr(object)) is a succinct way of stating it. And the Python docs have their own way of stating it.
  • Is it something unique to Python or does it hold true in other languages? I don't really know if it is unique, but it certainly is an encouraged part of Python! But its main intention is for developing/unambiguity, rather than sharing/reproducing code.
Tom
  • 8,310
  • 2
  • 16
  • 36