I'm creating a class for a custom data structure, and I'd like it to mimic the behavior of a pandas DataFrame. I want to display my object the way pandas does: by simply printing the object itself. When you create a DataFrame, it prints with nice formatting:

>>> import pandas as pd
>>> df = pd.DataFrame([1, 2, 3], index=[1, 2, 3])
>>> print(df)
   0
1  1
2  2
3  3

I assume this implicitly accesses some attribute or calls some method whose result gets printed. I know that a __str__ method exists and returns a string when called; however, pd.DataFrame does not seem to implement it, or at least I wasn't able to find it in the source code. How does it do it, then?

If I were to use __str__, I could of course join my data structure's rows into one long newline-separated string and return it. I could also return an empty string (since __str__ cannot return None) and print my data structure some other way inside __str__.

Both of those solutions do work, but they both seem kind of weird/counterintuitive. Perhaps there is a better way to do this, similar to how libraries such as Pandas handle it? Is there a known standard or best practice in this regard?

EDIT:

Possibly related question - how does the same process happen inside Jupyter Notebooks when df is evaluated in a cell? It doesn't simply print it, but instead displays it in a nicely formatted way - how would I define such behavior for my object?

DMSBrian
  • Building a string and returning it is the perfect way to do it, so what's the problem? – azro Mar 20 '23 at 20:57
  • `pd.DataFrame` defines [`__repr__`](https://github.com/pandas-dev/pandas/blob/532ed6f50ad04829a62a75939586aa9048573898/pandas/core/frame.py#L1094), not `__str__`. – 0x5453 Mar 20 '23 at 20:57
  • @azro No problem per se, I just noticed that this is not how pandas handles it and was wondering if there's an established best practice here, since I've never dealt with this sort of thing before. – DMSBrian Mar 20 '23 at 20:59
  • @0x5453, yeah it does, but I thought `__repr__` is only used when the repr() function is called on an object? Perhaps I was wrong, though. – DMSBrian Mar 20 '23 at 21:00
  • `__repr__` is also the fallback used by `object.__str__` if no other `__str__` is defined. I suspect the rationale for `DataFrame` is that data frames are primarily meant to be *queried*, not displayed, so `__repr__` is used to provide a visual representation for debugging. – chepner Mar 20 '23 at 21:01
  • @chepner Ooh, I had no idea it works like that. Thanks a lot guys! – DMSBrian Mar 20 '23 at 21:03
  • @chepner Do you want to post it as an answer so that I can accept it? – DMSBrian Mar 20 '23 at 21:04

1 Answer

Answer compiled from comments by @chepner, @0x5453 and @azro:

Building a string and returning it is the intended way to do this.

pd.DataFrame handles it by defining the __repr__ method instead. This works because object.__str__ falls back to __repr__ when a class does not define __str__ itself, so print(df) ends up showing the result of __repr__.
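
The fallback can be seen with a minimal sketch. The `Table` class and its two-space column layout are hypothetical, made up for illustration only; this is not how pandas formats its output.

```python
class Table:
    """Hypothetical row container, used only to illustrate the fallback."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __repr__(self):
        # One line per row, cells separated by two spaces (an arbitrary choice).
        return "\n".join(
            "  ".join(str(cell) for cell in row) for row in self.rows
        )


t = Table([[1, 2], [3, 4]])

# Table defines no __str__, so str(t) and print(t) both use __repr__.
print(t)
print(str(t) == repr(t))
```

If the class later gains its own `__str__`, `print` will use that instead, while `repr()` and the interactive prompt keep using `__repr__`.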

Answer to EDIT:

For display in IPython/Jupyter notebooks, a different method is used: _repr_html_. If an object defines it, the notebook renders the HTML string it returns instead of the plain-text repr.
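
A rough sketch of how a custom class might opt into this, assuming a hypothetical `HtmlTable` class with a bare-bones `<table>` layout (pandas produces far richer markup):

```python
from html import escape


class HtmlTable:
    """Hypothetical row container with both a text and an HTML representation."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __repr__(self):
        # Plain-text form, used by print() and by non-notebook consoles.
        return "\n".join(
            "  ".join(str(cell) for cell in row) for row in self.rows
        )

    def _repr_html_(self):
        # IPython/Jupyter looks for this method and renders the returned HTML.
        # Cell values are escaped so they cannot inject markup.
        body = "".join(
            "<tr>"
            + "".join(f"<td>{escape(str(cell))}</td>" for cell in row)
            + "</tr>"
            for row in self.rows
        )
        return f"<table>{body}</table>"


ht = HtmlTable([[1, 2], [3, 4]])
print(ht._repr_html_())
```

In a notebook, leaving `ht` as the last expression in a cell should show the rendered table; outside a notebook the method is simply never called and `__repr__` is used.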

DMSBrian
  • If any of you want to post an answer, I'll accept it. Otherwise I'll just accept this in 2 days. Maybe this is common knowledge, but somehow I hadn't encountered it before, so I guess it might help someone. – DMSBrian Mar 20 '23 at 21:24
  • See also: [What is the difference between \_\_str__ and \_\_repr__?](https://stackoverflow.com/q/1436703/3282436) and the official docs for [`__repr__`](https://docs.python.org/3/reference/datamodel.html#object.__repr__) and [`__str__`](https://docs.python.org/3/reference/datamodel.html#object.__str__) – 0x5453 Mar 20 '23 at 21:34
  • I'll tag this with the documentation for integrating with IPython/Python-based Jupyter: [representations of objects in IPython](https://ipython.readthedocs.io/en/stable/config/integrating.html#custom-methods). And a customization example outline with code is [here](https://discourse.jupyter.org/t/how-to-write-a-class-which-naturally-produce-a-latex-output-in-jupyter-notebook/17437/3?u=fomightez). – Wayne Mar 21 '23 at 17:30