10

I have a Polars DataFrame with a list column. I want to control how many elements of a pl.List column are printed.

I've tried pl.Config.set_fmt_str_lengths(), but it only restricts the number of elements when set to a small value; it doesn't show more elements when set to a large value.

I'm working in JupyterLab, but I think it's a general issue.

import polars as pl

N = 5
df = (
    pl.DataFrame(
        {
            'id': range(N)
        }
    )
    .with_row_count("value")
    .groupby_rolling(
        "id",period=f"{N}i"
    )
    .agg(
        pl.col("value")
    )
)
df
shape: (5, 2)
┌─────┬───────────────┐
│ id  ┆ value         │
│ --- ┆ ---           │
│ i64 ┆ list[u32]     │
╞═════╪═══════════════╡
│ 0   ┆ [0]           │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 1   ┆ [0, 1]        │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2   ┆ [0, 1, 2]     │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3   ┆ [0, 1, ... 3] │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 4   ┆ [0, 1, ... 4] │
└─────┴───────────────┘
braaannigan

4 Answers

8

pl.Config.set_tbl_rows(100)

And more generally, I would try looking at dir(pl.Config)

JohnRos
6

You can use the following Config method from the Polars documentation to set the maximum displayed string length, e.g. to 100.

import polars as pl

pl.Config.set_fmt_str_lengths(100)
Daniel
  • 1
    This doesn't have any impact for me - can you confirm which setup it worked on for you, e.g. Jupyter, IPython, VSCode? – braaannigan Nov 14 '22 at 16:42
0

I don't think you can currently do this directly; the documentation for Config does not list any such method, and for me (in VSCode at least) set_fmt_str_lengths does not affect lists.

However, if your goal is just to see what you're working on and you don't mind a slightly hacky workaround, you can add a column next to it that holds a string representation of each list. Because that column is a string column, pl.Config.set_fmt_str_lengths(<some large n>) will then display as much of it as you like. For example:

import polars as pl
pl.Config.set_fmt_str_lengths(100)

N = 5
df = (
    pl.DataFrame(
        {
            'id': range(N)
        }
    )
    .with_row_count("value")
    .groupby_rolling(
        "id",period=f"{N}i"
    )
    .agg(
        pl.col("value")
    ).with_column(
        # Build a "[a, b, c]" string from each list so it is displayed as a str column
        pl.col("value").apply(lambda x: "[" + ", ".join(str(i) for i in x) + "]").alias("string_repr")
    )
)
df
shape: (5, 3)
┌─────┬───────────────┬─────────────────┐
│ id  ┆ value         ┆ string_repr     │
│ --- ┆ ---           ┆ ---             │
│ i64 ┆ list[u32]     ┆ str             │
╞═════╪═══════════════╪═════════════════╡
│ 0   ┆ [0]           ┆ [0]             │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 1   ┆ [0, 1]        ┆ [0, 1]          │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2   ┆ [0, 1, 2]     ┆ [0, 1, 2]       │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3   ┆ [0, 1, ... 3] ┆ [0, 1, 2, 3]    │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 4   ┆ [0, 1, ... 4] ┆ [0, 1, 2, 3, 4] │
└─────┴───────────────┴─────────────────┘
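
The string building done by the lambda above can also be pulled out into a small standalone function, which is easier to read and to check on its own. The following is only a sketch in plain Python: list_repr and its max_items parameter are hypothetical names introduced here for illustration and are not part of Polars. With max_items left as None it produces the full representation used for string_repr; with max_items set, it truncates in the middle, similar in spirit to the "[0, 1, ... 4]" display of the list column.

```python
def list_repr(values, max_items=None):
    """Render an iterable as "[a, b, c]".

    Hypothetical helper, not a Polars API. If max_items is given and the
    input is longer, keep the first max_items - 1 items and the last item,
    with "..." in between.
    """
    if max_items is not None and max_items < 2:
        raise ValueError("max_items must be at least 2")

    items = [str(v) for v in values]

    if max_items is not None and len(items) > max_items:
        # Truncated form: head items, an ellipsis marker, then the last item.
        head = items[: max_items - 1]
        return "[" + ", ".join(head) + ", ... " + items[-1] + "]"

    # Full form: every item is shown.
    return "[" + ", ".join(items) + "]"


full = list_repr(range(5))
short = list_repr(range(5), max_items=3)
print(full)
print(short)
```

Assuming the function receives each list as an iterable of values, it could be passed to .apply(...) in place of the lambda.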
  • 1
    Thanks - I'm digging around in the source code to understand if there's a fundamental solution, I'll update when I know more – braaannigan Dec 16 '22 at 09:51
0

Whilst it seems you can't do it directly, you can use to_numpy as a quick way to see data, and then you have the numpy rendering machinery.

>>> af
shape: (7, 1)
┌─────────────┐
│ all         │
│ ---         │
│ list[i32]   │
╞═════════════╡
│ []          │
│ [1, 2, … 3] │
│ []          │
│ [1, 2, … 3] │
│ []          │
│ []          │
│ []          │
└─────────────┘
>>> af["all"].to_numpy()
array([array([], dtype=int32),
       array([ 1,  2,  2,  3, 17,  9,  1,  3], dtype=int32),
       array([], dtype=int32),
       array([ 1,  2,  3,  3,  2,  1, 17, 12,  9,  1,  2,  3], dtype=int32),
       array([], dtype=int32), array([], dtype=int32),
       array([], dtype=int32)], dtype=object)

Paul Rudin