How to convert float to string with specific number of decimal places in Python polars?

Question

I have a polars DataFrame with multiple numeric (float dtype) columns. I want to write some of them to a csv with a certain number of decimal places. The number of decimal places I want is column-specific.

polars offers pl.format:

import polars as pl

df = pl.DataFrame({"a": [1/3, 1/4, 1/7]})

df.select(
    [
        pl.format("as string {}", pl.col("a")),
        ]
    )

shape: (3, 1)
┌───────────────────────────────┐
│ literal                       │
│ ---                           │
│ str                           │
╞═══════════════════════════════╡
│ as string 0.3333333333333333  │
│ as string 0.25                │
│ as string 0.14285714285714285 │
└───────────────────────────────┘

However, if I try to set a directive to specify number of decimal places, it fails:

df.select(
    [
        pl.format("{:.3f}", pl.col("a")),
        ]
)

ValueError: number of placeholders should equal the number of arguments

Is there an option to have "real" f-string functionality without using an apply?

pl.__version__: '0.16.16'
Related: Polars: switching between dtypes within a DataFrame
To set the decimal places of all output columns, pl.DataFrame.write_csv offers the float_precision keyword.

score 2 · Answer 1 · answered Mar 30 '23 at 14:34

What about using round? It rounds the float before it is formatted as a string.

Example:

df.select(
    [
        pl.format("as string {}", pl.col("a").round(3)),
        ]
    )

shape: (3, 1)
┌─────────────────┐
│ literal         │
│ ---             │
│ str             │
╞═════════════════╡
│ as string 0.333 │
│ as string 0.25  │
│ as string 0.143 │
└─────────────────┘

Looks like a good option if you don't care about the trailing zeros being removed. — FObersteiner, Mar 30 '23 at 14:57
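
To make the trailing-zeros point concrete, here is a small sketch in plain Python (not polars) that contrasts "round, then convert to string" with fixed-width formatting. It only illustrates the difference in output; the variable names are made up for this example.

```python
values = [1 / 3, 1 / 4, 1 / 7]

# Round first, then convert: trailing zeros are dropped (0.25 stays "0.25").
rounded = [str(round(v, 3)) for v in values]

# Fixed-point formatting: always exactly three decimals (0.25 becomes "0.250").
fixed = [f"{v:.3f}" for v in values]

print(rounded)  # ['0.333', '0.25', '0.143']
print(fixed)    # ['0.333', '0.250', '0.143']
```

If a downstream consumer reads the number of decimals as precision information, only the second form preserves it.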

alexander-beedie · Answer 2 · 2023-03-31T05:09:09.127

If the number of decimals were the same for all columns, float_precision on the write_csv method would be sufficient:

df = pl.DataFrame( {"colx": [1/3, 1/4, 1/7, 2]} )
print( df.write_csv( None,float_precision=3 ) )

# colx
# 0.333
# 0.250
# 0.143
# 2.000

Otherwise, you can use this (slightly ungainly) utility function to get the desired per-column "float → string" rounding behaviour (including trailing zeros - if you don't need the trailing zeros then stick with @Luca's "round" approach as it'll be more performant), and then export to CSV:

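def round_str( col:str, n:int ):
    # Round, cast to string, pad with n zeros, then trim to exactly n decimals.
    # The optional "-?" lets negative values match the regex as well.
    return (
        pl.col( col ).round( n ).cast( str ) + pl.lit( "0"*n )
    ).str.replace( rf"^(-?\d+\.\d{{{n}}}).*$","$1" ).alias( col )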

Example:

df = pl.DataFrame(
    {
        "colx": [1/3, 1/4, 1/7, 2.00],
        "coly": [1/4, 1/5, 1/6, 1.00],
        "colz": [3/4, 7/8, 9/5, 0.09],
    }
).with_columns(
    round_str( "colx",5 ),
    round_str( "coly",3 ),
    round_str( "colz",1 ),
)
# ┌─────────┬───────┬──────┐
# │ colx    ┆ coly  ┆ colz │
# │ ---     ┆ ---   ┆ ---  │
# │ str     ┆ str   ┆ str  │
# ╞═════════╪═══════╪══════╡
# │ 0.33333 ┆ 0.250 ┆ 0.8  │
# │ 0.25000 ┆ 0.200 ┆ 0.9  │
# │ 0.14286 ┆ 0.167 ┆ 1.8  │
# │ 2.00000 ┆ 1.000 ┆ 0.1  │
# └─────────┴───────┴──────┘

print( df.write_csv(None) )

# colx,coly,colz
# 0.33333,0.250,0.8
# 0.25000,0.200,0.9
# 0.14286,0.167,1.8
# 2.00000,1.000,0.1

(Ideally the float_precision param on write_csv would allow a dict; something for the TODO list ;)

Thanks for looking into this! I've been using `float_precision` so far, but now the requirement arose to have that different by column, essentially to provide precision information for the data. I know, there are better ways to do that... Anyhow, I'd imagine that having a configurable float_precision kwarg could be helpful for others as well. — FObersteiner, Mar 31 '23 at 06:17
