9

How do I get the length of a data frame and drop rows by index? I want to add a new column that counts the rows in the data frame, and then use that index to drop rows.

for i in range(len(df)):
    if (df['col1'][i] == df['col2'][i]) and (df['col4'][i] == df['col3'][i]):
        pass
    elif (df['col1'][i] == df['col3'][i]) and (df['col4'][i] == df['col2'][i]): 
        df['col1'][i] = df['col2'][i]
        df['col4'][i] = df['col3'][i]
    else:
        df = df.drop(i)
Hrushi

1 Answer

11

Polars doesn't allow much mutation and favors pure data handling, meaning that you create a new DataFrame instead of modifying an existing one.

So it helps to think of the data you want to keep instead of the row you want to remove.

Below I have written an example that keeps all data except for the 2nd row. Note that slicing is the faster of the two approaches and copies no data.

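import polars as pl

# with_row_count adds a u32 row-number column
# (renamed to with_row_index in newer Polars releases)
df = pl.DataFrame({
    "a": [1, 2, 3],
    "b": [True, False, None]
}).with_row_count("row_nr")

print(df)

# filter on condition: keep every row except row_nr == 1
df_a = df.filter(pl.col("row_nr") != 1)

# stack two slices: the rows before and after the one to drop
df_b = df[:1].vstack(df[2:])

# or via explicit slice syntax; omitting the length keeps every row from offset 2 onward
df_b = df.slice(0, 1).vstack(df.slice(2))

# frame_equal is named equals in newer Polars releases
assert df_a.frame_equal(df_b)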

print(df_a)

Outputs:

shape: (3, 3)
┌────────┬─────┬───────┐
│ row_nr ┆ a   ┆ b     │
│ ---    ┆ --- ┆ ---   │
│ u32    ┆ i64 ┆ bool  │
╞════════╪═════╪═══════╡
│ 0      ┆ 1   ┆ true  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 1      ┆ 2   ┆ false │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2      ┆ 3   ┆ null  │
└────────┴─────┴───────┘
shape: (2, 3)
┌────────┬─────┬──────┐
│ row_nr ┆ a   ┆ b    │
│ ---    ┆ --- ┆ ---  │
│ u32    ┆ i64 ┆ bool │
╞════════╪═════╪══════╡
│ 0      ┆ 1   ┆ true │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2      ┆ 3   ┆ null │
└────────┴─────┴──────┘

ritchie46