I would like to perform the following operation in pandas:



library(tidyverse)

df <- tibble(mtcars)

df %>% 
  select(ends_with('t')) %>% 
  head(3)

# Across all columns whose names end with 't', add 100 (plus 1 if the column name contains 'at')

df %>% 
  mutate(across(ends_with('t'), ~ . + 100 + str_detect(cur_column(), 'at'))) %>% 
  select(ends_with('t') )%>% 
  head(3) %>% view()

Is there a nice equivalent to this? Or at least a really nice one-liner using the apply function in pandas?

Ronak Shah
Petr
  • In your R case, you should use `transmute` instead of piping `mutate` into `select`. In pandas, transmuting is relatively simple; things get cumbersome when you want to mutate across a subset of columns. – Olsgaard Aug 11 '22 at 08:00

3 Answers

You could try one of the following (not sure if this qualifies for you as "nice"):

# approach one
cols = [x for x in mtcars.columns if x.endswith("t")]

def f(x, cols):
    for col in cols:
        if "at" in col: 
            x[col] += 101
        else:
            x[col] += 100
    return x
mtcars.apply(f, args=(cols,), axis=1)[cols].head(3)

# approach two
cols = [col for col in mtcars.columns if col.endswith("t")]
cols_w_at = [col for col in cols if "at" in col]
# add 100 to every column ending in "t" (this modifies mtcars in place)
mtcars[cols] = mtcars[cols].apply(lambda x: x + 100)
# add 1 more to the columns whose name contains "at"
mtcars[cols_w_at] = mtcars[cols_w_at].apply(lambda x: x + 1)
mtcars[cols].head(3)
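
If `apply` is not a hard requirement, the same result can probably be had without it by adding a Series of per-column offsets, since pandas aligns a Series with the DataFrame's column labels in arithmetic. This is only a sketch: the three-row `mtcars` frame below is a hand-typed stand-in for the real dataset, and `offsets` and `result` are names made up for illustration.

```python
import pandas as pd

# Stand-in for the real dataset (assumption): first three rows, three columns.
mtcars = pd.DataFrame(
    {
        "mpg": [21.0, 21.0, 22.8],
        "drat": [3.90, 3.90, 3.85],
        "wt": [2.620, 2.875, 2.320],
    },
    index=["Mazda RX4", "Mazda RX4 Wag", "Datsun 710"],
)

cols = [col for col in mtcars.columns if col.endswith("t")]

# One offset per column: 101 if the name contains "at", otherwise 100.
offsets = pd.Series({col: 100 + ("at" in col) for col in cols})

# DataFrame + Series aligns the Series index with the column labels,
# so each column gets its own offset. mtcars itself is left unchanged.
result = mtcars[cols] + offsets
print(result.head(3))
```

Unlike approach two, this does not overwrite `mtcars`; assign the result back with `mtcars[cols] = result` if in-place modification is what you want.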

You could unpack a dictionary comprehension into the keyword arguments of assign (assign is similar to dplyr::mutate).

import pandas as pd
from statsmodels.datasets import get_rdataset

mtcars = get_rdataset('mtcars').data

(mtcars
 .assign(**{col: mtcars[col].add(100).add('at' in col) 
            for col in mtcars.filter(regex='t$')})
.filter(regex='t$')
.head(3)
)

Output:

                 drat       wt
Mazda RX4      104.90  102.620
Mazda RX4 Wag  104.90  102.875
Datsun 710     104.85  102.320
Levi Baguley
  • Is it possible to refer to the dataset without explicitly using its name inside the parentheses of `.assign()`? For a single variable this can be done with a `lambda` function. What about batch assignment? My purpose is to change the data types of many (~50) columns. – GegznaV Mar 08 '23 at 16:42
  • @GegznaV It's possible, but it's not pretty. You could put the assign bit into a pipe like `.pipe(lambda df: df.assign(...))`. Rather than doing that, I would highly recommend using the `.astype()` method instead, like [this](https://stackoverflow.com/questions/55833729/how-to-change-datatype-of-multiple-columns-in-pandas). – Levi Baguley Mar 08 '23 at 20:18
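
To make the two suggestions in the comments above concrete, here is a rough sketch of both: wrapping `assign` in `.pipe` so the frame is never referred to by name inside the chain, and passing a dict to `.astype()` to change many column dtypes at once. The small `mtcars` frame is a hand-typed stand-in, and the target dtype `float32` and the names `piped` and `typed` are arbitrary choices for illustration.

```python
import pandas as pd

# Stand-in for the real dataset (assumption): first three rows, three columns.
mtcars = pd.DataFrame(
    {
        "mpg": [21.0, 21.0, 22.8],
        "drat": [3.90, 3.90, 3.85],
        "wt": [2.620, 2.875, 2.320],
    },
    index=["Mazda RX4", "Mazda RX4 Wag", "Datsun 710"],
)

# Batch assignment inside a chain: the lambda receives the intermediate
# frame as `d`, so the variable name is not needed inside assign.
piped = (
    mtcars
    .filter(regex="t$")
    .pipe(lambda d: d.assign(**{col: d[col] + 100 + ("at" in col)
                                for col in d.columns}))
)

# Changing dtypes of many columns: astype accepts a {column: dtype} dict.
# "float32" is just an example target type.
typed = mtcars.pipe(
    lambda d: d.astype({col: "float32" for col in d.columns if col.endswith("t")})
)

print(piped)
print(typed.dtypes)
```

For a pure dtype change, the `astype` form is usually the shorter of the two, which matches the recommendation in the comment.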

One option is a combination of filter and transform:

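import numpy as np

# assumes mtcars is already loaded as a DataFrame;
# inside transform, df is one column (a Series) and df.name is its label
(mtcars
.filter(regex = "t$")
.transform(lambda df: np.where('at' in df.name, df + 101, df+100))
.head(3)
)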
     drat       wt
0  104.90  102.620
1  104.90  102.875
2  104.85  102.320
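
A variant of the same filter and transform idea that returns a Series from the lambda instead of a NumPy array, so the row labels should carry through unchanged. Treat it as a sketch: the three-row `mtcars` frame is a hand-typed stand-in for the real data, and `out` is just an illustrative name.

```python
import pandas as pd

# Stand-in for the real dataset (assumption): first three rows, three columns.
mtcars = pd.DataFrame(
    {
        "mpg": [21.0, 21.0, 22.8],
        "drat": [3.90, 3.90, 3.85],
        "wt": [2.620, 2.875, 2.320],
    },
    index=["Mazda RX4", "Mazda RX4 Wag", "Datsun 710"],
)

out = (
    mtcars
    .filter(regex="t$")
    # s is one column at a time; s.name is the column label.
    # The bool ("at" in s.name) adds 1 when True and 0 when False.
    .transform(lambda s: s + 100 + ("at" in s.name))
    .head(3)
)
print(out)
```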
sammywemmy