Questions tagged [pandas-apply]

Applies Python functions to rows or columns of a pandas dataframe, which may or may not result in aggregation.

The pandas apply method, available on both the DataFrame and Series classes, is roughly equivalent to map in functional languages such as Haskell or Scala. It calls the function passed as its argument on each element, row, or column, depending on the object it is called on and on parameters such as axis.
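As a minimal sketch of the behavior described above, the following uses a small made-up frame (the column names `a` and `b` and their values are illustrative assumptions, not taken from any question on this page). It shows element-wise, column-wise, and row-wise calls, and how the function's return type decides whether the result is aggregated.

```python
import pandas as pd

# Hypothetical example data, chosen only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Series.apply: the function is called once per element.
squared = df["a"].apply(lambda x: x ** 2)

# DataFrame.apply with the default axis=0: the function receives each
# column as a Series. Returning a scalar aggregates to one value per column.
col_sums = df.apply(lambda col: col.sum())

# axis=1: the function receives each row as a Series.
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Returning a Series of the same length does not aggregate;
# the result keeps the shape of the original frame.
doubled = df.apply(lambda col: col * 2)
```

For simple arithmetic like this, vectorized expressions such as `df["a"] + df["b"]` are usually faster than apply; apply is mainly useful when the logic cannot be expressed with built-in vectorized operations.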

170 questions
58
votes
2 answers

Returning multiple values from pandas apply on a DataFrame

I'm using a Pandas DataFrame to do a row-wise t-test as per this example: import numpy as np import pandas as pd df = pd.DataFrame(np.log2(np.random.randn(1000, 4)), columns=["a", "b", "c", "d"]).dropna() Now, suppose I have "a" and "b" as one group,…
Einar
  • 4,727
  • 7
  • 49
  • 64
12
votes
2 answers

pandas df.apply unexpectedly changes dataframe inplace

From my understanding, pandas.DataFrame.apply does not apply changes in place, and we should use its return object to persist any changes. However, I've found the following inconsistent behavior: Let's apply a dummy function for the sake of ensuring…
Pedro Fialho
  • 123
  • 1
  • 6
6
votes
4 answers

python pandas groupby/apply: what exactly is passed to the apply function?

Python newbie here. I'm trying to understand how the pandas groupby and apply methods work. I found this simple example, which I paste below: import pandas as pd ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'kings',…
linuxfever
  • 3,763
  • 2
  • 19
  • 43
6
votes
3 answers

Apply function to create string with multiple columns as argument

I have a dataframe like this:

     name  size     type  av_size_type
0    John    23   Qapra'            22
1     Dan    21  nuk'neH            12
2  Monica    12  kahless            15

I want to create a new column with a…
aabujamra
  • 4,494
  • 13
  • 51
  • 101
6
votes
2 answers

pandas groupby apply on multiple columns to generate a new column

I'd like to generate a new column in a pandas dataframe using groupby-apply. For example, I have a dataframe: df = pd.DataFrame({'A':[1,2,3,4],'B':['A','B','A','B'],'C':[0,0,1,1]}) and try to generate a new column 'D' by groupby-apply. This works: df…
Jongmmm
  • 148
  • 1
  • 6
5
votes
1 answer

Why does pandas.GroupBy.apply() ignore the sort flag in some situations?

When and why is the sort flag of a DataFrame grouping ignored in pd.GroupBy.apply()? The problem is best understood with an example. In the following 4 equivalent solutions to a dummy problem, approaches 1 and 4 observe the sort flag, while…
normanius
  • 8,629
  • 7
  • 53
  • 83
5
votes
1 answer

Improve performances (vectorize?) pandas.groupby.aggregate

I'm trying to improve the performance of a pandas.groupby.aggregate operation using a custom aggregating function. I noticed that (correct me if I'm wrong) pandas calls the aggregating function on each block in sequence (I suspect it to be a…
Luca
  • 1,610
  • 1
  • 19
  • 30
5
votes
1 answer

Using a dataframe to format the style of another dataframe

I have one pandas dataframe whose formatting I want to style based on the values of another dataframe of the same shape/size. I'm trying to use applymap. Here's an example: t1= pd.DataFrame({'x':['A','B','C'], 'y':['C','B','D']}) t2=…
Den Thap
  • 153
  • 5
4
votes
3 answers

Pandas: custom WMAPE function aggregation function to multiple columns without for-loop?

Objective: group pandas dataframe using a custom WMAPE (Weighted Mean Absolute Percent Error) function on multiple forecast columns and one actual data column, without for-loop. I know a for-loop & merges of output dataframes will do the trick. I…
Inder Jalli
  • 119
  • 2
  • 10
4
votes
2 answers

Pandas apply & map to every element of every column

How do I apply a custom function to every element of every column if its value is not null? Let's say I have a data frame of 10 columns, out of which I want to apply a lower() function to every element of just 4 columns if pd.notnull(x), else just…
ds_user
  • 2,139
  • 4
  • 36
  • 71
3
votes
1 answer

Using pandas groupby and apply for cumulative integration

I have a pandas DataFrame with columns idx, grp, X, Y, and I want to get a new column with the cumulative integral of a function of Y with respect to X. However, I want to apply this cumulative integration to each subgroup of the DataFrame as…
konstanze
  • 511
  • 3
  • 12
3
votes
1 answer

Koalas GroupBy > Apply > Lambda > Series

I am trying to port some code from Pandas to Koalas to take advantage of Spark's distributed processing. I am taking a dataframe and grouping it on A and B and then applying a series of functions to populate the columns of the new dataframe. Here is…
3
votes
2 answers

Create columns with .apply() Pandas with strings

I have a DataFrame df. One of the columns is named Adress and contains a string. I have created a function processing(string) which takes a string as its argument and returns a part of this string. I succeeded in applying the function to df and creating a new…
Basile
  • 575
  • 1
  • 6
  • 13
3
votes
5 answers

Is there a Pandas solution—e.g.: with numba, or Cython—to `transform`/`apply` with an index, a MultiIndexed DataFrame?

Is there a Pandas solution—e.g.: with numba, or Cython—to transform/apply with an index? I know I could use iterrows, itertuples, iteritems or items. But what I want to do should be trivial to vectorize… I've built a simple proxy to my actual…
A T
  • 13,008
  • 21
  • 97
  • 158
3
votes
1 answer

pandas apply when cells contain lists

I have a DataFrame where one column contains lists as cell contents, something like following: import pandas as pd df = pd.DataFrame({ 'col_lists': [[1, 2, 3], [5]], 'col_normal': [8, 9] }) >>> df col_lists col_normal 0 [1, 2, 3] …
pieca
  • 2,463
  • 1
  • 16
  • 34