Creating a new column in a DataFrame based on multiple condition using values from another column in Python

Question

I would like to create a new column base on conditions from the Spread column. The logic is:

if Spread is <= 4 then New_column = 4

if Spread is >4 and <=8 then New_column = 8

if Spread is >8 and <=12 Then new New_column = 12

if Spread is >12 Then new New_column = 16 (16 is the cut off.)

I have tried using .where but I have not any luck. I would like the final output to look like below

Spread	New_Column
1	4
2	4
5	8
6	8
9	12
11	12
13	16

Does this answer your question? [Creating a new column based on if-elif-else condition](https://stackoverflow.com/questions/21702342/creating-a-new-column-based-on-if-elif-else-condition) — Savir, Mar 27 '23 at 23:26

Andrej Kesely · Answer 1 · 2023-03-28T00:24:45.760

1

Perhaps you can use pd.cut:

df['New_Column_2'] = (pd.cut(df['Spread'], [0, 4, 8, 12, np.inf]).cat.codes + 1) * 4
print(df)

Prints:

   Spread  New_Column  New_Column_2
0       1           4             4
1       2           4             4
2       5           8             8
3       6           8             8
4       9          12            12
5      11          12            12
6      13          16            16
7      24          16            16

edited Mar 28 '23 at 00:24

answered Mar 27 '23 at 23:50

Andrej Kesely

168,389
15
48
91

This partially worked. Only problem is I have in the spread column records > 16 and for those it looks like it's marked as 0 in the New_column. Would like for anything >16 To be marked as 16 in the new_column – Jay Mar 28 '23 at 00:09
1

Thanks this works. I also got this to work as well: df['New_Column_2'] = np.where(TW_df.Spread >16, 16, (pd.cut(TW_df['Spread_Ticks'], [0, 4, 8, 12, 16]).cat.codes + 1) * 4) – Jay Mar 28 '23 at 00:26
1

I just nested your original answer with np.where – Jay Mar 28 '23 at 00:28

score 1 · Answer 2 · answered Mar 28 '23 at 00:17

1

You can use apply and create a function:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Spread': [1,2,5,6,9,11,13]
})

def my_func(x): # Your function with your conditions
    if x <= 4:
        return 4
    elif x > 4 and x <= 8:
        return 8
    elif x > 8 and x <= 12:
        return 12
    elif x > 12:
        return 16
    else:
        return np.nan # In case none of your criteria is met

df['New_Column'] = df['Spread'].apply(my_func)

df is now:

   Spread  New_Column
0       1           4
1       2           4
2       5           8
3       6           8
4       9          12
5      11          12
6      13          16

answered Mar 28 '23 at 00:17

Marcelo Paco

2,732
4
9
26

Thanks, This worked also. I attempted this but i got a syntax error. With your example I see what I was doing wrong. – Jay Mar 28 '23 at 00:31
Yeah no problem. Just wanted to clarify that you were able to resolve the syntax error? If not, can you elaborate more on what the error was? – Marcelo Paco Mar 28 '23 at 00:42
1

Yes, I was able to resolve. When I tried this method prior to your answer I was getting an error due to a typo. I'm good to go now. Thanks – Jay Mar 28 '23 at 01:13

score 1 · Accepted Answer · answered Mar 28 '23 at 00:17

1

The equal spacing of your conditions allows for:

df = pd.DataFrame({'Spread':[1,2,5,6,9,11,13,20]})
df['NewCol'] = df['Spread'].apply(lambda x: min( 4*(1+(x//4)), 16) )

answered Mar 28 '23 at 00:17

vogelito

28
4

Creating a new column in a DataFrame based on multiple condition using values from another column in Python

3 Answers3