My data set is Churn_Modeling:
I am looking to create a column called c_rating with the following ranges: <500 = "very poor", 500-600 = "poor", 601-660 = "fair", 661-780 = "good", and >=780 = "excellent".
Some example data, with the columns in order:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
1 15634602 Hargrave 619 France Female 42 2 0 1 1 1 101348.88 1
2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0
3 15619304 Onio 502 France Female 42 8 159660.8 3 1 0 113931.57 1
4 15701354 Boni 699 France Female 39 1 0 2 0 0 93826.63 0
5 15737888 Mitchell 850 Spain Female 43 2 125510.82 1 1 1 79084.1 0
6 15574012 Chu 645 Spain Male 44 8 113755.78 2 1 0 149756.71 1
I am also working on other code, so my imports are as follows:
from plotnine import *
from dfply import *
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
a_churn = pd.read_csv("Churn_Modeling.csv")
How can I do the equivalent of R's case_when in Python to create this column?
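To make the goal concrete, here is a minimal sketch of the kind of result I am after, using np.select as one possible stand-in for case_when. It assumes the rating is derived from the CreditScore column and that a score of exactly 780 counts as "excellent" (the ranges above overlap at 780, so that choice is a guess). The small data frame is a stand-in for a_churn built from the sample rows, not the real file.

```python
import numpy as np
import pandas as pd

# Stand-in for a_churn, using the CreditScore values from the sample rows.
a_churn = pd.DataFrame({"CreditScore": [619, 608, 502, 699, 850, 645]})

score = a_churn["CreditScore"]

# Conditions are evaluated in order and the first match wins,
# which mirrors how case_when behaves in R.
conditions = [
    score < 500,
    score <= 600,
    score <= 660,
    score < 780,  # assumption: 780 itself falls into "excellent"
]
labels = ["very poor", "poor", "fair", "good"]

# Anything that matches none of the conditions gets the default label.
a_churn["c_rating"] = np.select(conditions, labels, default="excellent")
print(a_churn)
```

Because the conditions are checked top to bottom, each one only needs an upper bound. pd.cut would be another option for pure numeric binning, but np.select also accepts arbitrary boolean conditions, which is closer to case_when.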