0

Here is a dataframe example:

ColA ColB ColC
Low 10 Tg
Mid 20 asd
High 30 mnr

if you want to work on it, here is a copy paste:

df = pd.DataFrame({
    'ColA':['Low','Mid','High'],
    'ColB':[10,20,30],
    'ColC':['Tg','asd','mnr']
})

What I want to do is, Create a function that returns a continuous value(ex. 1-2-3) depends on its value distribution on ColB.

Above example, ColA:Low has 10 in ColB , and ColA:Mid got 20.

def getlinear(x):
    return 0 if x=='Low' else 1 if x=='Mid' else 2

this function solves the problem and returns continuous values, but then, I need to create another function to apply for ColC. I want one function for all.

solac34
  • 156
  • 11
  • `df['ColB'].rank(method='dense')`? For strings, you can use a `Categorical` or use `factorize` – mozway Nov 29 '22 at 11:57

0 Answers0