I want to import a file and create two additional columns immediately after importing it.
The file I am importing has the following structure:
index | probability_model |
---|---|
1 | 0.34 |
2 | 0.03 |
3 | 0.14 |
4 | 0.23 |
The following code works, but I'm trying to avoid it:
import pandas as pd
df = pd.read_csv(filename)
df['subgroups'] = df['probability_model'].transform(lambda x: pd.qcut(x, 100, duplicates='drop', labels=range(1, 101)))
df['groups'] = df['subgroups'].apply(lambda x: 'high' if x > 100 else 'medium' if 100 >= x > 50 else 'low')
What I would like to do is something like the following. The first assign works well but the second throws an error.
df = pd.read_csv(filename)\
.assign(subgroups = lambda x: pd.qcut(x.probability_model, 100, duplicates='drop',labels=range(1,101)))\
.assign(groups = subgroups.apply(lambda x: 'high' if x>100 else 'medium' if 100>=x>50 else 'low'))
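
For reference, the second `assign` most likely fails because `subgroups` is a bare Python name that is not defined anywhere; it only exists as a column of the intermediate DataFrame. A minimal sketch of one way around this is to pass a callable to the second `assign` as well, so the column is looked up on the DataFrame being built. The in-memory CSV and the helper name `label_group` are assumptions made only to keep the example self-contained; the thresholds are copied unchanged from the code above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: 200 rows with distinct probabilities.
csv_text = "index,probability_model\n" + "\n".join(
    f"{i + 1},{i / 200}" for i in range(200)
)


def label_group(x):
    # Same thresholds as in the question.
    return 'high' if x > 100 else 'medium' if 100 >= x > 50 else 'low'


df = (
    pd.read_csv(io.StringIO(csv_text))
    # The lambda receives the DataFrame as it exists at this point in the chain.
    .assign(subgroups=lambda d: pd.qcut(
        d.probability_model, 100, duplicates='drop', labels=range(1, 101)))
    # A second lambda can see the 'subgroups' column created by the previous step.
    # astype(int) turns the categorical labels into plain integers before comparing.
    .assign(groups=lambda d: d.subgroups.astype(int).apply(label_group))
)

print(df.head())
```

Two things to keep in mind with this sketch. Because the labels run from 1 to 100, the `x > 100` branch can never be true, so no row ends up labelled `'high'` unless the thresholds are changed. Also, if `duplicates='drop'` actually removes bin edges on the real data, the 100 labels no longer match the number of bins and `pd.qcut` raises a `ValueError`.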