Simple calculation on table. Please help me to make my code more effective

Question

Please help me to make my code more effective. This is my df:

df = pd.DataFrame([['A', 80], ['A', 64], ['A', 55], ['B', 56], ['B', 89], ['B', 73], ['C', 78], ['C', 100], ['C', 150], ['C', 76], ['C', 87]], columns=['Well', 'GR'])

Well    GR
A       80
A       64
A       55
B       56
B       89
B       73
C       78
C       100
C       150
C       76
C       87

Please help me to find the Vshale. Vshale on each well = GR - GR(min) / GR(max) - GR(min). This is my desired result:


Well    GR      Vshale
A       80      1
A       64      0.36
A       55      0
B       56      0
B       89      1
B       73      0.515151515
C       78      0.027027027
C       100     0.324324324
C       150     1
C       76      0
C       87      0.148648649

This code is work for me, but, I should create a new column that consists of GRMax and GRMin and merge it into my previous df. I am looking for a more effective way without adding GRmin and GRmax on my original df. Thank you.

df1 = df.groupby(['Well']).agg({'GR': ['min', 'max']}).reset_index()
df1.columns = list(map(''.join, df1.columns.values))
df2 = pd.merge (df, df1, on = 'Well', how = 'left')
df2['Vshale'] = (df2['GR'] - df2['GRmin'])/(df2['GRmax'] - df2['GRmin'])

Can you make the dataframe reproducible, rather than just posting the values? https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples — william3031, Dec 30 '20 at 04:23
Yes, pd.DataFrame([['A', 80], ['A', 64], ['A', 55], ['B', 56], ['B', 89], ['B', 73], ['C', 78], ['C', 100], ['C', 150], ['C', 76], ['C', 87]], columns=['Well', 'GR']) — elli, Dec 30 '20 at 04:39

score 1 · Accepted Answer · answered Dec 30 '20 at 07:44

1

one string solution with method transform:

df['Vshale'] = df.groupby('Well').transform(lambda x: (x - np.min(x))/(np.max(x) - np.min(x)))

answered Dec 30 '20 at 07:44

Alexander S

71
3

What if I have more than 2 columns? – elli Dec 30 '20 at 09:03
@elli, two columns for what - for groupby or aggregation? Can you provide an example? – Alexander S Dec 30 '20 at 11:21
Let's say I have columns "Well", "GR", and "Depth". I only want to do this calculation on the "GR" column. Because I still don't understand the meaning of lambda x – elli Dec 30 '20 at 16:31

Suhas Mucherla · Answer 2 · 2020-12-30T06:28:49.270

try this:

df=pd.DataFrame([['A', 80], ['A', 64], ['A', 55], ['B', 56], ['B', 89], ['B', 73], ['C', 78], ['C', 100], ['C', 150], ['C', 76], ['C', 87]], columns=['Well', 'GR'])

a=[]
for i in np.unique(df['Well']):
    x=pd.DataFrame(df['GR']-df['GR'][df['Well']==i].min())/(df['GR'][df['Well']==i].max()-df['GR'][df['Well']==i].min() )
    x['Well']=i
    x=x[df['Well']==i]
    a.append(x)
gr=pd.concat(a)
df['Vshale']=gr['GR']

output:

df
Out[53]: 
   Well   GR    Vshale
0     A   80  1.000000
1     A   64  0.360000
2     A   55  0.000000
3     B   56  0.000000
4     B   89  1.000000
5     B   73  0.515152
6     C   78  0.027027
7     C  100  0.324324
8     C  150  1.000000
9     C   76  0.000000
10    C   87  0.148649

Explanation:

Create a empty list a
Use for loop to achieve the logic for all the unique values of Well
Apply the formula and store the results in x ( Note that everytime you iterate over this loop, the formula is applied to all the values of Wells,Hence the next steps)
By indexing,reduce x to have only the values for that particular Well value.
append this x to the initialized list a
Now just concat the dataframes in the list and assign these values to the original dataframe

Thank you very much. Could you explain to me this code? a=[] for i in np.unique(df['Well']): x=pd.DataFrame(df['GR']-df['GR'][df['Well']==i].min())/(df['GR'][df['Well']==i].max()-df['GR'][df['Well']==i].min() ) — elli, Dec 30 '20 at 05:49
@elli I've edited the answer to have the explanation. cheers :) — Suhas Mucherla, Dec 30 '20 at 06:29

Simple calculation on table. Please help me to make my code more effective

2 Answers2