I have the following data frame:

[Image of the data frame, with columns including s_18, s_56, s_98, s_123 and s_148]

For each row of this data frame, I want the sum of log(df['s_i']/df['s_18'])*log(i/18) over i in h = [18, 56, 98, 123, 148].

I tried the following:

a = []
h = [18, 56, 98, 123, 148]

for i in df.index:
  for zi in h:
    a.append(log(df.loc[i,'s_'+str(zi)]/df.loc[i,'s_18'])*log(zi/18))
    b = sum(a)

However, it did not work. Any idea how to solve this problem? Any help would be greatly appreciated!

nalfahel
  • Please provide the expected [MRE - Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example). Show where the intermediate results deviate from the ones you expect. We should be able to paste a single block of your code into a file, run it, and reproduce your problem. This also lets us test any suggestions in your context. – Prune Mar 27 '21 at 04:01
  • We expect a working example of the problem, including appropriate code to trace the internal operation. [Include your minimal data frame](https://stackoverflow.com/questions/52413246/how-to-provide-a-reproducible-copy-of-your-dataframe-with-to-clipboard) as part of the example. – Prune Mar 27 '21 at 04:01
  • [Why not upload images of code/errors when asking a question?](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-errors-when-asking-a-question) – anky Mar 27 '21 at 04:12

1 Answer

One thing to note: unless you have your own log function, use np.log (natural logarithm; np.log2 and np.log10 give other bases) or math.log.
zr should be changed to 18 to match the formula in your question:

log(df['s_i']/df['s_18'])*log(i/18) 

Here I am using a data frame with a similar layout; please check this code:

import numpy as np
import pandas as pd

h = [18, 56, 98, 123, 148]

df = pd.DataFrame(
    data= np.arange(1, 51).reshape(10, 5),
    columns= ['s_'+str(i) for i in h]
)

# Note: `b += sum(a)` belongs in the outer for-loop, not the inner one.
# Inside the inner loop, `a` grows at each iteration, so its earlier items
# would be added to `b` again every time.

b = 0
for i in df.index:
    a = []
    for zi in h:
        a.append(np.log(df.loc[i,'s_'+str(zi)]/df.loc[i,'s_18'])*np.log(zi/18))
    b += sum(a)

print(b)
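
If the goal is one sum per row rather than a single grand total (an assumption about what is wanted), the same loop can collect each row's sum separately. This is only a sketch on the stand-in data frame used above; `row_sums` is a hypothetical name introduced for illustration.

```python
import numpy as np
import pandas as pd

h = [18, 56, 98, 123, 148]

# Stand-in data with the same layout as above; replace with the real data frame.
df = pd.DataFrame(
    data=np.arange(1, 51).reshape(10, 5),
    columns=['s_' + str(i) for i in h]
)

row_sums = []  # hypothetical name: one entry per row of df
for i in df.index:
    # Build the 5-element list for this row only, so nothing carries over.
    a = [
        np.log(df.loc[i, 's_' + str(zi)] / df.loc[i, 's_18']) * np.log(zi / 18)
        for zi in h
    ]
    row_sums.append(sum(a))

print(row_sums)
```

Adding up `row_sums` afterwards gives the same grand total as `b` in the loop above.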

Method-2

equation = "np.log(df.loc[i,'s_'+str(j)]/df.loc[i,'s_18'])*np.log(j/18)"
b = 0
for i in df.index:
    b += sum(list(map(lambda j: eval(equation), h)))

print(b)
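
As a possible alternative that avoids both `eval` and the explicit row loop, the same expression can be computed for all rows at once. This is a sketch under the assumption that every `s_` column is strictly positive (so the logarithms are defined); `cols`, `weights`, `ratios` and `row_sums` are hypothetical names introduced here.

```python
import numpy as np
import pandas as pd

h = [18, 56, 98, 123, 148]
cols = ['s_' + str(i) for i in h]

# Stand-in data with the same layout as above; replace with the real data frame.
df = pd.DataFrame(data=np.arange(1, 51).reshape(10, 5), columns=cols)

# log(i/18) for each i in h: one weight per column.
weights = np.log(np.array(h) / 18)

# log(s_i / s_18) for every row and column; div(..., axis=0) divides row-wise.
ratios = np.log(df[cols].div(df['s_18'], axis=0))

# Multiply each column by its weight, then sum across the columns of each row.
row_sums = ratios.mul(weights, axis=1).sum(axis=1)

print(row_sums)        # one value per row
print(row_sums.sum())  # grand total over all rows
```

`row_sums` is a pandas Series indexed like `df`, so `row_sums.sum()` corresponds to the accumulated `b`, while the individual entries are the per-row sums.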
Davinder Singh
  • thank you for your solution! It is helpful but doesn't get me what I want... For each entry in my data frame I would like to get an array of 5 elements (each of them being log(df['s_i']/df['s_18'])*log(i/18) for each of five columns) and then sum these elements. – nalfahel Mar 27 '21 at 04:21
  • It is still not working. I get an array of 5*10 for a. What I would like is 10 versions of a 5-element array – nalfahel Mar 27 '21 at 04:27
  • Still not working.... When I use this code:```a = [] h = [18, 56, 98, 123, 148] for i in df.index: a = [log(df.loc[i,'s_18']/df.loc[i,'s_18'])*log(18/18), log(df.loc[i,'s_56']/df.loc[i,'s_18'])*log(56/18), log(df.loc[i,'s_98']/df.loc[i,'s_18'])*log(98/18), log(df.loc[i,'s_123']/df.loc[i,'s_18'])*log(123/18), log(df.loc[i,'s_148']/df.loc[i,'s_18'])*log(148/18)] ``` – nalfahel Mar 27 '21 at 04:34
  • I get: [0.0, 0.547444977693881, 1.2672118256367033, 1.5849306944206611, 1.8022843414442449] [0.0, 0.48861544353222236, 1.0944838110776416, 1.3968318275153528, 1.650351644866753] ... – nalfahel Mar 27 '21 at 04:37
  • and this is what I want. Then I can just sum up the elements of each array before getting the next array. I am just looking for a nicer way to get this – nalfahel Mar 27 '21 at 04:37
  • Thank you for the google doc! Yes, that code works. My issue is that I would like to find a nicer (and shorter) way to define my a array. Sorry if I wasn't clear. – nalfahel Mar 27 '21 at 04:47
  • Awesome! Method 2 works perfect! Thank you very much! (b=sum() not b+=sum()) – nalfahel Mar 27 '21 at 05:26
  • @nalfahel please confirm one thing: if we use `b = sum()`, the row sums are not accumulated. Each iteration computes the sum of the current row and assigns it to `b`, replacing the previous row's sum, so after the loop `b` holds only the last row's sum. – Davinder Singh Mar 27 '21 at 05:42
  • Yes! this is what I want! – nalfahel Mar 27 '21 at 14:43