2

I have the daily temperatures of 4 countries as below:

Day GB US AU CA
1 33
2 32
3 23
4 34
5 23 34
6 45 34
7 34 23
8 34 33
9 34 45 44 12
10 23 23 23 23
11 34 23 45
12 12 23 44
13 34 66 53
14 34 34 33
15 43 34 22
16 45 56 34
17 34 32
18 22 12
19 12
20 12

I am trying to create a function that will return the nearest value to the left for any given day plus an offset. In other words, if you were to draw a hypothetical horizontal line from the left to the right for each row (where offset=0) then then the value returned should be the one that would be encountered first. i.e. an offset=1 will just return the 2nd intersection, and offset=2 will return the 3rd and so on.

def nearest(offset=0):
    #return df

so that for example:

nearest(0) will output:

Day Temp
1 33
2 32
3 23
4 34
5 23
6 45
7 34
8 34
9 34
10 23
11 34
12 12
13 34
14 34
15 43
16 45
17 34
18 22
19 34
20 12

and nearest(1):

Day Temp
1
2
3
4
5 34
6 34
7 23
8 33
9 45
10 23
11 23
12 23
13 66
14 34
15 34
16 56
17 32
18 12
19
20

Reality is that I have about 10 countries which are all staggered like the first table above and offset value can be as high as 9!

Any suggestions appreciated.

RebeccaKennedy
  • 121
  • 1
  • 10

1 Answers1

2

I'm assuming the empty cells are NaNs:

def nearest(df, offset=0):
    x = df.loc[:, "GB":].apply(lambda x: sorted(x, key=pd.isna)[offset], axis=1)
    return x


print(nearest(df, 1))

Prints:

0      NaN
1      NaN
2      NaN
3      NaN
4     34.0
5     34.0
6     23.0
7     33.0
8     45.0
9     23.0
10    23.0
11    23.0
12    66.0
13    34.0
14    34.0
15    56.0
16    32.0
17    12.0
18     NaN
19     NaN
dtype: float64
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91