
I have a list in R within which there is data (Data):

#create data
tmintest <- array(1:100, c(12, 256, 512))

#create the list
Variable <- list(varName = rep("tmin", 12), level = rep(NA, 12))
Data     <- tmintest
xyCoords <- list(x = seq(-40.37,64.37,length.out=420), y = seq(25.37,72.37,length.out=189))
Dates <- list(start = seq(as.Date("2012-01-01"), as.Date("2015-12-31"), by="days"), end=seq(as.Date("2012-01-01"), as.Date("2015-12-31"), by="days"))
All <- list(Variable = Variable,Data=Data, xyCoords=xyCoords,Dates=Dates)

How can I find exactly where the maximum number in All$Data occurs? For example, if it is in the 4th row, 100th column of the first 'slice' or 'grid', I want back: [1,4,100].

I tried which.max(All$Data) but that just returns a single number?

matlabcat

1 Answer


`which` supports `arr.ind=`, which returns the array indices where the condition is TRUE. Unfortunately, there is no such argument for `which.max`, so instead we can compare the values against the maximum of the values.

head( which(All$Data == max(All$Data), arr.ind = TRUE) )
#      dim1 dim2 dim3
# [1,]    4    9    1
# [2,]    8   17    1
# [3,]   12   25    1
# [4,]    4   34    1
# [5,]    8   42    1
# [6,]   12   50    1

I'll be a bit cautious here: strict equality tests of floating-point numbers can be a problem when the precision involves a good number of decimal places. See Why are these numbers not equal?, Is floating point math broken?, and https://en.wikipedia.org/wiki/IEEE_754 for a good discussion of this.
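
A quick throwaway illustration of the pitfall (these values are not from the question's data):

# exact equality fails even though both expressions "should" be 0.3
0.1 + 0.2 == 0.3
# [1] FALSE

# comparing within a tolerance (or with all.equal) behaves as expected
abs((0.1 + 0.2) - 0.3) < 1e-8
# [1] TRUE
all.equal(0.1 + 0.2, 0.3)
# [1] TRUE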

The better test is one of inequality within a tolerance. Here I'll use 1e-5, since we know it is significantly smaller than the spacing between distinct values in the data (integers 1 through 100), but if your real numbers have more precision, you may want a tolerance better suited to your needs.

head( which( (max(All$Data) - All$Data) < 1e-5, arr.ind = TRUE) )
#      dim1 dim2 dim3
# [1,]    4    9    1
# [2,]    8   17    1
# [3,]   12   25    1
# [4,]    4   34    1
# [5,]    8   42    1
# [6,]   12   50    1

Note that the tolerance cuts both ways: drive that 1e-5 too low and you may start to lose values that differ from the maximum only by floating-point noise; drive it too high and you may sweep in values that are genuinely below the maximum. Neither happens here, because the data are integers whose spacing (1) is far larger than the tolerance.
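
To make both failure modes concrete with throwaway values (again, not the question's data):

x <- c(0.1 + 0.2, 0.3)           # both values are 0.3 up to floating-point noise
which((max(x) - x) < 1e-8)       # a sensible tolerance keeps both
# [1] 1 2
which((max(x) - x) < 1e-20)      # too tight: the noisy twin is lost
# [1] 1
which((max(1:100) - 1:100) < 2)  # too loose: 99 is swept in alongside 100
# [1]  99 100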

r2evans
  • This is a guess, but I would assume the reason `which.max` does not include array indices is that there is some ambiguity as to which index the `max` value should get: `which.max` only returns the very first index, but in multiple dimensions that index depends on the orientation of the array (which axis we traverse first, then second, then third, and so on). You could likely get away with `sqrt(.Machine$double.eps)` for your upper boundary, which is commonly used in many other applications. – Oliver Sep 09 '21 at 20:39
  • That's a good estimate for the upper-bound, thanks. – r2evans Sep 09 '21 at 21:28
  • Thanks for that. Assuming `x<-head( which( (max(All$Data) - All$Data) < 1e-5, arr.ind = TRUE) )` , would it be possible to use these indices (x) to index values from another list with the same structure/size? i.e. First identify where a list's Data falls below a certain value, then use those indices to identify data in another array that you want to use as replacements for the original indexed list? – matlabcat Sep 14 '21 at 13:33
  • It's certainly possible (a minimal sketch follows these comments), though I'm always hesitant to suggest using them as raw indices like that: if the row-order is changed or not assured for whatever reason, then the indices will still work but will be wrong. If the rows of both frames are tied together, then either (a) combine them into the same frame from the start, or (b) if they have unique "keys" that can be used to join them, consider a [merge/join](https://stackoverflow.com/q/1299871/3358272) operation. – r2evans Sep 14 '21 at 13:41
  • Thanks. I haven't used `merge/join` before, but I'll have a look. – matlabcat Sep 15 '21 at 19:31
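
Not part of the original answer, but a minimal sketch of the raw-index replacement discussed in the last few comments, assuming a hypothetical array Other with the same dimensions as All$Data:

# positions (as an n-by-3 index matrix) where the data fall below some threshold
idx <- which(All$Data < 5, arr.ind = TRUE)

# a hypothetical replacement array of the same shape
Other <- array(0, dim = dim(All$Data))

# matrix indexing works on both sides, so the flagged cells are replaced in place
All$Data[idx] <- Other[idx]

As the comment above cautions, this only makes sense if the two arrays are guaranteed to be aligned cell-for-cell.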