I have to make a 4*4 Kohonen map for a project.

However, I get the error

Error in win_index[1, ] : subscript out of bounds

In addition: There were 16 warnings (use warnings() to see them)

After testing my code, I suspect that the lapply call (line 77) is not executing correctly, because the matrix built from its output contains only NA. Since this matrix is used later on, the NA values propagate through the rest of the program and the result is wrong.

#############################################
##Function for distance calculation (RMSDA)##
#############################################

RMSDA<-function(data_phipsi,Kohonen_matrix)
{
  difference<-data_phipsi-Kohonen_matrix 
  for(j in 1:length(difference)){
    if (difference[j]< -180) {difference[j]=difference[j]+360}
    if (difference[j]> +180) {difference[j]=difference[j]-360}
  }
  distance=mean(sqrt(difference^2))  
  return(distance) 
}

##############################
## Program ###
##############################

for(step in 1:iteration)
{
  data_phipsi<-data_phipsi[sample(nrow(data_phipsi)),] # Sample vectors of training (samples of lines of the dataframe)
  print(step) #Visualize where we are in loops
  for(k_row in 1:nrow(data_phipsi))
  {
    #Update learn_rate and radius at each row of each iteration
    learn_rate<-learning(initial_rate,((step-1)*nrow(data_phipsi))+k_row,data_phipsi)
    learn_radius<-learning(initial_rate,((step-1)*nrow(data_phipsi))+k_row,data_phipsi)
    #Find distance between each vectors of angles of Kohonen Map and the training vector
    phipsi_RMSDA<-lapply(random_list, RMSDA, data_phipsi=data_phipsi[k_row,])
  }
}

How can I fix this lapply error?

Thank you.

Edit: these are the only elements needed to reproduce the lapply call.

For the random list we can use:

random_list <- list(
  c(88, 148, 60, 83, -119, -59, -96, 169),
  c(104, 101, 174, -48, 18, 10, -159, 158),
  c(164, -80, 137, -170, -172, 52, -149, 96),
  c(88, 18, -115, 48, -3, -158, -92, -154),
  c(170, -107, -109, -14, -142, -77, -120, 76),
  c(-121, 15, -46, -145, -128, 74, -166, 44),
  c(46, -178, 67, -88, -125, -130, 88, -11),
  c(131, 147, -32, 103, -16, 116, 78, -125),
  c(75, -95, -137, 133, -97, -134, 126, -105),
  c(115, 173, -82, -135, 134, 82, -143, -43),
  c(111, 13, -54, -53, 103, 132, -13, -43),
  c(-143, 89, -91, -137, -63, 14, -166, 83),
  c(-98, 178, 14, -80, -122, -25, 19, 117),
  c(-113, -97, 34, -178, -56, 18, -167, 84),
  c(49, 82, 50, 168, -157, -154, 51, 78),
  c(173, -4, 164, 125, 31, 115, -74, -92)
)

and an example of data_phipsi[1,]:

data_phipsi <- read.table(header = TRUE, text = "
 phi1 psi2 phy2 psi3 phy3 psi4 phy4 psi5  
-24.5 81.9 -155.2 -81.4 127.7 -118 166 -82.1")
data_phipsi
#    phi1 psi2   phy2  psi3  phy3 psi4 phy4  psi5
# 1 -24.5 81.9 -155.2 -81.4 127.7 -118  166 -82.1
irishupk
  • Please try to reduce this question and make it reproducible. Do we really need all 76 lines of code before the single `lapply`, or do you think you could give us samples of `random_list` and `data_phipsi[k_row,]`, and let us work from there? Making a question easily approachable makes it much more likely that you'll get proposed answers. Some useful links for question quality and reproducibility: https://stackoverflow.com/questions/5963269, https://stackoverflow.com/help/mcve, and https://stackoverflow.com/tags/r/info. – r2evans Feb 16 '19 at 19:16
  • I boiled down your example into something that can be easily copy/pasted into an R console (see https://stackoverflow.com/editing-help for more formatting help). – r2evans Feb 16 '19 at 19:39
  • With the sample data, I can get `Map(RMSDA, list(as.numeric(data_phipsi[k_row,,drop=TRUE])), random_list)` to work (in place of your `lapply`). Your error is about `win_index`, when you troubleshoot your code does it look like it is supposed to? – r2evans Feb 16 '19 at 19:41
  • @r2evans No, it doesn't change anything; the error is still there. I know the error is raised on win_index, but win_index depends on a variable that depends on another one, which in turn depends on the output of the lapply, so I thought that was where the problem started. – irishupk Feb 16 '19 at 20:16
  • As I've dug into it, it will be impossible for us to find the true error without having all or at least the culprit data. I suspect the problem is that your `matrix_RMSDA` either (a) has an `NA`, in which case using `min(..., na.rm=TRUE)` might help; or perhaps (b) something else is messing up the construction of the matrix. When you get the error, are there any `NA`s in `matrix_RMSDA`? – r2evans Feb 16 '19 at 20:37
  • @r2evans The matrix_RMSDA is composed only of NA.
        > matrix_RMSDA
             [,1] [,2] [,3] [,4]
        [1,]   NA   NA   NA   NA
        [2,]   NA   NA   NA   NA
        [3,]   NA   NA   NA   NA
        [4,]   NA   NA   NA   NA
    – irishupk Feb 16 '19 at 20:46
  • That's what I thought. When you run the code and get the error, add the output from `dput(random_list)` and `dput(data_phipsi)` to your question, because as you suspect it seems that the `lapply` output is not what you expect. – r2evans Feb 16 '19 at 21:15
  • @r2evans Now the warnings say: 1: In mean.default(difference^2) : argument is not numeric or logical: returning NA. Your two commands return, for dput(random_list): list( c(88, 148, 60, 83, -119, -59, -96, 169), c(104, 101, 174, -48, 18, 10, -159, 158), c(164, -80, 137, -170, -172, 52, -149, 96), c(88, 18, -115, 48, -3, -158, -92, -154), c(170, -107, -109, -14, -142, -77, -120, 76), c(-121, 15, -46, -145, -128, 74, -166, 44) and dput(data_phipsi) returns my entire original file. – irishupk Feb 16 '19 at 21:50
  • `dput(data_phipsi[k_row,])` then, as I cannot help if I don't have your numbers – r2evans Feb 16 '19 at 22:01
  • The command returns structure(list(phi1 = -26.4, psi2 = 68, phy2 = 15.4, psi3 = -123.5, phy3 = 144.6, psi4 = -65.7, phy4 = 143.4, psi5 = -72), row.names = 63607L, class = "data.frame") – irishupk Feb 16 '19 at 22:10
  • What do you expect to happen with `sqrt(difference^2)`? It's a functional no-op unless you really just need `abs(difference)`. Try changing your `RMSDA` function to have the line `distance=mean(abs(unlist(difference)))`. – r2evans Feb 16 '19 at 22:16
  • sqrt(difference^2) was given to us by our teacher. This allows you to find the value that will allow you to modify each vector. When I change the line with your proposal, an infinite loop is created. – irishupk Feb 16 '19 at 22:30
  • To me `sqrt(x^2)` is mathematically equivalent to `abs(x)` for all real `x`. Are you ever looking at *complex* numbers? Perhaps there was something else in what the teacher was saying or meaning, but that is (if not *complex*) a waste of computation. (Even in matrix terms, it's the same, so it isn't a matrix outer product thing.) Think about it and try it: is there any real number where if you square it then take the square root, you do not get the absolute value of the original number? – r2evans Feb 16 '19 at 22:38
  • I should add (after a mini-debate with my spouse) that *algebraically* the square root of a number is described as "plus or minus something". However, *computationally* (though not symbolically) I do not know of a system/language that gives you anything other than the positive root. – r2evans Feb 16 '19 at 22:43
  • Hello, thank you very much for your time and your precious help. My file is very large (more than 600,000 lines), so the computation time is huge. I decided to run on only 30 lines, and the calculation is much faster and gives me the desired results. Thank you very much for your help! – irishupk Feb 16 '19 at 22:58
  • Are you saying that `abs(...)` in place of `sqrt(...^2)` is faster, or the part about `mean(abs(unlist(difference)))`? If so, I'll write this up as an answer and you can "accept" it to close the question out. If not ... let me know what else is not working. – r2evans Feb 16 '19 at 23:10
  • Since I replaced distance=sqrt(mean(difference^2)) with your proposal distance=mean(abs(unlist(difference))), I no longer have any errors and the program shows me the expected results. Thank you. – irishupk Feb 16 '19 at 23:16

1 Answer

The main issue was that a follow-on command (check back in the question edits for the win_index line of code, if interested) was failing because the results from lapply were suspect.

Troubleshooting that led to the problem:

  • when the error hit, the values of both random_list and data_phipsi[k_row,] were reasonable and not suspect;
  • however, the output from lapply was all NA.

Diving into the function RMSDA, difference looked fine up until

distance=mean(sqrt(difference^2))  

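at which point it turned all-NA. The problem is that the data was still a data frame (data_phipsi[k_row,] is a one-row data frame, not a numeric vector), and mean does not accept a data frame: it warns "argument is not numeric or logical: returning NA" and returns NA (just try mean(mtcars) to see). It should also be noted that I believe sqrt(difference^2) is equivalent to abs(difference).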
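A small sketch of that failure mode, using a shortened row built from the question's sample values (the variable names `row`, `bad`, and `good` are only for illustration):

```r
# A one-row data frame standing in for data_phipsi[k_row, ]
# (first four sample values from the question)
row <- data.frame(phi1 = -24.5, psi2 = 81.9, phy2 = -155.2, psi3 = -81.4)

# mean() on a data frame warns and returns NA; the warning is
# suppressed here so the result can be inspected quietly
bad <- suppressWarnings(mean(sqrt(row^2)))

# unlist() flattens the row into a named numeric vector that mean() accepts
good <- mean(abs(unlist(row)))

is.na(bad)  # TRUE
good        # 85.75
```

Arithmetic such as `^` and `sqrt` works element-wise on a data frame, which is why `difference` still looks fine right up to the call to `mean`.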

Replacing that line of RMSDA with

distance=mean(abs(unlist(difference)))

seems to have fixed the problem. The critical part is unlist, which converts the one-row data frame into a plain numeric vector before mean is called.
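Putting it together, a minimal sketch of a repaired RMSDA run against the sample data from the question. The vectorised `ifelse` wrap-around is an illustrative substitute for the original `for` loop, and calling `unlist` at the top of the function rather than inside `mean` is an assumption about where the conversion is most convenient, not the code the asker ended up with:

```r
# Sketch only: RMSDA with the unlist() fix applied up front
RMSDA <- function(data_phipsi, Kohonen_matrix) {
  # Convert the one-row data frame to a numeric vector before subtracting
  difference <- unlist(data_phipsi) - Kohonen_matrix
  # Wrap angular differences back into [-180, 180]
  difference <- ifelse(difference < -180, difference + 360, difference)
  difference <- ifelse(difference >  180, difference - 360, difference)
  mean(abs(difference))
}

# Sample row and the first two map vectors from the question
data_phipsi <- data.frame(phi1 = -24.5, psi2 = 81.9, phy2 = -155.2, psi3 = -81.4,
                          phy3 = 127.7, psi4 = -118, phy4 = 166, psi5 = -82.1)
random_list <- list(
  c(88, 148, 60, 83, -119, -59, -96, 169),
  c(104, 101, 174, -48, 18, 10, -159, 158)
)

# Each list element is matched to Kohonen_matrix, as in the original lapply call
phipsi_RMSDA <- lapply(random_list, RMSDA, data_phipsi = data_phipsi[1, ])
```

With the conversion done first, every element of `phipsi_RMSDA` is a single non-NA number, so a matrix built from it no longer fills with NA.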

r2evans