I am new to loops and am struggling to format my output. I am trying to modify a time series of species abundance at various time points (i), by various magnitudes (j), for multiple species (k). For each combination I want a column that shows abundance over time.

I have written a loop that works when I manually input values of i, j and k (i.e. I get a single column with the correct values), but I can't figure out how to index the output matrix correctly.

A dummy data set looks like this (where x and y are different species and sample number is time):

dat<- data.frame(x = sample(1:100, 100, replace = TRUE), y = sample(1:100, 100, replace = TRUE))

For my loop I also create some other objects:

length <- nrow(dat)
change <- as.matrix(seq(0.0,0.99,0.2))
change.length <- nrow(change)

I also create a final matrix to populate (which has the correct dimensions). The -40 is there because I'm not modifying the first 20 or the last 20 abundances in the time series:

final_matrix <- array(0,c(length,(((length-40)*change.length)*2))) # 2 is the number of species in this example

The loop looks like this:

for(i in 1:(length-40)) {
  for(j in 1:change.length){
    for(k in 1:2){
      timestep1 <- dat[0:(19+(i)),k]   # selecting rows that will not be modified based on min + i for a given species k
      timestep2 <- dat[(20+i):(length),k] # selecting rows that will be modified for any given species k (columns)

      result1 <- timestep1*1 # not making any changes to the abundance data
      result2 <- timestep2*change[j,1] # multiplying abundance by change (j)

      resultloop<-c(result1,result2) # binding the two matrices into a single column with 100 rows

      final_matrix[,i*j*k] <- resultloop
    }
  }
}

Indexed as it is, the final matrix ends up with the incorrect number of columns and odd results, such as columns of all zeros.

How can I index this matrix so that each column (with 100 rows representing abundances over time) is indexed for each value of i,j and k?

EDIT: An example (simulated) dummy data set to illustrate what I would like the output to look like:

dat<- data.frame(Spx_I1_j1 = sample(1:100, 100, replace = TRUE),
                 Spx_I2_j1 = sample(1:100, 100, replace = TRUE),
                 Spx_I3_j1 = sample(1:100, 100, replace = TRUE),
                 # etc...
                 Spx_I1_j2 = sample(1:100, 100, replace = TRUE),
                 Spx_I2_j2 = sample(1:100, 100, replace = TRUE),
                 # etc...
                 Spy_I1_j1 = sample(1:100, 100, replace = TRUE),
                 Spy_I2_j1 = sample(1:100, 100, replace = TRUE))

Each column holds 100 abundance values, with one column for each modified time step sequence (i), each magnitude of change (j), and each species (k).

Many thanks in advance for any help or advice you may have on this matter.

marc_s
Fabrice
  • Please show a sample of the expected output – Jean Sep 06 '17 at 08:57
  • Thank you for your comment @waterling. I hope the dummy data set provided adequately illustrates what I would like the final output to look like. Please let me know if you need any further information. – Fabrice Sep 06 '17 at 09:14
  • You are putting your result into column `i*j*k`. Add `print(i*j*k)` right after `for(k in 1:2){` and see which column the result is supposed to be written to. Is this really the correct column? You can also insert `browser()` inside the last `for` loop to explore what is going on. – Roman Luštrik Sep 06 '17 at 09:40
  • Thanks @Roman. In my question I have put the outputs in columns `i*j*k` but I think this is completely wrong. My question is, does someone have a solution that would allow me to store all the iterations in columns by indexing for all my loops? – Fabrice Sep 06 '17 at 09:57
  • I'm not really sure I understand your problem, but may I suggest something like `array(dim = c(2, length, length - 40, change.length), dimnames = list( species = c("x", "y"), sample_number = NULL, time_point = NULL, magnitude = NULL ))` (a multi-dimensional array) as a more appropriate data structure? – Aurèle Sep 06 '17 at 10:16
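
To see why `i*j*k` cannot serve as a column index, it helps to count how many distinct values the product takes. The sketch below uses the sizes from the question (60 start points, 5 magnitudes, 2 species) and contrasts the product with a mixed-radix index. The helper name `col_index` is hypothetical, and the ordering (`i` varying fastest, then `j`, then `k`) is just one assumed choice among several.

```r
n_i <- 60  # start points: length - 40 in the question
n_j <- 5   # magnitudes: change.length
n_k <- 2   # species

# Every (i, j, k) combination visited by the three nested loops
grid <- expand.grid(i = seq_len(n_i), j = seq_len(n_j), k = seq_len(n_k))

# The product sends many triples to the same column:
# (2, 1, 1), (1, 2, 1) and (1, 1, 2) all give column 2.
prod_index <- with(grid, i * j * k)
length(unique(prod_index))     # fewer than n_i * n_j * n_k distinct columns
anyDuplicated(prod_index) > 0  # TRUE: later iterations overwrite earlier ones

# Hypothetical helper: exactly one column per triple, i varying fastest
col_index <- function(i, j, k) (k - 1) * n_i * n_j + (j - 1) * n_i + i

lin_index <- with(grid, col_index(i, j, k))
all(sort(lin_index) == seq_len(n_i * n_j * n_k))  # each column used once
```

Under these assumptions, the assignment inside the question's loop would read `final_matrix[, col_index(i, j, k)] <- resultloop`.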

1 Answer

I'm not sure exactly what you're asking, so this is my interpretation. By the way, please look at "How to make a great R reproducible example?" when formulating your questions in future.

#Make some sample data, 10 observations, ignore the first 2 and last 2

ss<-function(x,n){set.seed(x); sample(1:100,n,replace=TRUE)}
dat<-as.matrix(data.frame("Dog"=ss(1,10),"Cat"=ss(2,10))) #Note, make this a matrix!
ind_ignore<-c(1:2,9:10) #these will be ignored, like your first and last x abundances
change <- seq(0.0,0.99,0.2) #5 values, so the output should have dimensions 10 obs by (2 species * 5 changes)
result <- matrix(NA,ncol=ncol(dat)*length(change), nrow=nrow(dat))
colnames(result)<-paste0("V",1:ncol(result))
n_counter<-ncol(dat)-1
counter<-1

> change
[1] 0.0 0.2 0.4 0.6 0.8
> dat
      Dog Cat
 [1,]  27  19
 [2,]  38  71
 [3,]  58  58
 [4,]  91  17
 [5,]  21  95
 [6,]  90  95
 [7,]  95  13
 [8,]  67  84
 [9,]  63  47
[10,]   7  55

Then I'm just going to assign the dat values to the result matrix, and change the colnames.

for(magnitude in change){
  #Indexes of columns in result that should be modified
  ind_col<-counter:(counter+n_counter)

  #Change names
  colnames(result)[ind_col]<-paste0(colnames(dat),"_",magnitude)

  #Apply the change and assign it to the result
  result[-ind_ignore,ind_col]<-dat[-ind_ignore,]*magnitude
  result[ind_ignore,ind_col]<-dat[ind_ignore,]

  counter<-counter+ncol(dat)
}

Output

> result
      Dog_0 Cat_0 Dog_0.2 Cat_0.2 Dog_0.4 Cat_0.4 Dog_0.6 Cat_0.6 Dog_0.8 Cat_0.8
 [1,]    27    19    27.0    19.0    27.0    19.0    27.0    19.0    27.0    19.0
 [2,]    38    71    38.0    71.0    38.0    71.0    38.0    71.0    38.0    71.0
 [3,]     0     0    11.6    11.6    23.2    23.2    34.8    34.8    46.4    46.4
 [4,]     0     0    18.2     3.4    36.4     6.8    54.6    10.2    72.8    13.6
 [5,]     0     0     4.2    19.0     8.4    38.0    12.6    57.0    16.8    76.0
 [6,]     0     0    18.0    19.0    36.0    38.0    54.0    57.0    72.0    76.0
 [7,]     0     0    19.0     2.6    38.0     5.2    57.0     7.8    76.0    10.4
 [8,]     0     0    13.4    16.8    26.8    33.6    40.2    50.4    53.6    67.2
 [9,]    63    47    63.0    47.0    63.0    47.0    63.0    47.0    63.0    47.0
[10,]     7    55     7.0    55.0     7.0    55.0     7.0    55.0     7.0    55.0
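
The `counter` bookkeeping in the loop amounts to giving each magnitude its own contiguous block of `ncol(dat)` columns. As a minimal sketch of the same idea, the block for the m-th magnitude can be computed directly from `m`. The names `m`, `n_sp` and `result2` are introduced here for illustration only, and the data are copied from the printed `dat` above rather than regenerated.

```r
# Values copied from the printed `dat`, so the sketch does not depend on
# the random number generator (sample() output can differ between R versions).
dat <- cbind(Dog = c(27, 38, 58, 91, 21, 90, 95, 67, 63, 7),
             Cat = c(19, 71, 58, 17, 95, 95, 13, 84, 47, 55))
ind_ignore <- c(1:2, 9:10)     # rows left unmodified
change <- seq(0.0, 0.99, 0.2)  # 0, 0.2, 0.4, 0.6, 0.8
n_sp <- ncol(dat)              # number of species

result2 <- matrix(NA_real_, nrow = nrow(dat), ncol = n_sp * length(change))
colnames(result2) <- paste0(rep(colnames(dat), times = length(change)), "_",
                            rep(change, each = n_sp))

for (m in seq_along(change)) {
  # Columns of the m-th block: (m - 1) * n_sp + 1, ..., m * n_sp
  ind_col <- (m - 1) * n_sp + seq_len(n_sp)
  result2[, ind_col] <- dat                                        # unmodified copy
  result2[-ind_ignore, ind_col] <- dat[-ind_ignore, ] * change[m]  # scale middle rows
}

result2[3, "Dog_0.2"]  # row 3 of Dog scaled by 0.2
```

Deriving the block from `m` removes the need to update a running counter, at the cost of looping over positions in `change` rather than over its values.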

Edits

Again, I don't know if I understood the comments correctly, but here goes nothing:

ss<-function(x,n){set.seed(x); sample(1:100,n,replace=TRUE)}
dat<-as.matrix(data.frame("Dog"=ss(1,10),"Cat"=ss(2,10))) #Note, make this a matrix!
ignore<-2 #number of points ignored at each end, like your first and last x abundances
change <- seq(0.0,0.99,0.2) #5 values, so the output should have dimensions 10 obs by (2 species * 5 changes * (10-4 ignored points) timepoints)
n_timepoints<-nrow(dat) - ignore*2
result <- matrix(NA,ncol=ncol(dat)*length(change)*n_timepoints, nrow=nrow(dat))
colnames(result)<-paste0("V",1:ncol(result))
n_counter<-ncol(dat)*n_timepoints-1
counter<-1

for(magnitude in change){
  #Indexes of columns in result that should be modified
  ind_col<-counter:(counter+n_counter)

  #Change names
  colnames(result)[ind_col]<-paste0(colnames(dat),"_",magnitude,"_",rep(1:n_timepoints, each=ncol(dat)))

  for (timepoint in 1:n_timepoints){
    #Apply the change and assign it to the result
    ind_col_timepoint <- counter:(counter+ncol(dat)-1)
    ind_ignore_timepoint<-c(1:(ignore+timepoint-1), (nrow(dat)-ignore+1):nrow(dat))
    result[-ind_ignore_timepoint,ind_col_timepoint]<-dat[-ind_ignore_timepoint,]*magnitude
    result[ind_ignore_timepoint,ind_col_timepoint]<-dat[ind_ignore_timepoint,]
    counter<-counter+ncol(dat)
  }

}
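
For comparison, here is a minimal sketch of the multi-dimensional array suggested in the comments on the question, written with the question's own `i`, `j`, `k` loops. It is an alternative layout, not a restatement of the code above. The object names `out` and `wide`, and the dimension order (row, start point, magnitude, species), are assumptions made for illustration.

```r
# Values copied from the printed `dat` earlier in this answer.
dat <- cbind(Dog = c(27, 38, 58, 91, 21, 90, 95, 67, 63, 7),
             Cat = c(19, 71, 58, 17, 95, 95, 13, 84, 47, 55))
ignore <- 2                    # points left untouched at each end
change <- seq(0.0, 0.99, 0.2)
n_i <- nrow(dat) - 2 * ignore  # number of start points for the change

out <- array(NA_real_,
             dim = c(nrow(dat), n_i, length(change), ncol(dat)),
             dimnames = list(NULL,
                             paste0("i", seq_len(n_i)),
                             paste0("j", seq_along(change)),
                             colnames(dat)))

for (i in seq_len(n_i)) {
  for (j in seq_along(change)) {
    for (k in seq_len(ncol(dat))) {
      series <- dat[, k]
      modified <- (ignore + i):(nrow(dat) - ignore)  # rows scaled for start point i
      series[modified] <- series[modified] * change[j]
      out[, i, j, k] <- series  # one index per loop variable, so no collisions
    }
  }
}

out[, "i2", "j2", "Dog"]               # Dog, change of 0.2 applied to rows 4:8
wide <- matrix(out, nrow = nrow(dat))  # flatten to 10 x 60 if a matrix is needed
```

Because each loop variable has its own dimension, there is no column arithmetic to get wrong. Flattening with `matrix()` drops the dimension names, so column labels would have to be rebuilt separately if they are needed.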
Jean
  • Thank you for this! I apologize for my clunky question (first time posting). Your solution is almost perfect, BUT one of the main things I need from the code is for the change to occur at different times (noted as `i` in my example), e.g. Dog_0 should have 6 columns with the change occurring in rows 3:8, then 4:8, then 5:8, etc. Is there a way to integrate this into the code you've provided? – Fabrice Sep 06 '17 at 10:35
  • I don't understand it. I thought `i` referred to each row – Jean Sep 06 '17 at 10:39
  • You are right, `i` refers to each row, but I want to create the changes at different times. So your column Dog_0 is correct for `i` = 1 (e.g. change occurs at times 3:8), but I also want another column for `i` = 2 (e.g. change occurs at times 4:8), `i` = 3 (change at 5:8), etc. – Fabrice Sep 06 '17 at 10:49
  • Amazing - this worked. I don't fully understand the code just yet but I'll get there. Thank you very much for your time and patience! – Fabrice Sep 06 '17 at 11:14