I have two arrays "begin" and "end_a" which contain some integer indices, except that some of the entries in "end_a" are NA.

And panelDataset is a matrix which contains the data. I want to take the means of the rows of panelDataset corresponding to non-NA entries of begin and end_a.

I have this working with a plain loop, but when I tried to vectorize it as follows

switch_mu=ifelse(!is.na(end_a),mean(panelDataset[begin: end_a,4]),NA)

It gives an error: Error in begin:end_a : NA/NaN argument.

When I check the entries of end_a separately for NAs using is.na(end_a), it does show the correct entries of the array as NA. So, that is not an issue.

I know I am missing something trivial. Any thoughts?

Bhargav Rao
Blade Runner
  • There's a couple things going on here. Can you share small, illustrative data to make this reproducible? [Data sharing via simulation or `dput()` is strongly preferred](http://stackoverflow.com/q/5963269/903061). – Gregor Thomas Apr 15 '16 at 15:48
  • You say `begin` and `end_a` are arrays but you're using them as scalars, e.g. `begin:end_a`. – Ernest A Apr 15 '16 at 15:57
  • @ErnestA Gosh! That might be the error. I have arrays `begin=c(1,2,3,4)` and `end_a=c(10,15,NA,16)` and I want to take the mean of rows 1 to 10, 2 to 15 and 4 to 16. It turns out that `cbind` also does not allow this parallel indexing. – Blade Runner Apr 15 '16 at 16:26

1 Answer

Try this:

means <- apply(na.omit(cbind(begin, end_a)), 1,
      function(x) mean(panelDataset[x[1]:x[2], 4]))
replace(end_a, !is.na(end_a), means)
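
For anyone who wants to try this without the original data, here is a minimal sketch using the example vectors from the question's comments. The contents of `panelDataset` (a 20 x 4 matrix filled with 1..80) and the result name `switch_mu` as used here are assumptions for illustration only.

```r
# Toy data (assumed): 20 rows, 4 columns, so column 4 holds 61..80.
panelDataset <- matrix(1:80, nrow = 20, ncol = 4)
begin <- c(1, 2, 3, 4)
end_a <- c(10, 15, NA, 16)

# Mean of column 4 over each complete (begin, end) pair;
# na.omit() drops the row whose end index is NA.
means <- apply(na.omit(cbind(begin, end_a)), 1,
               function(x) mean(panelDataset[x[1]:x[2], 4]))

# Write the means back at the non-NA positions; the NA position stays NA,
# so the result has the same length as the index vectors.
switch_mu <- replace(end_a, !is.na(end_a), means)
print(switch_mu)
```

With this toy matrix the result should have four entries, with `NA` in the third position.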
Ernest A
  • Thanks! That works. Except that, I want to keep the NA indices too in the final result. For instance, in the example I gave above I want it to return an array of size 4 with the 3rd entry as NA. Your solution omits NAs so outputs a vector of size 3. If you can fix that in your answer, I will accept your answer. – Blade Runner Apr 15 '16 at 17:15
  • `apply(cbind(begin, end_a), 1, function(x) if (is.na(x[2])) NA else mean(panelDataset[x[1]:x[2], 4]))` – Jesse Anderson Apr 15 '16 at 17:32
  • @BladeRunner I changed the answer to produce a vector of the same length as the indexing arrays. – Ernest A Apr 15 '16 at 17:34
  • @ErnestA I was wondering if there is a vectorized way of doing in-place replacement. For example, in the above example, instead of assigning the output to `means`, suppose we want to assign it to `panelDataset[x[1]:x[2], 4]`. The obvious loop works, but I can't seem to find a vectorized way of doing it. (Assume that instead of `mean` we just multiply by -1, so that the dimensions stay the same.) Any thoughts? Any help is greatly appreciated. – Blade Runner Apr 16 '16 at 01:43
  • @BladeRunner Well, `a[b] <- c` is a vectorized operation, as long as `b` is a vector or an array. In this case `b` would need to be either a logical matrix the same size as `panelDataset` (inefficient for large matrices) or a 2-column array of indices. I think you'd benefit from reading the section on Indexing in the R manual, because indexing in R is quite different from Python, although they look similar. – Ernest A Apr 16 '16 at 08:42
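
The 2-column index idea from the last comment could look roughly like the sketch below. It is one possible reading, not necessarily what the commenter had in mind: the toy `panelDataset`, the helper names `ok`, `rows` and `idx`, and the use of `unique()` are all assumptions made for illustration.

```r
# Toy data (assumed): 20 rows, 4 columns, so column 4 holds 61..80.
panelDataset <- matrix(1:80, nrow = 20, ncol = 4)
begin <- c(1, 2, 3, 4)
end_a <- c(10, 15, NA, 16)

# Keep only the pairs whose end index is not NA.
ok <- !is.na(end_a)

# Expand every (begin, end) pair into its row numbers and drop duplicates,
# so that overlapping ranges touch each row only once.
rows <- unique(unlist(Map(seq, begin[ok], end_a[ok])))

# Two-column index matrix: one (row, column) pair per cell to modify.
idx <- cbind(rows, 4)

# Single vectorized in-place assignment: flip the sign of the selected cells.
panelDataset[idx] <- -panelDataset[idx]
```

Note the difference from a loop when ranges overlap: a loop that negates each range in turn would flip some rows more than once, whereas this sketch flips every selected row exactly once.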