Vectorized ifelse conundrum

Question

I have two arrays "begin" and "end_a" which contain some integer indices, except that some of the entries in "end_a" are NA.

panelDataset is a matrix that contains the data. For each pair of non-NA entries in begin and end_a, I want the mean of column 4 of panelDataset over the rows from begin to end_a.

I have a serial version that works fine, but when I tried to vectorize it as follows

switch_mu=ifelse(!is.na(end_a),mean(panelDataset[begin: end_a,4]),NA)

It gives an error: Error in begin:end_a : NA/NaN argument.

When I check the entries of end_a separately for NAs using is.na(end_a), it does show the correct entries of the array as NA. So, that is not an issue.

I know I am missing something trivial. Any thoughts?

There's a couple things going on here. Can you share small, illustrative data to make this reproducible? [Data sharing via simulation or `dput()` is strongly preferred](http://stackoverflow.com/q/5963269/903061). — Gregor Thomas, Apr 15 '16 at 15:48
You say `begin` and `end_a` are arrays but you're using them as scalars, e.g. `begin:end_a`. — Ernest A, Apr 15 '16 at 15:57
@ErnestA Gosh! That might be the error. I have arrays `begin=c(1,2,3,4)` and `end_a=c(10,15,NA,16)` and I want to take the mean of rows 1 to 10, 2 to 15 and 4 to 16. It turns out that `cbind` also does not allow this parallel indexing. — Blade Runner, Apr 15 '16 at 16:26
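
To make the diagnosis in these comments concrete, here is a minimal sketch, assuming the example vector `end_a = c(10, 15, NA, 16)` from the comment above. It illustrates two base R behaviours: `ifelse()` evaluates its `yes` argument once as a whole expression rather than once per element, and `:` fails when an endpoint is NA (given a longer vector, it only looks at the first element). The names `recycled` and `colon_error` are illustrative and not from the original post.

```r
end_a <- c(10, 15, NA, 16)

# `yes` is evaluated once; mean() returns a single number, which ifelse()
# then recycles across every position where the test is TRUE.
recycled <- ifelse(!is.na(end_a), mean(1:10), NA)
print(recycled)

# `:` with an NA endpoint raises the error quoted in the question.
colon_error <- tryCatch(1:NA_real_, error = function(e) conditionMessage(e))
print(colon_error)
```

So the `!is.na(end_a)` test never gets a chance to protect `begin:end_a`: the range is built before `ifelse()` selects anything.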

Ernest A · Accepted Answer · 2016-04-15T17:33:26.510

Try this:

means <- apply(na.omit(cbind(begin, end_a)), 1,
      function(x) mean(panelDataset[x[1]:x[2], 4]))
replace(end_a, !is.na(end_a), means)
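
For anyone who wants to run this, here is a self-contained sketch using the `begin` and `end_a` values from the question's comments. The simulated `panelDataset` (20 rows, 4 columns, random values) is an assumption for illustration only, as are the names `switch_mu` and `switch_mu2`. The `mapply()` variant is an alternative way to get the same result, not part of the answer above.

```r
set.seed(1)  # illustrative data only; the real panelDataset is not shown in the post
panelDataset <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)
begin <- c(1, 2, 3, 4)
end_a <- c(10, 15, NA, 16)

# Drop the pairs with an NA endpoint, then take one mean per remaining pair.
pairs <- na.omit(cbind(begin, end_a))
means <- apply(pairs, 1, function(x) mean(panelDataset[x[1]:x[2], 4]))

# Put the means back at the non-NA positions; NA positions stay NA.
switch_mu <- replace(end_a, !is.na(end_a), means)
print(switch_mu)

# Alternative sketch: iterate over both vectors in parallel with mapply().
switch_mu2 <- mapply(
  function(b, e) if (is.na(e)) NA_real_ else mean(panelDataset[b:e, 4]),
  begin, end_a
)
print(switch_mu2)
```

Both results have length 4, with NA in the third position. Note that neither version is truly vectorized; each still calls `mean()` once per pair.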

edited Apr 15 '16 at 17:33

answered Apr 15 '16 at 17:03

Thanks! That works, except that I want to keep the NA indices in the final result too. For instance, in the example I gave above I want it to return an array of size 4 with the 3rd entry as NA. Your solution omits NAs, so it outputs a vector of size 3. If you can fix that, I will accept your answer. – Blade Runner Apr 15 '16 at 17:15
`apply(cbind(begin, end_a), 1, function(x) if (is.na(x[2])) NA else mean(panelDataset[x[1]:x[2], 4]))` – Jesse Anderson Apr 15 '16 at 17:32
@BladeRunner I changed the answer to produce a vector of the same length as the indexing arrays. – Ernest A Apr 15 '16 at 17:34
@ErnestA I was wondering if there is a vectorized way of doing in-place replacement. For example, instead of assigning the output to `means`, suppose we want to assign it to `panelDataset[x[1]:x[2], 4]`. The obvious loop works, but I can't seem to find a vectorized way of doing it. (Assume that instead of taking the `mean` we just multiply by -1, so that the dimensions stay the same.) Any thoughts? Any help is greatly appreciated. – Blade Runner Apr 16 '16 at 01:43
@BladeRunner Well, `a[b] <- c` is a vectorized operation, as long as `b` is a vector or an array. In this case `b` would need to be either a logical matrix the same size as `panelDataset` (inefficient for large matrices) or a 2-column array of indices. I think you'd benefit from reading the section on Indexing in the R manual, because indexing in R is quite different from Python, although they look similar. – Ernest A Apr 16 '16 at 08:42
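
The 2-column index array mentioned in the last comment could look like the sketch below. The data are assumptions for illustration: a small numeric `panelDataset` and non-overlapping ranges, so that no row is negated twice. The names `ok`, `rows`, and `idx` are hypothetical.

```r
# Illustrative data only: column 4 holds the values 61..80.
panelDataset <- matrix(as.numeric(1:80), nrow = 20, ncol = 4)
begin <- c(1, 12, 3)
end_a <- c(4, 15, NA)

# Expand each complete (begin, end) pair into its row numbers.
ok <- !is.na(begin) & !is.na(end_a)
rows <- unlist(Map(seq, begin[ok], end_a[ok]))

# Two-column index matrix: one (row, column) pair per cell to modify.
idx <- cbind(rows, 4)

# A single vectorized assignment flips the sign of every selected cell.
panelDataset[idx] <- -panelDataset[idx]
print(panelDataset[, 4])
```

Since every selected cell is in column 4, `panelDataset[rows, 4] <- -panelDataset[rows, 4]` would do the same job here. The index-matrix form is the more general one when the cells span several columns. Building `rows` still iterates over the pairs, but the assignment itself is a single indexing operation.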
