5

In R, is there a way to reference a vector from within the vector?

Say I have vectors with long names:

my.vector.with.a.long.name <- 1:10

Rather than this:

my.vector.with.a.long.name[my.vector.with.a.long.name > 5]

Something like this would be nice:

> my.vector.with.a.long.name[~ > 5]
[1]  6  7  8  9 10

Or alternatively indexing by a function would be convenient:

> my.vector.with.a.long.name[is.even]
[1]  2  4  6  8 10

Is there a package that already supports this?

andrew
  • Not sure why you can't create your own functions that do what you suggest, e.g., `myFunc <- function(x, y) x[x > y] ; is.even <- function(x) x[(x %% 2) == 0] ; myFunc(my.vector.with.a.long.name, 5) ; is.even(my.vector.with.a.long.name)` – David Arenburg Aug 26 '14 at 14:30
  • 2
    or use tab to autocomplete your (needlessly) long names? – rawr Aug 26 '14 at 14:31
  • Not really interested in solutions to the specific examples I gave, just wondering if this syntax is supported already in some way. – andrew Aug 26 '14 at 14:33
  • It's a bit what the package `data.table` does for data frames. You can start by taking a look there? – Joris Meys Aug 26 '14 at 14:33
  • 1
    Not tried, but maybe `require(magrittr); my.stupid.name %>% [>5]` or similar to that? – Carl Witthoft Aug 26 '14 at 14:38
  • @CarlWitthoft, was thinking the same but couldn't find the correct syntax – David Arenburg Aug 26 '14 at 14:39
  • 1
    A [related](http://stackoverflow.com/q/5738831/324364) question which might provide some ideas on how you'd go about implementing this if you wanted. – joran Aug 26 '14 at 14:40
  • Thanks for the feedback, `magrittr` seems to offer the closest option. May try hacking the actual feature together at some point. – andrew Aug 26 '14 at 14:51
  • 1
    @CarlWitthoft I don't think it will work with magrittr, see: http://renkun.me/blog/2014/08/08/difference-between-magrittr-and-pipeR.html – James Aug 26 '14 at 14:57
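
One way the function-indexing idea from the question could be hacked together in base R is a small S3 class whose `[` method accepts a predicate. This is only a sketch under assumptions: the class name `selfref` and the constructor `as_selfref` are hypothetical names invented here, not part of any package, and only a single index argument is handled.

```r
# Hypothetical S3 wrapper (names invented for illustration, not from a package).
as_selfref <- function(x) structure(x, class = c("selfref", class(x)))

# `[` method: if the index is a function, apply it to the underlying vector
# first and use the result (logical or numeric) as the actual index.
`[.selfref` <- function(x, i) {
  v <- unclass(x)
  if (is.function(i)) i <- i(v)
  v[i]
}

is.even <- function(x) x %% 2 == 0

v <- as_selfref(1:10)
v[is.even]              # 2 4 6 8 10
v[function(.) . > 5]    # 6 7 8 9 10
```

The result is returned as a plain vector, so the wrapper has to be reapplied if further function-indexing is wanted.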

4 Answers

8

You can use pipes, which allow self-referencing with `.`:

library(pipeR)
my.vector.with.a.long.name %>>% `[`(.>5)
[1]  6  7  8  9 10
my.vector.with.a.long.name %>>% `[`(.%%2==0)
[1]  2  4  6  8 10
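
The same self-reference can also be had in base R without any package by passing the vector to an anonymous function, so the long name is typed only once. A minimal sketch; the argument name `v` is arbitrary:

```r
my.vector.with.a.long.name <- 1:10

# Call an anonymous function immediately; inside it, `v` refers to the vector.
(function(v) v[v > 5])(my.vector.with.a.long.name)
# [1]  6  7  8  9 10

(function(v) v[v %% 2 == 0])(my.vector.with.a.long.name)
# [1]  2  4  6  8 10
```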
James
  • 1
    Yep. My solution was `foo %>>% (.[(.>5)])` which does the same as yours, I believe... but you win the CodeGolf here :-) – Carl Witthoft Aug 26 '14 at 15:21
5

The `Filter` function helps with this:

my.vector.with.a.long.name <- 1:10
Filter(function(x) x%%2==0, my.vector.with.a.long.name)

or

is.even <- function(x) x%%2==0
Filter(is.even, my.vector.with.a.long.name)
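
Note that `Filter` calls the predicate once per element, whereas logical indexing calls a vectorised predicate once on the whole vector. A thin wrapper gives the second behaviour with the same call shape. This is a sketch only; `pick` is a hypothetical helper name, not a base R function:

```r
my.vector.with.a.long.name <- 1:10
is.even <- function(x) x %% 2 == 0

# Hypothetical helper: index x by a vectorised predicate, called once on x.
pick <- function(f, x) x[f(x)]

pick(is.even, my.vector.with.a.long.name)
# [1]  2  4  6  8 10

# Same result as Filter for a predicate that works on whole vectors.
identical(pick(is.even, my.vector.with.a.long.name),
          Filter(is.even, my.vector.with.a.long.name))
# [1] TRUE
```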
MrFlick
0

So, you're basically asking if you can use something other than the variable's name to refer to it. The short answer is no. That is the whole idea behind variable names. If you want a shorter name, name it something shorter.

The longer answer is that it depends. You're really just using logical indexing in its long form. To make it shorter, or to refer to it more than once without typing that enormous name, save the index in a vector like so:

gt5 <- my.vector.with.a.long.name > 5
[1] FALSE FALSE FALSE FALSE FALSE TRUE...

my.vector.with.a.long.name[gt5]
[1] 6 7 8 9 10

You can do the same thing with a function as long as it returns the indexes or a logical vector.
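
For instance, a helper can return either a logical vector or integer positions, and both work as an index. A small sketch; `x`, `is.even` and `even.positions` are illustrative names chosen here:

```r
x <- 11:20

is.even <- function(x) x %% 2 == 0                 # returns a logical vector
even.positions <- function(x) which(x %% 2 == 0)   # returns integer positions

idx <- is.even(x)   # compute the index once, reuse it as often as needed
x[idx]
# [1] 12 14 16 18 20

even.positions(x)
# [1]  2  4  6  8 10
x[even.positions(x)]
# [1] 12 14 16 18 20
```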

The dplyr package supports chaining: the %>% operator (the original %.% operator has since been deprecated in its favor) takes the left-hand side and passes it as the first argument of the function call on the right-hand side.

This lets you write things like:

data %>% group_by(group.var) %>% summarize(Mean=mean(ID))

instead of:

summarize(group_by(data, group.var), Mean=mean(ID))
MentatOfDune
  • 1
    You can, but this is extremely wasteful of memory since AFAIK a whole new object `gt5` is created when it's entirely unnecessary to do so. – Carl Witthoft Aug 26 '14 at 14:37
  • Yes, gt5 creates a new logical vector, but obviously this is a trivial example. If you're going to re-use the indexing or it's complicated to create, I like to create these indexing objects. – MentatOfDune Aug 26 '14 at 14:44
  • 1
    @Carl Witthoft: Do the profiling and you'll see that exactly the same happens when you use the code given above. This is exactly the same as the code of OP, except spelled out explicitly. So the memory footprint is the same afaik. – Joris Meys Aug 26 '14 at 14:44
  • 1
    @JorisMeys but the vector `my.vector.with.a.long.name > 5` gets removed at the next garbage collection, whereas `gt5` doesn't. – Señor O Aug 26 '14 at 14:45
  • @SeñorO Which is why I always stress in my class that temporary objects serve readability more than anything else, and should be removed whenever possible. Not only for memory issues, also to avoid bugs with generic counters called `i` for example. Next to that, it might be beneficial to keep it, in order to avoid having to recalculate the same thing a couple of times (as you see often happening in code, especially with the constructs OP refers to). – Joris Meys Aug 26 '14 at 14:47
  • @JorisMeys Exactly. I frequently create constructs like this that I use to refer to sets of indices multiple times. If it's a one-off thing like the gt5, I will just type it. It's not that much of an encumbrance with tab completion. – MentatOfDune Aug 26 '14 at 15:02
  • Has someone remapped `>` on your set-up? The results do not match. – James Aug 26 '14 at 15:05
  • @James No, I typed this all in by hand. I'm an idiot and typed the results for less than instead of greater than. – MentatOfDune Aug 26 '14 at 15:30
0

You can easily create another object with a shorter name:

my.vector.with.a.long.name <- 1:10

mm = my.vector.with.a.long.name 

mm
 [1]  1  2  3  4  5  6  7  8  9 10

mm[mm<5]
[1] 1 2 3 4

mm[mm>5]
[1]  6  7  8  9 10

Why use other packages and complex code?

rnso