I have searched exhaustively for a direct R translation of the FIRST. and LAST. pointers in SAS DATA steps but can't find one. For those not familiar with SAS, FIRST. is a boolean that flags the first appearance of a given value in a column of a sorted table, and LAST. is a boolean that flags its last appearance. For instance, consider the following sorted table:

V1    V2    V3
1     1     1
1     1     2
1     2     3
1     2     4
2     3     5
2     3     6
2     4     7
2     4     8
3     5     9
3     5     10
3     6     11
3     6     12

Because SAS DATA steps read tables line by line, I can use a statement like:

IF FIRST.V1 THEN DO ...

FIRST.V1 will return TRUE if and only if the current row is the first one in which its value of V1 appears. In other words, it will return TRUE for V1[1] (the first appearance of '1'), V1[5] (the first appearance of '2'), and V1[9] (the first appearance of '3'). The LAST. pointer works analogously, but for the final appearance of each value.

Is there anything in R that emulates this?

Brian Tompsett - 汤莱恩
asteri
  • Maybe `duplicated()`? But it's hard to tell because you haven't really told us what your actual goal is. – joran Jul 18 '12 at 17:13
  • there might be a more R-ish solution (e.g. with `ddply`) rather than looping through the data set a line at a time ... – Ben Bolker Jul 18 '12 at 17:21
  • I don't want to actually loop through the data.frame line by line. I just want a function that will return true if it is the first appearance of the value in that column and false otherwise. Also, one that returns true only if it is the last appearance of the value and false otherwise. Spacedman's solution below is more than sufficient for these purposes. – asteri Jul 18 '12 at 17:26

1 Answer

You can do this with `duplicated` and `rev` (for LAST):

> v1=c(1,1,1,2,2,3,3,3,3,4,4,5)

> data.frame(v1,FIRST=!duplicated(v1),LAST=rev(!duplicated(rev(v1))))
   v1 FIRST  LAST
1   1  TRUE FALSE
2   1 FALSE FALSE
3   1 FALSE  TRUE
4   2  TRUE FALSE
5   2 FALSE  TRUE
6   3  TRUE FALSE
7   3 FALSE FALSE
8   3 FALSE FALSE
9   3 FALSE  TRUE
10  4  TRUE FALSE
11  4 FALSE  TRUE
12  5  TRUE  TRUE
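
The same idea can be sketched on the table from the question. This is a minimal illustration, not a definitive implementation: the helper names `first_flag` and `last_flag` are hypothetical, and the sketch assumes the key columns contain no missing values. Unlike `duplicated`, the helpers flag the start and end of each contiguous run, which is closer to how SAS treats BY groups; on sorted data the two approaches should agree. The nested flag for V2 assumes the SAS convention that FIRST.V2 is also set whenever V1 changes.

```r
# The sorted table from the question.
df <- data.frame(
  V1 = rep(1:3, each = 4),
  V2 = rep(1:6, each = 2),
  V3 = 1:12
)

# Hypothetical helper: TRUE on the first row of each contiguous run of `key`.
# Assumes `key` has no NA values.
first_flag <- function(key) {
  n <- length(key)
  if (n == 0) return(logical(0))
  c(TRUE, key[-1] != key[-n])
}

# Hypothetical helper: TRUE on the last row of each contiguous run of `key`.
last_flag <- function(key) {
  n <- length(key)
  if (n == 0) return(logical(0))
  c(key[-1] != key[-n], TRUE)
}

df$FIRST.V1 <- first_flag(df$V1)
df$LAST.V1  <- last_flag(df$V1)

# Nested grouping (V2 within V1): a new V2 group starts when either key changes.
df$FIRST.V2 <- first_flag(df$V1) | first_flag(df$V2)
df$LAST.V2  <- last_flag(df$V1)  | last_flag(df$V2)

# On sorted data this matches the duplicated() approach; fromLast = TRUE
# is an alternative to wrapping the call in rev().
same_first <- identical(df$FIRST.V1, !duplicated(df$V1))
same_last  <- identical(df$LAST.V1, !duplicated(df$V1, fromLast = TRUE))
```

The flags can then be used for vectorized subsetting, for example `df[df$FIRST.V1, ]` to keep the first row of each V1 group, rather than looping over rows.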
Spacedman