
Base R defines `identity`, "a trivial identity function returning its argument" (quoting from ?identity).

It is defined as:

identity <- function (x){x}

Why would such a trivial function ever be useful? Why would it be included in base R?

Andrie
  • I've seen it used in the context of `curve(identity(x))` (rather than the slightly (?) more opaque `curve(x*1)` or `curve(x+0)`) ... – Ben Bolker Aug 18 '11 at 14:27
  • @BenBolker Why not simply `curve(x)`? – Andrie Aug 18 '11 at 14:28
  • try it -- it doesn't work (`Error in eval(expr, envir, enclos) : could not find function "x"`) because `curve` uses funny evaluation rules ... – Ben Bolker Aug 18 '11 at 14:32
  • The first few answers allude to functional programming. Some useful questions about R and functional programming: http://stackoverflow.com/q/4874867/602276, http://stackoverflow.com/q/6167791/602276 and http://stackoverflow.com/q/2228544/602276 – Andrie Aug 18 '11 at 14:40

8 Answers


Don't know about R, but in a functional language one often passes functions as arguments to other functions. In such cases, the constant function (which returns the same value for any argument) and the identity function play a similar role as 0 and 1 in multiplication, so to speak.
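To make the "0 and 1" analogy concrete in R, here is a minimal sketch in which `identity` is the neutral starting value when a list of functions is folded into one. The helpers `compose2()` and `pipeline()` are hypothetical names introduced only for this illustration; they are not part of base R.

```r
# compose2() and pipeline() are hypothetical helpers, not base R functions.
# compose2(f, g) returns a function that applies f first, then g.
compose2 <- function(f, g) function(x) g(f(x))

# Fold a list of functions into a single function. identity is the starting
# value, so an empty list yields a function that returns its input unchanged,
# just as an empty product is 1.
pipeline <- function(fs) Reduce(compose2, fs, identity)

pipeline(list(sqrt, function(x) x + 1))(16)  # sqrt(16) + 1
pipeline(list())(16)                          # no steps: 16 comes back as is
```

Without `identity` as the starting value, the empty-list case would need special handling.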

Ingo
  • Can you please explain this a bit more? I understand the bit about being able to pass functions as arguments. But what do you mean with it *plays the role of 0 and 1*? – Andrie Aug 18 '11 at 14:30
  • And R is a relatively functional language. At least, it has first class functions, closures, focus on immutable data structures, maps and filters, anonymous functions etc. So, yes, this is probably why `identity` is included. – Wilduck Aug 18 '11 at 14:31
  • A good example of functions that take functions as arguments in R is the `apply` family of functions: http://www.ats.ucla.edu/stat/r/library/advanced_function_r.htm#apply . These functions are very powerful for manipulating data sets. – Wilduck Aug 18 '11 at 14:34
  • @Andrie - regarding 0 and 1: think multiplication. 0*n = 0, 1*n = n – Ingo Aug 18 '11 at 14:39
  • @Ingo I am familiar with the mathematical concept of identity. I have also used many of the functions in R that make use of passing functions as arguments (`apply`, `ddply`, `aggregate`, `outer` and a multitude of others). But I have never had the need for the identity function myself. So my question is still: in functional programming, why would you need an identity function? What is the practical use of it? PS. Your answer doesn't have to be specifically about R, since this seems like a more generic concept. – Andrie Aug 18 '11 at 14:44
  • The practical use of such a thing can be, for example, as a default value for a parameter that is of type 'transformation'. This is similar to having a parameter of type 'multiplicative factor' and setting its default value to 1.0. In both cases, the default value has the effect of being a no-op. It is useful not to have to specify this explicitly (say by setting a boolean do_not_transform), but rather implicitly, simply as a property of the parameter (namely, it acts as the identity operator). – micans Aug 18 '11 at 14:59
  • +1 @Ingo and @micans Thank you. It's starting to make sense to me, especially when read together with the `apply` example provided by @gsk3 – Andrie Aug 18 '11 at 15:01
  • @Andrie - well I can give an example in Haskell. Say I have a value of type Maybe Int, and I am interested in the number. There is a function maybe d f x = case x of { Nothing -> d; Just i -> f i } which I can use to extract it. It gives me the result of applying a function to the payload or a default value if x was Nothing. So, if I want the value unchanged, I just pass id (which is the name of the identity function in Haskell). I know this sounds contrived to someone who never felt the need, but then, maybe you are just not programming "functional" enough yet. – Ingo Aug 18 '11 at 15:02
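The "default value for a transformation parameter" idea from the comments above can be sketched in R as follows. The function `summarise_values()` and its `transform` argument are hypothetical names made up for this illustration.

```r
# summarise_values() is a hypothetical function used only for illustration.
# transform defaults to identity, so callers who want no transformation
# do not need a separate flag such as do_not_transform.
summarise_values <- function(x, transform = identity) {
  mean(transform(x))
}

summarise_values(c(1, 10, 100))         # mean of the raw values
summarise_values(c(1, 10, 100), log10)  # mean of the log10-transformed values
```

The body of the function has a single code path; the no-op case is just another function being passed in.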

I use it from time to time with the apply family of functions.

For instance, you could write t() as:

dat <- data.frame(x=runif(10),y=runif(10))
apply(dat,1,identity)

       [,1]      [,2]      [,3]      [,4]      [,5]      [,6]       [,7]
x 0.1048485 0.7213284 0.9033974 0.4699182 0.4416660 0.1052732 0.06000952
y 0.7225307 0.2683224 0.7292261 0.5131646 0.4514837 0.3788556 0.46668331
       [,8]      [,9]      [,10]
x 0.2457748 0.3833299 0.86113771
y 0.9643703 0.3890342 0.01700427
Ari B. Friedman
  • +1 In other words, `identity` is used as a `no-operation` switch to really make use of the transforming qualities of the function being called. Nice. – Andrie Aug 18 '11 at 15:03
  • @Andrie Exactly. And stated more elegantly than I could have mustered :-) – Ari B. Friedman Aug 18 '11 at 15:09
  • It's not me, guv. I'm paraphrasing from a deleted answer by @tripleee which I actually found very useful to give context to the other answers. – Andrie Aug 18 '11 at 15:13

One use that appears on a simple code base search is as a convenience for the most basic type of error handling function in tryCatch.

tryCatch(...,error = identity)

which is identical (ha!) to

tryCatch(...,error = function(e) e)

So this handler catches the error condition object and simply returns it, instead of stopping execution.
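A small sketch of what that looks like in practice; the error message and the variable name `res` are made up for this example.

```r
# The failing expression is illustrative only.
res <- tryCatch(stop("something went wrong"), error = identity)

inherits(res, "error")  # TRUE: res is the condition object, not a string
conditionMessage(res)   # the message carried by the condition
```

Because the value returned is the condition object, the caller can inspect it later with `inherits()` or `conditionMessage()` rather than losing the error.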

joran
  • I'll have to do some more thinking before this makes sense to me. I've always avoided learning how `try` and `tryCatch` work. Thanks for the example and reference. – Andrie Aug 18 '11 at 15:11

For whatever it's worth, it is located in funprog.R (the functional programming stuff) in the source of the base package, and it was added as a "convenience function" in 2008: I can imagine (but can't give an immediate example!) that there would be some contexts in the functional programming approach (i.e. using Filter, Reduce, Map etc.) where it would be convenient to have an identity function ...

r45063 | hornik | 2008-04-03 12:40:59 -0400 (Thu, 03 Apr 2008) | 2 lines

Add higher-order functions Find() and Position(), and convenience
function identity().
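One possible example of the kind alluded to above, offered as a sketch rather than as the motivation behind the commit: when the elements are already logical, `identity` can serve as the predicate for `Filter()` and `Position()`. The vector `flags` is made up for this illustration.

```r
# flags is an illustrative logical vector.
flags <- c(FALSE, TRUE, FALSE, TRUE)

Filter(identity, flags)    # keeps only the TRUE elements
Position(identity, flags)  # index of the first TRUE element
```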
Ben Bolker

Stepping away from functional programming, identity is also used in another context in R, namely statistics. Here, it is used to refer to the identity link function in generalized linear models. For more details about this, see ?family or ?glm. Here is an example:

> x <- rnorm(100)
> y <- rpois(100, exp(1+x))
> glm(y ~x, family=quasi(link=identity))

Call:  glm(formula = y ~ x, family = quasi(link = identity))

Coefficients:
(Intercept)            x
      4.835        5.842

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      6713
Residual Deviance: 2993         AIC: NA

However, in this case passing it as a string instead of a function will achieve the same: glm(y ~x, family=quasi(link="identity"))

EDIT: As noted in the comments below, the function base::identity is not what is used by the link constructor, and it is just used for parsing the link name. (Rather than deleting this answer, I'll leave it to help clarify the difference between the two.)
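The point of the EDIT can be checked with a short sketch: the family object records the link by name, and the link function itself comes from `make.link()`. The variable names `fam` and `lnk` are introduced here for illustration.

```r
# An unquoted identity is converted to the name "identity".
fam <- quasi(link = identity)
fam$link

# make.link() builds the link functions from that name.
lnk <- make.link("identity")
lnk$linkfun(c(0.5, 2))  # the identity link returns mu unchanged
```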

nullglob
  • This was mentioned in a (now deleted) answer...`identity` in this context apparently does not actually refer to `base::identity`. See the code in `make.link`; it's just matching the name "identity". – joran Aug 18 '11 at 15:43
  • @nullglob : Indeed. The answer was mine and I deleted it, because it is incorrect. The family constructors for glm don't use the identity function, they evaluate a string (even though you can pass the argument unquoted). – Joris Meys Aug 18 '11 at 20:27

Here is a usage example in Java:

    Map<Integer, Long> m = Stream.of(1, 1, 2, 2, 3, 3)
            .collect(Collectors.groupingBy(Function.identity(),
                    Collectors.counting()));
    System.out.println(m);
    // prints: {1=2, 2=2, 3=2}

Here we group the integers into a map from each value to its count. Collectors.groupingBy accepts a Function; in this case we need one that returns its argument. Note that we could use the lambda e -> e instead.

Evgeniy Dorofeev

I just used it like this:

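fit_model <- function(lots, of, parameters, error_silently = TRUE) {
  # Choose the wrapper with if/else: ifelse() works on vectors and
  # cannot return a function.
  wrapper <- if (error_silently) tryNA else identity

  # fit_model_() is the underlying fitting function (not shown here).
  # The call is passed as a lazy argument, so tryNA() evaluates it
  # inside tryCatch() and can intercept any error it raises.
  wrapper(fit_model_(lots, of, parameters))
}

# Evaluate expr, returning NA on error and silencing warnings.
tryNA <- function(expr) {
  suppressWarnings(tryCatch(expr = expr,
                            error = function(e) NA))
}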
rcorty

As this question has already been viewed 8k times, it may be worth updating even 9 years after it was written.

In a blog post called "Simple tricks for Debugging Pipes (within magrittr, base R or ggplot2)", the author points out how identity() can be very useful at the end of different kinds of pipes. The blog post, with examples, can be found here: https://rstats-tips.net/2021/06/06/simple-tricks-for-debugging-pipes-within-magrittr-base-r-or-ggplot2/

If pipe chains are written so that each pipe symbol is at the end of a line, you can exclude any line from execution by commenting it out, except for the last line. If you add identity() as the last line, there will never be a need to comment that one out, so you can temporarily exclude any line that changes the data.
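A minimal sketch of the trick, assuming R 4.1 or later for the native pipe `|>` (magrittr's `%>%` can be used the same way). The data and the steps are made up for this illustration.

```r
# Assumes R >= 4.1 for the native pipe; the steps are illustrative only.
result <- c(3, 1, 2) |>
  sort() |>
  # rev() |>      # temporarily disabled step
  identity()      # harmless last step that never needs commenting out

result
```

Since every real step ends in a pipe symbol, any of them can be commented out without leaving a dangling pipe at the end of the chain.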

Bernhard