Problem

I'm working with a data frame similar to the extract generated below:

set.seed(1)
df <- data.frame(columnA1 = 1:10,
                 columnB1 = 1:10,
                 columnB99 = runif(n = 10))

I would like to create a set of columns that would contain custom flags corresponding to the values derived from columns that have 1 in the column name.

Approach

My present approach is summarised below:

require(dplyr); require(magrittr)
df %<>%
    mutate_each(funs(ifelse(. == 1, "val1",
                            ifelse(. == 10, "val10", NA))),
                contains("1"))

This generates the required values but overwrites the existing columns instead of creating additional ones:

> head(df, n = 10)
   columnA1 columnB1  columnB99
1      val1     val1 0.26550866
2      <NA>     <NA> 0.37212390
3      <NA>     <NA> 0.57285336
4      <NA>     <NA> 0.90820779
5      <NA>     <NA> 0.20168193
6      <NA>     <NA> 0.89838968
7      <NA>     <NA> 0.94467527
8      <NA>     <NA> 0.66079779
9      <NA>     <NA> 0.62911404
10    val10    val10 0.06178627

Comments / Attempt 1

I also tried:

df %<>%
    mutate_each(funs(flg = ifelse(. == 1, "val1",
                            ifelse(. == 10, "val10", NA))),
                contains("1"))

but it generates the same result. Following this discussion, I'm guessing that I'm making mistakes in providing the suffix within the funs.


Comments Follow-up

For example the code:

df %<>%
    mutate_each(funs(ifelse(. == 1, "val1", NA),
                     ifelse(. == 10, "val10", NA)),
                contains("1"))
head(df, 10)

would create the additional columns but the results are not fully satisfactory:

> head(df, 10)
   columnA1 columnB1  columnB99 columnA1_ifelse columnB1_ifelse columnA1_ifelse_ifelse columnB1_ifelse_ifelse
1         1        1 0.26550866            <NA>            <NA>                     NA                     NA
2         2        2 0.37212390            <NA>            <NA>                     NA                     NA
3         3        3 0.57285336            <NA>            <NA>                     NA                     NA
4         4        4 0.90820779            <NA>            <NA>                     NA                     NA
5         5        5 0.20168193            <NA>            <NA>                     NA                     NA
6         6        6 0.89838968            <NA>            <NA>                     NA                     NA
7         7        7 0.94467527            <NA>            <NA>                     NA                     NA
8         8        8 0.66079779            <NA>            <NA>                     NA                     NA
9         9        9 0.62911404            <NA>            <NA>                     NA                     NA
10       10       10 0.06178627           val10           val10                     NA                     NA
Konrad
  • @akrun I was thinking that it would be possible to deviate from the default behaviour by providing that suffix. – Konrad Feb 05 '16 at 18:56
  • I tried your edited code on the original dataset 'df'. It didn't create the additional columns. I am using `dplyr_0.4.3` – akrun Feb 05 '16 at 19:06
  • @akrun I honestly don't know why. I tried one more time and it runs ([pastebin](http://pastebin.com/jQsGg0bM)). I'm also using `0.4.3` version of `dplyr`. – Konrad Feb 05 '16 at 19:18
  • Probably the second run created that. I don't know whether it is a bug, but the NA columns are certainly not expected. – akrun Feb 05 '16 at 19:21
  • @akrun Thanks, it's interesting why it happens. – Konrad Feb 05 '16 at 20:48

1 Answer

You can create additional columns when using only a single function in the funs argument if you supply a named vector to the vars or ... argument within mutate_each. Here's an example using setNames:

mutate_each(df, funs(ifelse(. == 1, "val1",
                            ifelse(. == 10, "val10", NA))),
                setNames(contains("1"), c("x", "y")))
#   columnA1 columnB1  columnB99     x     y
#1         1        1 0.26550866  val1  val1
#2         2        2 0.37212390  <NA>  <NA>
#3         3        3 0.57285336  <NA>  <NA>
#4         4        4 0.90820779  <NA>  <NA>
#5         5        5 0.20168193  <NA>  <NA>
#6         6        6 0.89838968  <NA>  <NA>
#7         7        7 0.94467527  <NA>  <NA>
#8         8        8 0.66079779  <NA>  <NA>
#9         9        9 0.62911404  <NA>  <NA>
#10       10       10 0.06178627 val10 val10

This is also described in another Q&A.
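
For readers on a newer dplyr, here is a minimal sketch of the same idea, assuming dplyr >= 1.0.0, where `across()` and its `.names` argument are available (`mutate_each()` and `funs()` are deprecated there). The `_flg` suffix and the `flagged` object name are illustrative choices, not something from the question's expected output.

```r
library(dplyr)

# Rebuild the example data from the question
set.seed(1)
df <- data.frame(columnA1 = 1:10,
                 columnB1 = 1:10,
                 columnB99 = runif(n = 10))

# Assumption: dplyr >= 1.0.0. across() applies the function to every column
# whose name contains "1"; .names keeps the originals and appends new
# columns named <original>_flg (suffix chosen for illustration).
flagged <- df %>%
    mutate(across(contains("1"),
                  ~ ifelse(.x == 1, "val1",
                           ifelse(.x == 10, "val10", NA)),
                  .names = "{.col}_flg"))

head(flagged, n = 10)
```

Because the result is assigned to a new object rather than piped back with `%<>%`, rerunning the snippet does not stack further flag columns onto `df`.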

talat