
I'm new to R and have been doing OK so far, but I now need to do something a little more complicated and can't quite get it to work. I have a dataset similar to the following (which I'll call df from here on):

library(dplyr)   # mutate() and %>%
library(tibble)  # tribble()

df <- tribble(~name,             ~word,             ~N,
              "brandon",         "hello",            3,
              "john",            "test",             5,
              "jim",             "hello",            2,
              "brandon",         "goodbye",          2,
              "brandon",         "test",             1,
              "jim",             "goodbye",          4)

So far I have something like this:

temp_df <- df %>% mutate(
                     "hello" = ifelse(word == "hello", N, 0),
                     "goodbye" = ifelse(word == "goodbye", N, 0),
                     "test" = ifelse(word == "test", N, 0)
                  )

which creates something like this:

name            hello           goodbye        test        word         N
brandon         3               0              0           hello        3
john            0               0              5           test         5
jim             2               0              0           hello        2
brandon         0               2              0           goodbye      2
brandon         0               0              1           test         1
jim             0               4              0           goodbye      4

but I need the df to look like this:

name            hello           goodbye        test
brandon         3               2              1
john            0               0              5
jim             2               4              0

I know how to select() the important data once I'm done here but I'm just not sure how to get all the data for each name into one row.

bcstryker
  • It looks like you want to change the format of your data set from "long" to "wide". See https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format – Marco Plebani Jun 18 '20 at 17:07

2 Answers


Using pivot_wider() from tidyr:

df %>%
  pivot_wider(id_cols="name", names_from="word", values_from="N", values_fill=0)

yields

# A tibble: 3 x 4
  name    hello  test goodbye
  <chr>   <dbl> <dbl>   <dbl>
1 brandon     3     1       2
2 john        0     5       0
3 jim         2     0       4
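
For comparison, the mutate() approach from the question can also be carried through to the same shape by collapsing the rows per name. This is only a sketch: it assumes dplyr 1.0.0 or later (for across()), and the name wide_df is made up for illustration.

```r
library(dplyr)   # mutate(), group_by(), summarise(), across()
library(tibble)  # tribble()

df <- tribble(~name,     ~word,     ~N,
              "brandon", "hello",   3,
              "john",    "test",    5,
              "jim",     "hello",   2,
              "brandon", "goodbye", 2,
              "brandon", "test",    1,
              "jim",     "goodbye", 4)

# wide_df is a hypothetical name used only in this sketch
wide_df <- df %>%
  mutate(
    hello   = ifelse(word == "hello", N, 0),
    goodbye = ifelse(word == "goodbye", N, 0),
    test    = ifelse(word == "test", N, 0)
  ) %>%
  group_by(name) %>%                                # one group per person
  summarise(across(c(hello, goodbye, test), sum))   # add up each word column
```

Unlike pivot_wider(), this needs every word to be spelled out by hand, so it only scales to a small, known set of words.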
Martin Gal
  • Thank you so much! works very well. I have a tiny problem still, when I substituted the data from my df (not exactly the one I put here because of DNA) I got the error ```Error in values_fill[[value]] : subscript out of bounds``` so I removed the values_fill=0 keyword arg and now everything works but the columns that would have 0 have NA. Any idea why that error is being raised? – bcstryker Jun 18 '20 at 17:22
  • Not sure why this error occurs. You can use `df[is.na(df)] <- 0` to replace all `NA` with 0, or add `%>% mutate(across(everything(), ~ replace_na(., 0)))` to the code shown above. – Martin Gal Jun 18 '20 at 17:46
  • SICK! Thanks sooooo much, I've been fighting with this for way too long. You're very knowledgeable :) – bcstryker Jun 18 '20 at 17:54
  • @Mcmahoon89 Thank you for pointing me to the R documentation? – Martin Gal Jun 18 '20 at 18:22
  • No! That comment was for @bcstryker. Sorry. – Eric Jun 18 '20 at 18:25
  • @Mcmahoon89 lol got it thanks! I just couldn't find the right function to even look for documentation on – bcstryker Jun 18 '20 at 18:27
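
One possible cause of the `values_fill[[value]]` error discussed in the comments, offered as an assumption and not as a diagnosis of the asker's data: tidyr releases before 1.1.0 expected values_fill to be a named list keyed by the value column, so a bare 0 may fail there. A sketch of the named-list form, which current tidyr versions also accept (wide_df is a made-up name):

```r
library(tidyr)  # pivot_wider(); also re-exports tribble() and %>%

df <- tribble(~name,     ~word,     ~N,
              "brandon", "hello",   3,
              "john",    "test",    5,
              "jim",     "hello",   2,
              "brandon", "goodbye", 2,
              "brandon", "test",    1,
              "jim",     "goodbye", 4)

# The list name must match the values_from column, here N
wide_df <- df %>%
  pivot_wider(id_cols = name, names_from = word, values_from = N,
              values_fill = list(N = 0))
```

If this form works where the scalar did not, checking packageVersion("tidyr") would confirm whether an older release is installed.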

Data Frame

df <- tribble(~name,             ~word,             ~N,
              "brandon",         "hello",            3,
               "john",           "test",             5,
               "jim",            "hello",            2,
               "brandon",        "goodbye",          2,
               "brandon",        "test",             1,
               "jim",            "goodbye",          4)

Solution

library(tidyr)  # pivot_wider(); also re-exports tribble() and %>%

df %>%
  pivot_wider(id_cols = "name", names_from = "word", values_from = "N",
              values_fill = 0)

pivot_wider() "widens" data, increasing the number of columns and decreasing the number of rows. The inverse transformation is pivot_longer().
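
To illustrate the inverse relationship, here is a minimal sketch that widens the example data and then lengthens it again. It assumes tidyr 1.1.0 or later (for the scalar values_fill), and the names wide_df and long_df are made up for illustration.

```r
library(tidyr)  # pivot_wider(), pivot_longer(); re-exports tribble()

df <- tribble(~name,     ~word,     ~N,
              "brandon", "hello",   3,
              "john",    "test",    5,
              "jim",     "hello",   2,
              "brandon", "goodbye", 2,
              "brandon", "test",    1,
              "jim",     "goodbye", 4)

# Long to wide: one row per name, one column per word
wide_df <- pivot_wider(df, id_cols = name, names_from = word,
                       values_from = N, values_fill = 0)

# Wide to long: every column except name becomes a (word, N) pair
long_df <- pivot_longer(wide_df, cols = -name,
                        names_to = "word", values_to = "N")
```

The round trip is not exact: long_df has one row for every name and word combination, including the combinations that were filled with 0, so it has more rows than the original df.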

The help() function and ? help operator in R provide access to the documentation pages for R functions, data sets, and other objects, both for packages in the standard R distribution and for contributed packages. For example, help(pivot_wider) or ?pivot_wider.

Output

    name    hello   test    goodbye
    brandon 3       1       2   
    john    0       5       0   
    jim     2       0       4   
Eric