Questions tagged [tidyr]

tidyr is an R package, developed by Hadley Wickham and many others, for cleaning and reshaping data. It is designed to work with the magrittr pipe (%>%) so that it interacts well with dplyr and similar pipeable packages that emphasize tidy data. tidyr is the successor to reshape2.

4200 questions
118
votes
4 answers

dplyr summarise: Equivalent of ".drop=FALSE" to keep groups with zero length in output

When using summarise with plyr's ddply function, empty categories are dropped by default. You can change this behavior by adding .drop = FALSE. However, this doesn't work when using summarise with dplyr. Is there another way to keep empty categories…
eipi10
  • 91,525
  • 24
  • 209
  • 285
116
votes
5 answers

Gather multiple sets of columns

I have data from an online survey where respondents go through a loop of questions 1-3 times. The survey software (Qualtrics) records this data in multiple columns—that is, Q3.2 in the survey will have columns Q3.2.1., Q3.2.2., and Q3.2.3.: df <-…
Andrew
  • 36,541
  • 13
  • 67
  • 93
108
votes
1 answer

R spreading multiple columns with tidyr

Take this sample variable: df <- data.frame(month = rep(1:3, 2), student = rep(c("Amy", "Bob"), each = 3), A = c(9, 7, 6, 8, 6, 9), B = c(6, 7, 8, 5, 6, 7)) I can use spread from tidyr to change this to wide…
Ricky
  • 4,616
  • 6
  • 42
  • 72
75
votes
5 answers

pivot_wider issue "Values in `values_from` are not uniquely identified; output will contain list-cols"

My data looks like this: # A tibble: 6 x 4 name val time x1 1 C Farolillo 7 2016-04-20 51.5 2 C Farolillo 3 2016-04-21 56.3 3 C Farolillo 7 2016-04-22 56.3 4 C Farolillo 13…
user113156
  • 6,761
  • 5
  • 35
  • 81
75
votes
3 answers

How to replace all NA in a dataframe using tidyr::replace_na?

I'm trying to fill all NAs in my data with 0's. Does anyone know how to do that using replace_na from tidyr? From documentation, we can easily replace NA's in different columns with different values. But how to replace all of them with some value? I…
zesla
  • 11,155
  • 16
  • 82
  • 147
72
votes
4 answers

How can I spread repeated measures of multiple variables into wide format?

I'm trying to take columns that are in long format and spread them to wide format as shown below. I'd like to use tidyr to solve this with the data manipulation tools I'm investing in but to make this answer more general please provide other…
Tyler Rinker
  • 108,132
  • 65
  • 322
  • 519
72
votes
3 answers

Comparing gather (tidyr) to melt (reshape2)

I love the reshape2 package because it made life so doggone easy. Typically Hadley has made improvements in his previous packages that enable streamlined, faster running code. I figured I'd give tidyr a whirl and from what I read I thought gather…
Tyler Rinker
  • 108,132
  • 65
  • 322
  • 519
65
votes
11 answers

Using dplyr window functions to calculate percentiles

I have a working solution but am looking for a cleaner, more readable solution that perhaps takes advantage of some of the newer dplyr window functions. Using the mtcars dataset, if I want to look at the 25th, 50th, 75th percentiles and the mean and…
dreww2
  • 1,551
  • 3
  • 16
  • 18
64
votes
3 answers

Is it possible to use spread on multiple columns in tidyr similar to dcast?

I have the following dummy data: library(dplyr) library(tidyr) library(reshape2) dt <- expand.grid(Year = 1990:2014, Product=LETTERS[1:8], Country = paste0(LETTERS, "I")) %>% select(Product, Country, Year) dt$value <- rnorm(nrow(dt)) I pick two…
mpiktas
  • 11,258
  • 7
  • 44
  • 57
54
votes
8 answers

Reshaping multiple sets of measurement columns (wide format) into single columns (long format)

I have a dataframe in a wide format, with repeated measurements taken within different date ranges. In my example there are three different periods, all with their corresponding values. E.g. the first measurement (Value1) was measured in the period…
daj
  • 6,962
  • 9
  • 45
  • 79
48
votes
6 answers

Proper idiom for adding zero count rows in tidyr/dplyr

Suppose I have some count data that looks like this: library(tidyr) library(dplyr) X.raw <- data.frame( x = as.factor(c("A", "A", "A", "B", "B", "B")), y = as.factor(c("i", "ii", "ii", "i", "i", "i")), z = 1:6 ) X.raw # x y z # 1 A i 1 #…
pete
  • 2,327
  • 2
  • 15
  • 23
41
votes
6 answers

Unnest a list column directly into several columns

Can I unnest a list column directly into n columns? The list can be assumed to be regular, with all elements being of equal length. If instead of a list column I had a character vector, I could use tidyr::separate. I can tidyr::unnest, but we need…
Axeman
  • 32,068
  • 8
  • 81
  • 94
38
votes
2 answers

Spread with data.frame/tibble with duplicate identifiers

The documentation for tidyr suggests that gather and spread are transitive, but the following example with the "iris" data shows they are not, and it is not clear why. Any clarification would be greatly appreciated. iris.df =…
John D Lee
  • 381
  • 1
  • 3
  • 3
33
votes
2 answers

How to transpose a dataframe in tidyverse?

Using base R, I can transpose a dataframe, say mtcars, which has all columns of the same class: as.data.frame(t(mtcars)) Or with pipes: library(magrittr) mtcars %>% t %>% as.data.frame How can I accomplish the same within tidyr or tidyverse…
Irakli
  • 959
  • 1
  • 11
  • 18
32
votes
2 answers

How to use tidyr::separate when the number of needed variables is unknown

I've got a dataset that consists of email communication. An example: library(dplyr) library(tidyr) dat <- data_frame('date' = Sys.time(), 'from' = c("person1@gmail.com", "person2@yahoo.com", …
tblznbits
  • 6,602
  • 6
  • 36
  • 66