
My data looks like this:

proposal_number <- c("Expt 1", "Expt 1", "Expt 1", "Expt 2", "Expt 2")
crop_weight <- c("Winter Wheat 200g", "Winter Barley 200g", "Spring Beans 500g", "Winter Wheat 300g", "Spring Beans 100g")
data <- data.frame(proposal_number, crop_weight)

  proposal_number        crop_weight
1          Expt 1  Winter Wheat 200g
2          Expt 1 Winter Barley 200g
3          Expt 1  Spring Beans 500g
4          Expt 2  Winter Wheat 300g
5          Expt 2  Spring Beans 100g

And I want to collapse the levels of crop_weight by proposal_number so it looks like this:

  proposal_number                                              crop_weight
1          Expt 1 Winter Wheat 200g, Winter Barley 200g, Spring Beans 500g
2          Expt 2                     Winter Wheat 300g, Spring Beans 100g

I have got this far:

library(dplyr)

help <- data %>%
  group_by(proposal_number) %>%
  mutate(crop_weight = paste(crop_weight, collapse = "; "))

But it pastes all the factor levels of crop_weight for both factor levels of proposal_number, like this:

 proposal_number                                                                                    crop_weight
1          Expt 1 Winter Wheat 200g, Winter Barley 200g, Spring Beans 500g, Winter Wheat 300g, Spring Beans 100g
2          Expt 2 Winter Wheat 200g, Winter Barley 200g, Spring Beans 500g, Winter Wheat 300g, Spring Beans 100g

I've been googling it all morning and maybe my search terms are wrong, but I can't find an obvious answer. I'd be very grateful for any insights as to where I am going wrong.

This is the most useful thread I have found ("Concatenate strings by group with dplyr"), but as I say above, it doesn't quite work.

Many thanks

  • Use `dplyr::summarise` instead of `mutate`. `help <- data %>% group_by(proposal_number) %>% dplyr::summarise(crop_weight = paste(crop_weight, collapse = "; "))` – Ronak Shah Jul 12 '21 at 13:57
  • Thanks Ronak! I get the same error with summarise as I do with mutate - it pastes all the factor levels of crop_weight for both factor levels of proposal_number. Specifying the library helps though - without that it collapses into a single observation as described in my reply to MonJeanJean below. – Aislinn Pearson Jul 12 '21 at 14:18
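
The observation that specifying the library changes the result suggests the bare name `summarise` may be resolving to a function from a different attached package. The sketch below is one way to check this and to sidestep it with namespace-qualified calls. It reuses the sample data from the question; which package was doing the masking in the asker's session is not stated in the thread, so nothing here identifies it.

```r
library(dplyr)

# Which package does the bare name `summarise` currently resolve to?
# "dplyr" is expected; any other value means another attached package masks it.
environmentName(environment(summarise))

data <- data.frame(
  proposal_number = c("Expt 1", "Expt 1", "Expt 1", "Expt 2", "Expt 2"),
  crop_weight = c("Winter Wheat 200g", "Winter Barley 200g", "Spring Beans 500g",
                  "Winter Wheat 300g", "Spring Beans 100g")
)

# Namespace-qualified calls do not depend on the order in which packages were attached
collapsed <- data %>%
  dplyr::group_by(proposal_number) %>%
  dplyr::summarise(crop_weight = paste(crop_weight, collapse = "; "))

collapsed
```

If the check prints something other than "dplyr", qualifying both `group_by` and `summarise` as above, or attaching dplyr last, should make the grouped call behave as expected.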

1 Answer


Use summarise instead of mutate:

library(dplyr)

data %>%
  group_by(proposal_number) %>%
  summarise(crop_weight = paste0(crop_weight, collapse = ","))

Output:

  proposal_number crop_weight                                           
  <chr>           <chr>                                                 
1 Expt 1          Winter Wheat 200g,Winter Barley 200g,Spring Beans 500g
2 Expt 2          Winter Wheat 300g,Spring Beans 100g      
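
To make the difference between the two verbs concrete, here is a minimal side-by-side sketch using the sample data from the question. The object names `with_mutate` and `with_summarise` are made up for illustration, and the `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

data <- data.frame(
  proposal_number = c("Expt 1", "Expt 1", "Expt 1", "Expt 2", "Expt 2"),
  crop_weight = c("Winter Wheat 200g", "Winter Barley 200g", "Spring Beans 500g",
                  "Winter Wheat 300g", "Spring Beans 100g")
)

# mutate() keeps every input row and repeats the collapsed string within each group
with_mutate <- data %>%
  group_by(proposal_number) %>%
  mutate(crop_weight = paste(crop_weight, collapse = ", ")) %>%
  ungroup()

# summarise() returns one row per group
with_summarise <- data %>%
  group_by(proposal_number) %>%
  summarise(crop_weight = paste(crop_weight, collapse = ", "), .groups = "drop")

nrow(with_mutate)     # 5
nrow(with_summarise)  # 2
```

Both versions paste only the values belonging to each group; they differ in how many rows come back.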
MonJeanJean
  • I agree. That should work but when I tried it (copying and pasting your code) the whole data frame collapsed into a single observation: `crop_sample_size 1 winter wheat: 2,000g,winter oilseed rape: 2,000g,spring barley: 2,000g,spring beans: 2,000g,spring linseed: 2,000g,winter wheat: 1,000g,winter oats: 1,000g,winter beans: 2,000g,camelina: Whole Plot,camelina: Whole Plot,winter wheat: Whole Plot,winter barley: Whole Plot,winter wheat: 2,000g,winter oilseed rape: 2,000g,...` – Aislinn Pearson Jul 12 '21 at 14:14
  • That's odd, because I ran it again and it works perfectly. I used the data you provided, so the issue may be that it differs from the data you are actually using. – MonJeanJean Jul 12 '21 at 14:24
  • As you just said above, it might be a package conflict: the `summarise` or `group_by` function being picked up from another package rather than from dplyr. – MonJeanJean Jul 12 '21 at 14:26
  • I found the problem. What I wrote above is only a portion of my code. In reality `crop_weight` is a combination of two columns, which I made with this code: `unite(., col = "crop_weight", crop, weight, na.rm=TRUE, remove = TRUE, sep = ": ")`. So I separated the call into two data frames (one that unites the columns, and one that uses your code to collapse the factor levels) and voilà! It worked as it should! Do you think that is a bug and I should report it, or just a strange error in my code? – Aislinn Pearson Jul 12 '21 at 14:29
  • Ok nice! That ain't a bug, that's normal given the un-combination of your columns and what is called inside `paste0`. – MonJeanJean Jul 12 '21 at 14:34
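
For reference, a minimal sketch of the two steps described in these comments: uniting two columns, then collapsing by group. The data frame `raw` and its values are hypothetical, loosely modelled on the strings in the comment above, and the thread does not show the asker's full pipeline, so this may not reproduce the original problem. It assumes tidyr is installed alongside dplyr.

```r
library(dplyr)
library(tidyr)

# Hypothetical raw data: column names follow the unite() call quoted above,
# values are invented for illustration
raw <- data.frame(
  proposal_number = c("Expt 1", "Expt 1", "Expt 2"),
  crop   = c("winter wheat", "spring beans", "winter oats"),
  weight = c("2,000g", "2,000g", "1,000g")
)

collapsed <- raw %>%
  # Step 1: build crop_weight from the two source columns, before any grouping
  unite(col = "crop_weight", crop, weight, sep = ": ", na.rm = TRUE, remove = TRUE) %>%
  # Step 2: collapse the united strings within each proposal
  group_by(proposal_number) %>%
  summarise(crop_weight = paste(crop_weight, collapse = ", "), .groups = "drop")

collapsed
```

With this toy data the two steps can be chained in one pipeline, provided `unite()` runs before the grouping; splitting them into two assignments, as the asker did, gives the same result.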