Using pipeline features of dplyr to export data as data.frame object in R

Question

I have data in a data.frame and I am trying to use the pipe feature of the dplyr package to chain a few operations in R. For example, given a data frame, I first want to subset it and then export the result as a CSV file. I am still learning dplyr, so I do not fully understand how this works. Any help? Here is a simple reproducible example:

library(GenomicRanges)  # provides GRanges, Rle and IRanges
a <- GRanges(
  seqnames=Rle(c("chr1", "chr2", "chr3", "chr4"), c(3, 2, 1, 2)),
  ranges=IRanges(seq(1, by=9, len=8), seq(7, by=9, len=8)),
  rangeName=letters[seq(1:8)], score=sample(1:20, 8, replace = FALSE))

I do the subsetting first:

a %>% subset(pvalue < 1e-4 & pvalue > 1e-9)

Then I want to chain several operations using dplyr's pipe:

a %>% subset(pvalue < 1e-4 & pvalue > 1e-9) %>% write.table(x, "foo.csv") %>% as.data.frame(x)

but I get an error at the second step. If I need to chain several operations, where the result of one step is used in the next, how can I do this in R with dplyr? Thanks

If you need to write out in the middle of a chain, use magrittr's `%T>%` – Frank Jun 15 '16 at 10:45

score 3 · Answer 1 · edited May 23 '17 at 11:58

To make your example reproducible, here is the same idea using iris:

library(dplyr)
iris %>% filter(Sepal.Length > 5.2) %>% write.table("foo.csv")

Some side remarks:

subset is more of a base R approach. Why not use dplyr's verbs, e.g. filter, select, etc.?
The pipe (by now more a magrittr operator than a dplyr one) passes the left-hand side as the first argument of the function on the right-hand side, so write.table(x, ...) cannot work as intended: the piped data already fills the first argument, and x becomes an extra, unintended one.
As dplyr works with data.frames, you do not need as.data.frame.
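
The remarks above, together with Frank's `%T>%` suggestion in the comments, can be sketched as follows. This is only a minimal sketch on iris, assuming dplyr and magrittr are installed. The output path `out_file` is a hypothetical temporary file chosen for illustration, and write.csv is swapped in for write.table so that the file is actually comma-separated.

```r
library(dplyr)
library(magrittr)  # attaching magrittr makes the tee pipe %T>% available

# Hypothetical output path, used only for this illustration
out_file <- file.path(tempdir(), "foo.csv")

result <- iris %>%
  filter(Sepal.Length > 5.2) %T>%              # tee: the next call runs for its side effect only
  write.csv(out_file, row.names = FALSE) %>%   # this pipe forwards the filtered data, not NULL
  select(Species, Sepal.Length)                # the chain continues on the filtered data
```

With a plain `%>%` in front of write.csv, the step after it would receive write.csv's return value (NULL) instead of the data, which is why the tee pipe is used at that point.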

answered Jun 15 '16 at 10:42

Vincent Bonhomme

Thank you for the quick response. Is there an existing thread that explains dplyr's pipe feature in more detail? I would like to read it and learn. – Jun 15 '16 at 11:02
See link above/[`magrittr`'s vignette](https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html). The perfect place to start ;-) – Vincent Bonhomme Jun 15 '16 at 11:12
Thanks, I will go through with it. – Jun 15 '16 at 12:59

score 0 · Accepted Answer · answered Jun 15 '16 at 11:50

If you want to extract several different subsets and write them out, you may want to use group_by and do. First create a categorical variable that splits up your data into the subsets you want. Here's an example that works:

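library(dplyr)

iris %>%
  mutate(
    slcat    = cut(Sepal.Length, c(0, 4, 5, 6, 8)),  # bin Sepal.Length into subsets
    filename = paste0("file", slcat, ".csv")         # one file name per bin
  ) %>%
  group_by(slcat) %>%
  do(result = write.csv(., file = .$filename[1]))    # write each group to its own file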
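
As a possible alternative to do(), the same per-group export can be sketched with group_walk, which dplyr (0.8.0 and later) provides for running a function on each group for its side effects. This is not part of the original answer, only a hedged sketch: `out_dir` is a hypothetical temporary directory, and the argument names `rows` and `key` are chosen for illustration.

```r
library(dplyr)

# Hypothetical output directory, used only for this illustration
out_dir <- tempdir()

iris %>%
  mutate(slcat = cut(Sepal.Length, c(0, 4, 5, 6, 8))) %>%
  group_by(slcat) %>%
  group_walk(function(rows, key) {
    # rows: the data for one group (grouping column dropped by default)
    # key:  a one-row tibble holding that group's slcat value
    out_file <- file.path(out_dir, paste0("file", key$slcat, ".csv"))
    write.csv(rows, out_file, row.names = FALSE)
  })
```

Each non-empty bin ends up in its own file, named after the bin label (for example `file(4,5].csv`), and no helper filename column is needed in the data.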

I have already accepted Vincent's answer. In this circumstance, should I accept yours as well? Both answers work for me. Thank you. – Jun 15 '16 at 12:48
this one is fine ;-) – Vincent Bonhomme Jun 15 '16 at 13:01
