I have a data frame with a column named CHR, which takes discrete values from 1 to 18 (1, 2, 3 ...).

I want to split the data frame into one subset per value of CHR. So far my (working) code looks like this:

CH1<-boxplot %>% filter(CHR == "1")
CH2<-boxplot %>% filter(CHR == "2")
CH3<-boxplot %>% filter(CHR == "3")
               .
               .
               .
CH18<-boxplot %>% filter(CHR == "18")

It does get the job done, but I'm very annoyed whenever my code looks like that. I want to learn one "proper" way so I can apply it to multiple other similar cases.

RFA_

3 Answers

You have a few options:

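1. Write a function. You will still have one line per subset, but each line is shorter.

# Return the rows of `boxplot` whose CHR equals `chr`
bx_filter <- function(boxplot, chr) {
  boxplot %>% filter(CHR == chr)
}

# Pass the data frame first, then the CHR value to keep
CH1 <- bx_filter(boxplot, "1")
CH2 <- bx_filter(boxplot, "2")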

2. Use split(), which returns a list where each element is the data frame for one value of CHR:

split(boxplot, boxplot$CHR)
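
As a rough sketch of how the list returned by split() can be used afterwards: the toy data below is made up and only stands in for the question's `boxplot` data frame, and `by_chr`, `ch2` and `means` are illustrative names. The list elements are named after the CHR values, so a single subset can be pulled out by name and the same operation can be applied to all subsets at once.

```r
# Toy stand-in for the question's `boxplot` data frame (values are made up).
boxplot <- data.frame(CHR = rep(1:3, each = 2),
                      value = c(10, 12, 20, 22, 30, 32))

# One data frame per CHR value; the list names are the CHR values as strings.
by_chr <- split(boxplot, boxplot$CHR)

# Pull out a single subset by name (the rows where CHR is 2).
ch2 <- by_chr[["2"]]

# Apply the same step to every subset, here the mean of `value` per CHR.
means <- sapply(by_chr, function(df) mean(df$value))
```

Keeping the subsets in one list means adding a 19th CHR value needs no new variable, only the same code run again.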

3. Use a combination of map() and assign(), although writing to the global environment like this is generally frowned upon:

unique(boxplot$CHR) %>%
  map(function(chr) {
    assign(paste0('CH', chr), boxplot %>% filter(CHR == chr), envir = .GlobalEnv)
  })
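
If the global environment is to be avoided, one possible variation on option 3 keeps the same map() call but collects the results in a named list instead of calling assign(). This is only a sketch: the toy tibble stands in for the question's `boxplot`, and `chr_values` and `ch_list` are hypothetical names chosen for illustration.

```r
library(dplyr)
library(purrr)

# Toy stand-in for the question's `boxplot` data frame (values are made up).
boxplot <- tibble(CHR = rep(1:3, each = 2), value = 1:6)

chr_values <- sort(unique(boxplot$CHR))

# Name each CHR value "CH1", "CH2", ... so that map() returns a named list
# holding one filtered data frame per value.
ch_list <- chr_values %>%
  set_names(paste0("CH", chr_values)) %>%
  map(function(chr) filter(boxplot, CHR == chr))

# Access a subset by the name it would have had as a global variable.
ch_list$CH2
```

The subsets are then reachable as `ch_list$CH1`, `ch_list$CH2`, and so on, without creating 18 separate objects.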
Harrison Jones
  • The 3rd one is exactly what I was looking for. Why is writing directly to the global environment not recommended, and what would be an alternative? I'm thinking of a list, but I have usually had "bad" experiences working with lists, especially when plotting. – RFA_ Jun 08 '22 at 11:47
  • This post is a great read to answer your question on `assign()`: https://stackoverflow.com/questions/54064394/when-is-rs-assign-function-appropriate – Harrison Jones Jun 08 '22 at 12:14
  • My experience with lists is that they were slow and tough going at first (I understand your pain), but in the long run they became my go-to object for managing data like that in your question. – Harrison Jones Jun 08 '22 at 12:15
  • Another good question on `assign()`: https://stackoverflow.com/questions/17559390/why-is-using-assign-bad – Harrison Jones Jun 08 '22 at 12:16
  • I understand. Could you provide an example or modification where, instead of being written to the global environment, the data frames are returned as a list? Thanks. – RFA_ Jun 08 '22 at 12:57
  • The 2nd option I provided does that. The other answers, from Carl and Sotos, also return lists and are good options in my opinion. – Harrison Jones Jun 08 '22 at 14:45

group_split is one option:

library(tidyverse)

list <- tibble(chr = rep(1:18, 2), value = 1:36) |> 
  group_split(chr)

head(list)
#> <list_of<
#>   tbl_df<
#>     chr  : integer
#>     value: integer
#>   >
#> >[6]>
#> [[1]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     1     1
#> 2     1    19
#> 
#> [[2]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     2     2
#> 2     2    20
#> 
#> [[3]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     3     3
#> 2     3    21
#> 
#> [[4]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     4     4
#> 2     4    22
#> 
#> [[5]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     5     5
#> 2     5    23
#> 
#> [[6]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     6     6
#> 2     6    24

Created on 2022-06-08 by the reprex package (v2.0.1)

Carl

Loop over the unique values of CHR:

lapply(unique(boxplot$CHR), function(i) filter(boxplot, CHR == i))
Sotos