In R, I defined the following function:

    race_ethn_tab <- function(x) {
      x %>%
        group_by(RAC1P) %>%
        tally(wt = PWGTP) %>%
        print(n = 15)
    }

The function simply generates a weighted tally for a given dataset. For example, race_ethn_tab(ca_pop_2000) generates a simple 9 x 2 table:

1     Race 1 22322824
2     Race 2  2144044
3     Race 3   228817
4     Race 4     1827
5     Race 5    98823
6     Race 6  3722624
7     Race 7   116176
8     Race 8  3183821
9     Race 9  1268095

I have to do this for several distinct datasets (approx. 10), and it's easier for me to keep the dfs distinct rather than bind them and create a year variable. So I am trying to use either a for loop or purrr::map() to iterate through my list of dfs.

Here is what I tried:

    dfs_test <- as.list(as_tibble(ca_pop_2000), 
                        as_tibble(ca_pop_2001), 
                        as_tibble(ca_pop_2002), 
                        as_tibble(ca_pop_2003), 
                        as_tibble(ca_pop_2004))

# Attempt 1: Using for loop

    for (i in dfs_test) {
      race_ethn_tab(i)
    }

# Attempt 2: Using purrr::map

    race_ethn_outs <- map(dfs_test, race_ethn_tab)

Both attempts are telling me that group_by can't be applied to a factor object, but I can't figure out why the elements in dfs_test are being registered as factors given that I am forcing them into the tibble class. Would appreciate any tips based on my approach or alternative approaches that could make sense here.

Igor G
  • What exactly is in `ca_pop_2000`, `ca_pop_2001` etc? Your function references `RAC1P` but I don't see that anywhere else. It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Oct 02 '20 at 00:22
  • Your code should work if you use `list` instead of `as.list`. See output of `as.list(as_tibble(mtcars), as_tibble(iris))` vs `list(as_tibble(mtcars), as_tibble(iris))` – Ronak Shah Oct 02 '20 at 00:23
  • Thank you, @RonakShah! That did the trick. – Igor G Oct 05 '20 at 22:11

2 Answers

This, from @RonakShah, was exactly what was needed:

Your code should work if you use list instead of as.list. See output of as.list(as_tibble(mtcars), as_tibble(iris)) vs list(as_tibble(mtcars), as_tibble(iris)) – Ronak Shah Oct 2 at 0:23
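
To see why the error mentions a factor, it may help to compare the two calls on built-in data. This is a minimal sketch in base R, where `iris` and `mtcars` stand in for the `ca_pop_*` tibbles (an assumption for illustration only). `as.list()` converts just its first argument, so a data frame becomes a list of its columns and the remaining arguments are ignored, whereas `list()` creates one element per data frame.

```r
# Stand-ins for the real datasets (assumption: any data frames behave the same way)
wrong <- as.list(iris, mtcars)  # only `iris` is converted; `mtcars` is ignored
right <- list(iris, mtcars)     # one list element per data frame

length(wrong)         # 5: one element per column of iris
class(wrong$Species)  # "factor": a column, not a data frame
length(right)         # 2: one element per data frame

# Each element of `right` is a data frame, so group_by() has something to work on
all_data_frames <- vapply(right, is.data.frame, logical(1))
```

Iterating over `wrong` hands each column to the function in turn, which would be consistent with `group_by` complaining about a factor if one of the columns in `ca_pop_2000` is stored as a factor.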

Igor G

We can use mget to return a named list of datasets, then loop over the list and apply the function:

    dfs_test <- mget(paste0("ca_pop_", 2000:2004))

It can also be made more general if we use ls:

    dfs_test <- mget(ls(pattern = '^ca_pop_\\d{4}$'))
    map(dfs_test, race_ethn_tab)

This is easier when hundreds of objects already exist in the global environment, instead of typing out

    list(ca_pop_2000, ca_pop_2001, .., ca_pop_2020)
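
As a rough illustration of the `mget` approach, here is a small self-contained sketch in base R. The toy data frames, the environment `demo_env`, and the helper `weighted_tally` are all hypothetical; `aggregate()` is used as a base-R stand-in for the dplyr weighted tally in the question, and `lapply()` stands in for `map()`. A dedicated environment is used so the sketch does not depend on whatever else is in the global workspace.

```r
# Hypothetical toy data, placed in a separate environment for the demo
demo_env <- new.env()
demo_env$ca_pop_2000 <- data.frame(RAC1P = c("Race 1", "Race 2", "Race 1"),
                                   PWGTP = c(10, 5, 7))
demo_env$ca_pop_2001 <- data.frame(RAC1P = c("Race 1", "Race 2", "Race 2"),
                                   PWGTP = c(4, 6, 9))
demo_env$unrelated <- 1:3  # does not match the pattern, so it is skipped

# Collect every object named ca_pop_<4 digits> into a named list
dfs_test <- mget(ls(envir = demo_env, pattern = "^ca_pop_\\d{4}$"),
                 envir = demo_env)

# Base-R stand-in for the weighted tally: sum the weights within each group
weighted_tally <- function(x) {
  aggregate(PWGTP ~ RAC1P, data = x, FUN = sum)
}

# One result per dataset; the list keeps the object names as labels
tallies <- lapply(dfs_test, weighted_tally)
names(tallies)
```

Because `mget` returns a named list, each result stays labelled with the dataset it came from, which can be handy when the data frames are kept separate by year.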
akrun