Using dplyr and tidyr, I'm trying to create a tidy version of a dataset in which rows can be missing depending on the values in certain columns. I wrote a function that builds the missing rows (filled with default data) and returns them in a new tbl_df (data.frame). I unit-tested it and it works with specific data.

However, when calling it from 'bind_rows', I get the following error: Error in data.frame(a, b, c,...: Object 'A' not found.

For example, my data looks like this:

A        B        C        D        E        ...
a1       b1       c1       d1       e1       ...
a2       b2       c2       d2       e2       ...
...

My code looks like this:

data_tidy <- data %>%
    <some other functions to clean up like 'mutate', 'filter', etc.> %>%
    bind_rows(myCustomFunction(A, B, C, D, E... ))

Any ideas what I'm doing wrong? I'm still new to R and dplyr/tidyr...

Note: If I remove the last call to 'bind_rows', the table is cleaned up as expected, with the proper A, B, C, etc. columns. My function also uses a 'for' loop, which I know might not be optimal, but for now I want to get this version working before I try to optimize (or vectorize) the code.

Thanks!

MrFlick
Martin
  • Why are you trying to pass a function to `bind_rows()`? What do you think that's going to do? It would help if you made your problem [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data and expected output for that input. I'm guessing you're using the wrong function here. – MrFlick Feb 26 '15 at 21:29
  • Each row in the data describes a contract year for a player. However, the original data I cleaned up stops at 2018 for every player, while some contracts go further (up to 2022, for example). The custom function creates the missing rows (in my example, 4 rows would be created: 2019, 2020, 2021 and 2022, with the relevant data for the other variables). The function is supposed to return those missing rows, and bind_rows is then used to add them to the original tidy data. Hopefully that makes sense. – Martin Feb 27 '15 at 15:33

1 Answer

In your call to foo %>% bind_rows(myCustomFunction(A, B, C, D, E... )), myCustomFunction(A, B, C, D, E... ) is being called as an ordinary R function, so A, B, C, D, E are looked up as variables in the calling environment. I think you're expecting it to be evaluated within the context of a dplyr function, as in mutate(x = myCustomFunction(A, B, C, D, E... )), where the arguments A, B, C, D, E would be replaced by columns from the data.frame that is passed as the implicit first argument thanks to the %>% operator.
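The difference can be reproduced with a small, self-contained sketch. The data frame `df` and the helper `make_rows` below are hypothetical stand-ins for the asker's data and myCustomFunction, and the sketch assumes no variables named A or B exist in the calling environment:

```r
library(dplyr)

df <- data.frame(A = c(1, 2), B = c(10, 20))

# Hypothetical stand-in for myCustomFunction: builds a data frame
# from the vectors it receives.
make_rows <- function(a, b) data.frame(A = a + 100, B = b + 100)

# Inside mutate(), A and B are resolved as columns of the piped data frame.
ok <- df %>% mutate(total = A + B)

# Inside bind_rows(), the argument is evaluated as an ordinary R call,
# so A and B are looked up in the calling environment and are not found.
res <- tryCatch(
  df %>% bind_rows(make_rows(A, B)),
  error = function(e) e
)

print(ok)
print(conditionMessage(res))
```

Here the mutate() call succeeds, while the bind_rows() call fails with an "object not found" error much like the one in the question.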

In short, you need to call myCustomFunction(A, B, C, D, E... ) in such a way that the arguments are scoped correctly, such as:

data_tidy <- data %>% 
    <some other functions to clean up like 'mutate', 'filter', etc.>

# do.call() passes the columns of data_tidy as named arguments
data_tidy <- bind_rows(data_tidy, do.call(myCustomFunction, data_tidy))
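As a rough, runnable illustration of this approach, here is a minimal sketch with made-up data. The column names (`player`, `year`, `end`) and the helper `missing_years` are assumptions for illustration only, not the asker's actual data or function. Note that do.call() passes every column as a named argument, so the function's argument names have to match the column names:

```r
library(dplyr)

# Toy data: one row per player and contract year (illustrative columns).
contracts <- tibble(
  player = c("p1", "p1", "p2"),
  year   = c(2017, 2018, 2018),
  end    = c(2020, 2020, 2018)
)

# Hypothetical stand-in for myCustomFunction: receives whole columns and
# returns the rows for contract years after the last year present.
missing_years <- function(player, year, end) {
  last  <- vapply(split(year, player), max, numeric(1))  # last year present
  final <- vapply(split(end, player), max, numeric(1))   # contract end year
  rows <- lapply(names(last), function(p) {
    if (final[[p]] <= last[[p]]) return(NULL)            # nothing missing
    tibble(player = p, year = seq(last[[p]] + 1, final[[p]]), end = final[[p]])
  })
  bind_rows(rows)
}

# do.call() spreads the columns of `contracts` over the function's arguments.
extra <- do.call(missing_years, contracts)
full  <- bind_rows(contracts, extra)

# Alternative that stays in the pipe: `.` refers to the piped data frame.
full2 <- contracts %>% bind_rows(do.call(missing_years, .))
```

Both variants append the generated rows to the original data. The piped form relies on magrittr's `.` placeholder, which still inserts the left-hand side as the first argument when `.` only appears inside a nested call.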
Jthorpe
  • Thanks - I will investigate that option. Using the incorrect environment sounds like a plausible cause. – Martin Feb 27 '15 at 15:37