I want to combine two dataframes of different sizes with a binary function (namely str_count()), such that the rows of df1 (containing regex) become columns of df2 (containing the text data).
library(dplyr)
# dummy data
df1 <-
tribble(
~regex_name, ~regex_data
, "reg1", "(\\w+ )"
, "reg2", "\\d+"
)
df2 <-
tribble(
~metadata, ~text
, "meta1", "text 1"
, "meta2", "text2 3 4"
)
# should result in something like
df1_2 <-
tribble(
~metadata, ~text, ~reg1, ~reg2
, "meta1", "text 1", 1, 1
, "meta2", "text2 3 4", 2, 3
)
What I've tried so far
After searching online for a bit, I think there are a few possible approaches, but each one either runs into a problem or seems to need unnecessary intermediate steps.
- a. Use a full_join() (join by= what, though?) b. Followed by tidyr::spread() (or pivot_wider()?)
- Use purrr::cross2() (or cross_dfr()) (but it gives the wrong structure?), followed by step b of the first approach
- Use some combination of purrr::map2() and mutate() (I've not been able to get this to work properly, and map2() requires the dataframes to be of the same length)
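For reference, the last idea might work if the mapping runs over the regexes only, instead of over both dataframes at once, because str_count() is already vectorised over the text column. Below is a minimal sketch under that assumption, using the dummy data from above. The intermediate names patterns, counts and result are only illustrative, and I have not checked how this behaves on larger data.

```r
library(dplyr)
library(purrr)
library(stringr)

# dummy data (same as above)
df1 <- tribble(
  ~regex_name, ~regex_data,
  "reg1", "(\\w+ )",
  "reg2", "\\d+"
)
df2 <- tribble(
  ~metadata, ~text,
  "meta1", "text 1",
  "meta2", "text2 3 4"
)

# Named character vector: names become the new column names
patterns <- setNames(df1$regex_data, df1$regex_name)

# For each single pattern, count matches in every row of df2$text.
# map() keeps the names, so the result is a named list of integer vectors.
counts <- patterns %>%
  map(~ str_count(df2$text, .x)) %>%
  as_tibble()

# Attach the count columns to the text data
result <- bind_cols(df2, counts)

result$reg1  # 1 2
result$reg2  # 1 3
```

This avoids both the join and the reshape step: the number of rows in df1 only determines how many columns are added, so the two dataframes do not need to be the same length.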
The regex is just an example (though it is also what I'm actually working with).
Also, although I'm using tidyverse libraries, any other elegant (simple?) solution that works is fine (I'm just prone to making mistakes if there are too many intermediate steps).