I need to check whether values from a number of different columns appear in a character string in a field that contains an item name. If a value appears in this item name column, it needs to be extracted and placed in a new column. I need to search one column at a time, so it will look like this: in column A, search for the unique values from column B. I'll need to repeat this several times: column A will always be the same, but the set of unique values being searched for will change each time. Here's some example data:
Col_A <- c("blue shovel 1024", "red shovel 1022", "green bucket 3021",
"green rake 3021", "yellow shovel 1023")
Col_B <- c("blue", "red", "green", "blue", "yellow")
df <- data.frame(Col_A, Col_B)
print(df)
               Col_A  Col_B  Col_C (output column)
1   blue shovel 1024   blue   blue
2    red shovel 1022    red    red
3  green bucket 3021  green  green
4    green rake 3021   blue  green
5 yellow shovel 1023 yellow yellow
In the above case, I want to search for the unique values from Col_B in Col_A and, if any are found, place them in a new column (Col_C). If it doesn't find a value, or the value isn't what is expected (i.e., row 4), that's ok. I'm just trying to figure out how to make this happen.
I've tried using mutate and str_extract as follows:
mutate(New_Col = str_extract(Col_A, unique_Col_B_vals))
But I'm not really having any luck. Sometimes it will return a value I would expect, and other times it returns a value that doesn't make sense. For reference, "unique_Col_B_vals" above is a data frame. Wondering if maybe that is part of the problem?
I'm not dead set on this approach, so if there is a far better way to search over a set of unique values from one column in another column, I am all ears. Thanks!
*Edit
The dataset I'm working with has a lot of issues with consistency. Values in Col_A are much longer in the dataset and are supposed to be made up of different values from multiple fields (basically like a concatenate), but we know this is not happening correctly in many cases. So I'm taking unique values from various fields (e.g. Col_B) and seeing if one of those unique values pops up in Col_A. If it does, I want to extract that and bring it to a new column (Col_C) so that I can compare what is in Col_B vs what was extracted from Col_A.
Also for clarity's sake, what I want to happen is that for each value in Col_A, search through all the unique values in Col_B and extract whatever is found to Col_C.
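To make the intended behavior concrete, here is a minimal base R sketch of it on the example data. It is only an illustration, not necessarily the right approach: it assumes the unique values contain no regex metacharacters, that returning only the first match per row is acceptable, and the name `pattern` is just something I made up for the example.

```r
Col_A <- c("blue shovel 1024", "red shovel 1022", "green bucket 3021",
           "green rake 3021", "yellow shovel 1023")
Col_B <- c("blue", "red", "green", "blue", "yellow")
df <- data.frame(Col_A, Col_B, stringsAsFactors = FALSE)

# Assumption: the values are plain words, so joining them with "|"
# gives a valid regex alternation: "blue|red|green|yellow".
pattern <- paste(unique(df$Col_B), collapse = "|")

# regexpr() locates the first match in each string; -1 means no match.
m <- regexpr(pattern, df$Col_A)

# Rows without a match stay NA; matched rows get the extracted value.
df$Col_C <- NA_character_
df$Col_C[m != -1] <- regmatches(df$Col_A, m)

print(df)
```

On the example data this fills Col_C with the value found in Col_A for each row, including "green" for row 4 even though Col_B says "blue".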
I've tried the following as well, but get an error:
uniquevals <- list(unique(df$Col_B))
df <- df %>%
mutate(Col_C = str_extract(Col_A, uniquevals))
Error: Problem with `mutate()` column `Col_C`.
i `Col_C = str_extract(Col_A, uniquevals)`.
x no applicable method for 'type' applied to an object of class "list"