I'm trying to create a new column in my tibble that collects and formats all the words found in the other columns of the same row. I would like to do this using dplyr, if possible.
Original data frame:
df <- read.table(text = " columnA columnB
1 A Z
2 B Y
3 C X
4 D W
5 E V
6 F U " )
As a simplified example, I am hoping to do something like:
df %>%
rowwise() %>%
mutate(newColumn = myFunc(.))
And have the output look like this:
columnA columnB newColumn
1 A Z AZ
2 B Y BY
3 C X CX
4 D W DW
5 E V EV
6 F U FU
When I try this in my code, the output looks like:
columnA columnB newColumn
1 A Z ABCDEF
2 B Y ABCDEF
3 C X ABCDEF
4 D W ABCDEF
5 E V ABCDEF
6 F U ABCDEF
myFunc should take a single row as its argument, but when I use rowwise() the entire tibble seems to be passed into the function (I can see this by adding a print() call inside myFunc).
How can I pass just one row and do this iteratively so that it applies the function to every row? Can this be done with dplyr?
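A minimal reproducible sketch of the behaviour described above, assuming dplyr >= 1.0.0 and the magrittr pipe. The body of myFunc here is a hypothetical stand-in (it just pastes columnA together), not the real function. The last pipeline is shown only for contrast: c_across() is one dplyr helper that sees just the current row, though it may not be the best fit for a function that expects a one-row data frame.

```r
library(dplyr)

# Same data as above, built directly so the columns are plain character vectors.
df <- data.frame(
  columnA = c("A", "B", "C", "D", "E", "F"),
  columnB = c("Z", "Y", "X", "W", "V", "U"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for myFunc: reports what it receives, then pastes columnA.
myFunc <- function(row) {
  print(dim(row))                      # shows 6 2: the whole tibble, not one row
  paste(row$columnA, collapse = "")
}

# Reproduces the problem: `.` is the pipe's left-hand side (all six rows),
# so every row gets the same value.
whole <- df %>%
  rowwise() %>%
  mutate(newColumn = myFunc(.)) %>%
  ungroup()

# For contrast: c_across() returns only the current row's values under rowwise().
per_row <- df %>%
  rowwise() %>%
  mutate(newColumn = paste0(c_across(everything()), collapse = "")) %>%
  ungroup()
```

In this sketch whole$newColumn repeats one value on every row, while per_row$newColumn holds the per-row concatenation that the desired output shows.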
Edit:
myFunc in the example is simplified for the sake of my question. The actual function looks like this:
get_chr_vector <- function(row) {
  row <- row[, 2:ncol(row)]                    # I need to skip the first column
  words <- str_c(row, collapse = ' ')          # str_* functions are from stringr
  words <- str_to_upper(words)
  words <- unlist(str_split(words, ' '))
  words <- words[words != '']                  # drop empty strings
  words <- words[!nchar(words) <= 2]           # keep only words longer than 2 characters
  words <- removeWords(words, stopwords_list)  # from the tm library
  words <- paste(words, sep = ' ', collapse = ' ')
}