Applying a function to each data.frame row and updating multiple column values

Question

I have a data.frame where each row is a tweet, and each column is an attribute ("text", "user", etc.).

I have written a function "processTweet()" that takes in a row of the data.frame and changes 3 columns in the tweet ("X", "Y" and "Z") and returns this modified single-row data.frame.

I'm currently trying to find out how to use something like dplyr or an apply-like function to actually reflect these modifications back in the original data.frame.

I'm aware that I could split the processTweet function into 3, but this would be inefficient since I'd have to do the same logical lookup multiple times.

I've tried using dplyr with rowwise(), but I'm obviously doing something wrong, as the changes are not reflected in the tweets data.frame: tweets %>% rowwise() %>% processTweet(). mutate(), on the other hand, seems to allow modifying one column, but not several.

Please provide example data and the expected result, following http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — akrun, May 02 '15 at 16:24
Currently I'm making do with a loop, but surely there must be something more efficient and elegant than: for (i in seq_len(nrow(tweets))) { tweets[i, ] = processTweet(tweets[i, ]) } — alexvicegrab, May 02 '15 at 16:28
try `tweets = apply(tweets,1,processTweet )` or possibly `tweets = do.call(rbind,apply(tweets,1,processTweet ))`, or (probably better) `vectorize()` your processTweet function. Otherwise post an example dataset. — Jthorpe, May 02 '15 at 18:07
Thanks, the first returns a list and the second does not seem to work, but I seem to have found a solution — alexvicegrab, May 02 '15 at 20:10
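
One likely reason the apply() attempt falls short: apply() coerces the data.frame to a matrix first, so the function receives a plain vector instead of a one-row data.frame. A minimal base R sketch of the row-by-row approach is below. The sample data and the body of processTweet are hypothetical stand-ins, since neither was posted in the thread.

```r
# Hypothetical sample data; the real tweets data.frame was not posted.
tweets <- data.frame(
  text = c("hello world", "R is fun"),
  user = c("alice", "bob"),
  X = NA_integer_,
  Y = NA_character_,
  Z = NA,
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for processTweet(): takes a one-row data.frame,
# does the shared computation once, fills X, Y and Z, and returns the row.
processTweet <- function(tweet) {
  n_chars <- nchar(tweet$text)  # shared computation, done once per row
  tweet$X <- n_chars
  tweet$Y <- toupper(tweet$user)
  tweet$Z <- n_chars > 10
  tweet
}

# Process each row as a one-row data.frame, then bind the rows back together.
rows <- lapply(
  seq_len(nrow(tweets)),
  function(i) processTweet(tweets[i, , drop = FALSE])
)
tweets <- do.call(rbind, rows)
```

Indexing with tweets[i, , drop = FALSE] keeps each piece a data.frame, so column types survive the round trip.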

score 0 · Answer 1 · edited May 23 '17 at 12:07

Seem to have found an answer using plyr

tweets = adply(.data = tweets, .margins = 1, .fun = processTweet)

but a dplyr implementation is still a mystery.

The following question/answer works when the result is saved into a single column, but it is unclear what to do when the function should return a whole data.frame: Applying a function to every row of a table using dplyr?
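
A minimal sketch of the adply() call in context, assuming the plyr package is installed. The sample data and the body of processTweet are hypothetical stand-ins, not the original code.

```r
library(plyr)

# Hypothetical sample data; the real tweets data.frame was not posted.
tweets <- data.frame(
  text = c("hello world", "R is fun"),
  user = c("alice", "bob"),
  X = NA_integer_,
  Z = NA,
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for processTweet(): receives a one-row data.frame
# and returns it with several columns updated.
processTweet <- function(tweet) {
  n_chars <- nchar(tweet$text)  # shared computation, done once per row
  tweet$X <- n_chars
  tweet$Z <- n_chars > 10
  tweet
}

# .margins = 1 splits the data.frame by row; the per-row results are
# combined back into a single data.frame.
tweets <- adply(.data = tweets, .margins = 1, .fun = processTweet)
```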

answered May 02 '15 at 20:11

alexvicegrab


score 0 · Answer 2 · answered May 02 '15 at 21:11

After some trial and a lot of error, the dplyr way that seems to work is:

tweets = as.data.frame(tweets %>% rowwise() %>% do(processTweet(.)) %>% rbind())
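
For comparison, when the per-tweet logic can be written on whole columns (the vectorised route suggested in the comments), a single mutate() call can set several columns at once, and later expressions can reuse a helper column computed earlier in the same call. This is a minimal sketch with hypothetical data and rules, not the original processTweet logic; it assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical sample data; the real tweets data.frame was not posted.
tweets <- data.frame(
  text = c("hello world", "R is fun"),
  user = c("alice", "bob"),
  stringsAsFactors = FALSE
)

tweets <- tweets %>%
  mutate(
    n_chars = nchar(text),  # shared computation, done once for all rows
    X = n_chars,
    Y = toupper(user),
    Z = n_chars > 10
  ) %>%
  select(-n_chars)          # drop the temporary helper column
```

This avoids both the row-by-row loop and the repeated lookup, at the cost of rewriting the function in terms of columns rather than single rows.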

alexvicegrab
