What I have:
From my original observations …
video_id user_id keyword
1 1 foo
2 1 bar
3 1 baz
4 1 yak
1 2 foo
2 2 bar
3 2 blah
4 2 yak
1 3 foo
2 3 bar
3 3 blah
4 3 yak
… I have a table with frequencies (called tab), and it is displayed in the exact format I want, e.g.
video_id foo bar baz yak blah
1 4 0 0 0 0
2 0 4 0 0 0
3 0 0 2 0 2
4 0 0 0 4 0
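For reference, a table of this shape can be produced with a two-way table() call. The sketch below is only an illustration: the data frame name obs is an assumption, and it contains just the twelve rows shown in the excerpt, so the counts differ from the example table. Note also that table() orders the keyword columns alphabetically.

```r
# Hypothetical name `obs`; only the rows visible in the excerpt are used.
obs <- data.frame(
  video_id = rep(1:4, times = 3),
  user_id  = rep(1:3, each = 4),
  keyword  = c("foo", "bar", "baz",  "yak",
               "foo", "bar", "blah", "yak",
               "foo", "bar", "blah", "yak")
)

# Two-way contingency table: one row per video_id, one column per keyword.
tab <- table(obs$video_id, obs$keyword)

# The video IDs are stored as row names, not as a column.
rownames(tab)
```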
I'd like to merge this table with an existing data frame (called data), based on the ID column. So, for example, it contains two other columns as well:
video_id col1 col2
1 123 412
2 652 633
3 749 144
4 1738 1763
What I need:
I need to merge the frequency table and the existing data frame based on the video ID. Note that it is not necessarily sorted, so I can't just cbind them. This is the result I need:
video_id col1 col2 foo bar baz yak blah
1 123 412 4 0 0 0 0
2 652 633 0 4 0 0 0
3 749 144 0 0 2 0 2
4 1738 1763 0 0 0 4 0
Now, I know I can get a data frame matrix like this:
as.data.frame.matrix(table(…))
But this data frame is missing the video_id column, which is actually displayed when I just view the table. So, how do I go about getting a data frame that still includes the video_id column (or the row names, that is)?
I need the video_id column to come first in the data frame, then the original columns, and then the tabular data appended, as seen in the example above.
What I've tried:
I know I can get the table's row names through rownames(table(…)), and I can get the result I want with
cbind(data.frame(video_id=rownames(tab)), as.data.frame.matrix(tab))
But this doesn't seem clean (enough) to me.
Merging directly with merge(data, as.data.frame.matrix(tab)) gives me all the results, but the video_id column is between the tabular data and the original data, so not in the correct order.
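One possible approach, sketched here and not necessarily the cleanest, is to let merge() match the video_id column of data against the row names of the converted table via by.y = "row.names". The values below are copied from the examples above; the rows of data are shuffled on purpose to mimic an unsorted data frame, and the names freq and result are made up for illustration.

```r
# Frequency table with the values from the example above.
tab <- as.table(matrix(
  c(4, 0, 0, 0, 0,
    0, 4, 0, 0, 0,
    0, 0, 2, 0, 2,
    0, 0, 0, 4, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("1", "2", "3", "4"),
                  c("foo", "bar", "baz", "yak", "blah"))
))

# Existing data frame, deliberately not sorted by video_id.
data <- data.frame(
  video_id = c(3, 1, 4, 2),
  col1     = c(749, 123, 1738, 652),
  col2     = c(144, 412, 1763, 633)
)

# Wide data frame of counts; the video IDs survive only as row names.
freq <- as.data.frame.matrix(tab)

# Match data$video_id against the row names of freq.
# merge() places the key column first, then the remaining columns
# of data, then the columns of freq.
result <- merge(data, freq, by.x = "video_id", by.y = "row.names")
result
```

Because merge() puts the key column first, followed by the remaining columns of its first argument and then those of its second, passing data first should give the column order asked for above.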