The data frame df.freq below is full of words and their properties (e.g. frequency and length).
df.freq
'data.frame': 221324 obs. of 7 variables:
$ Word : Factor w/ 221324 levels "a","aa-class",..: 195399 6167 198867 90289 1 131901 91600 95885 195346 95685 ...
$ BlogFreqPm : num 48737 28649 27965 23737 23630 ...
$ TwitterFreqPm: num 30241 14145 25420 29598 19788 ...
$ NewsFreqPm : num 56009 25139 25590 5516 25291 ...
$ CumFreqPm : num 134987 67932 78975 58851 68709 ...
$ LogCumFreq : num 11.8 11.1 11.3 11 11.1 ...
$ Length : int 3 3 2 1 1 2 2 2 4 2 ...
I need to merge the columns LogCumFreq and Length from the data frame above into the data frame df.words below.
df.words
Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':
$ target : chr "HAT" "DEPART" "MUD" "LUST" ...
$ prime : chr "hat" "department" "muddy" "luster" ...
...
What I need is to apply merge so that, for each row of df.words, the variables LogCumFreq and Length from df.freq are each inserted in two different columns: one containing the value for the prime and one containing the value for the target.
I've tried to use merge for prime first and then for target, but since the two values are always on the same row, they overwrite each other. Does anybody know how to do this?
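One way the two-step merge could avoid the overwriting is to rename the lookup columns before each merge, so the prime and target values land in differently named columns. The following is only a minimal sketch in base R, using the small sample data from the EDIT further down (rebuilt here with character columns rather than factors). The helper names target.key and row.id are invented for the illustration, and lower-casing the targets before matching is an assumption about how they relate to the word column.

```r
# Sample data rebuilt from the dput output in this post (characters, not factors).
df.words <- data.frame(
  prime  = c("hat", "department", "muddy", "luster", "hunter"),
  target = c("HAT", "DEPART", "MUD", "LUST", "SPY"),
  stringsAsFactors = FALSE
)
df.freq <- data.frame(
  word   = c("hat", "department", "muddy", "luster", "hunter",
             "depart", "mud", "lust", "spy"),
  freq   = c(4.3, 5.323, 9.9, 2, 0.56, 4.5, 6.99, 10.88, 7),
  length = c(3L, 10L, 5L, 6L, 6L, 6L, 3L, 4L, 3L),
  stringsAsFactors = FALSE
)

# One renamed copy of the lookup table per role, so value columns cannot collide.
prime.lookup  <- setNames(df.freq, c("prime", "freq.prime", "length.prime"))
target.lookup <- setNames(df.freq, c("target.key", "freq.target", "length.target"))

# Assumption: targets match the lookup words only after lower-casing.
df.words$target.key <- tolower(df.words$target)  # hypothetical helper column
df.words$row.id     <- seq_len(nrow(df.words))   # merge() may reorder rows

merged <- merge(df.words, prime.lookup,  by = "prime",      all.x = TRUE)
merged <- merge(merged,   target.lookup, by = "target.key", all.x = TRUE)

# Restore the original row order and drop the helper columns.
merged <- merged[order(merged$row.id), ]
merged <- merged[, c("prime", "target", "freq.prime", "freq.target",
                     "length.prime", "length.target")]
rownames(merged) <- NULL
```

With all.x = TRUE, rows whose prime or target is missing from the lookup are kept and filled with NA instead of being dropped. For the full data, the same idea should carry over with Word, LogCumFreq and Length in place of word, freq and length.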
EDIT:
The dput output for small examples of the two data frames is below.
df.words <-
structure(list(prime = structure(c(2L, 1L, 5L, 4L, 3L), .Label = c("department",
"hat", "hunter", "luster", "muddy"), class = "factor"), target = structure(c(2L,
1L, 4L, 3L, 5L), .Label = c("DEPART", "HAT", "LUST", "MUD",
"SPY"), class = "factor")), class = "data.frame", row.names = c(NA,
-5L))
df.freq <-
structure(list(word = structure(c(3L, 2L, 8L, 6L, 4L, 1L, 7L,
5L, 9L), .Label = c("depart", "department", "hat", "hunter",
"lust", "luster", "mud", "muddy", "spy"), class = "factor"),
freq = c(4.3, 5.323, 9.9, 2, 0.56, 4.5, 6.99, 10.88, 7),
length = c(3L, 10L, 5L, 6L, 6L, 6L, 3L, 4L, 3L)), row.names = c(NA,
-9L), class = "data.frame")
The following is an example of the desired output:
df.words.freq <-
structure(list(prime = structure(c(2L, 1L, 5L, 4L, 3L), .Label = c("department",
"hat", "hunter", "luster", "muddy"), class = "factor"), target = structure(c(2L,
1L, 4L, 3L, 5L), .Label = c("DEPART", "HAT", "LUST", "MUD",
"SPY"), class = "factor"), freq.prime = c(4.3, 5.323, 9.9, 2,
0.56), freq.target = c(4.3, 4.5, 6.99, 10.88, 7), length.prime = c(3,
10, 5, 6, 6), length.target = c(3, 6, 3, 4, 3)), row.names = c(NA,
-5L), class = "data.frame")
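For reference, the values in the desired output can also be produced without merge, by looking up row positions with match(). This is a sketch rather than a definitive solution: it rebuilds the sample data with character columns instead of factors, and it assumes that each target corresponds to its lower-cased form in the word column. The names i.prime and i.target are introduced only for the illustration.

```r
# Sample data rebuilt from the dput output above (characters, not factors).
df.words <- data.frame(
  prime  = c("hat", "department", "muddy", "luster", "hunter"),
  target = c("HAT", "DEPART", "MUD", "LUST", "SPY"),
  stringsAsFactors = FALSE
)
df.freq <- data.frame(
  word   = c("hat", "department", "muddy", "luster", "hunter",
             "depart", "mud", "lust", "spy"),
  freq   = c(4.3, 5.323, 9.9, 2, 0.56, 4.5, 6.99, 10.88, 7),
  length = c(3L, 10L, 5L, 6L, 6L, 6L, 3L, 4L, 3L),
  stringsAsFactors = FALSE
)

# Row positions of each prime/target in the lookup table (NA if a word is absent).
i.prime  <- match(df.words$prime, df.freq$word)
# Assumption: targets match the lookup words only after lower-casing.
i.target <- match(tolower(df.words$target), df.freq$word)

# Index the lookup columns once per role; row order of df.words is preserved.
df.words.freq <- data.frame(
  prime         = df.words$prime,
  target        = df.words$target,
  freq.prime    = df.freq$freq[i.prime],
  freq.target   = df.freq$freq[i.target],
  length.prime  = df.freq$length[i.prime],
  length.target = df.freq$length[i.target],
  stringsAsFactors = FALSE
)
```

Because match() returns positions in the order of df.words, no re-sorting is needed afterwards, and the prime and target lookups stay independent of each other.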