I have a data frame with about 10,000 words in one column and their corresponding frequencies in another. I also have a vector of about 600 words, each of which appears in the data frame. How do I look up the frequencies for the 600 words in the vector from the 10,000-word data frame?

Namenlos

2 Answers

One of many possible solutions, where df$words is the column of your data.frame that holds the words and wordsvector is your vector:

library(plyr)
# count how many rows each word occupies in the data.frame
freqwords <- ddply(df, .(words), summarize, n = length(words))
# keep only the words that appear in your vector
freqwords[freqwords$words %in% wordsvector, ]
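
If df already has one row per word with its frequency in a second column, as the question describes, a plain base R lookup with match() may be enough. This is only a sketch on made-up dummy data; the column name freq and the sample values are assumptions, not taken from the question.

```r
# Dummy data standing in for the 10,000-word data frame
# (the column name "freq" is an assumption)
df <- data.frame(
  words = c("apple", "banana", "cherry", "date"),
  freq  = c(10, 3, 7, 1),
  stringsAsFactors = FALSE
)

# Dummy stand-in for the 600-word vector
wordsvector <- c("cherry", "apple")

# match() gives, for each word in wordsvector, its row position in df$words
idx <- match(wordsvector, df$words)

# Pull the frequencies in the same order as wordsvector
result <- data.frame(words = wordsvector, freq = df$freq[idx],
                     stringsAsFactors = FALSE)
print(result)
```

Unlike filtering with %in%, this keeps the order of wordsvector, and any word missing from df would show up with an NA frequency rather than being dropped.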

Next time it would be helpful if you provide some dummy data so we can help you better.

user3640617

Use dplyr's join functions.

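library(dplyr)

# make the 600-word vector into a data frame with a column named "word"
# (R object names cannot start with a digit, hence df_600 rather than 600_df)
df_600 <- data.frame(word = vec_600, stringsAsFactors = FALSE)

# left join: keep every row of df_600 and pull in matching columns from df_10000
df <- left_join(x = df_600, y = df_10000, by = "word")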

where "word" is the column name shared by the two data frames.

sweetmusicality