I'm new here. I've used this forum a lot and can usually solve my problems from existing questions, but not this time. I have two data frames: one (df1) with news articles and one (df2) with company names.

df1$articles: the news articles, one per cell, with the whole text in a single column
df1$tags: only the tags of the article
df2$names: company names

I want to check whether those company names occur in the news articles data set and, if so, get a TRUE/FALSE or 0/1 variable.

I tried identical(df1$tags, df2$names), but it returns FALSE, even though it should be TRUE for some values.
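For context, a minimal sketch of why `identical` gives a single FALSE here: it compares the two objects as a whole rather than element by element. The toy vectors below are made up for illustration and only stand in for df1$tags and df2$names; `%in%` is one base R way to get one TRUE/FALSE per tag.

```r
# Hypothetical toy data standing in for df1$tags and df2$names
tags          <- factor(c("Philips", "Euroland", "Shell"))
company_names <- factor(c("Philips", "Shell"))

# identical() compares the whole objects (length, levels, attributes),
# so it yields one FALSE unless both are exactly the same object
whole_match <- identical(tags, company_names)

# %in% checks each tag against the set of names, one result per element;
# as.character() makes the comparison independent of the factor levels
tag_found <- as.character(tags) %in% as.character(company_names)

print(whole_match)  # FALSE
print(tag_found)    # TRUE FALSE TRUE
```

Note that this only covers exact matches on the tag; it does not look inside the article text.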

I also tried a for loop:

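for (i in seq_along(df1$tags))
{
  for (j in seq_along(df2$names))
  {
    # compare as character: identical() on factors also compares their levels
    if (identical(as.character(df1$tags[i]), as.character(df2$names[j])))
    {
      print("i found something")
    }
  }
}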

Could someone help me out? Much appreciated!

Example:
df1$articles: body of the article, e.g. "Nederlanders kunnen weer vanaf 1 maart tot 1 mei aan....."
df1$tags: tags of the article, e.g. Philips
df2$names: here I have Philips as one of the company names

I want to see whether the company names in df2$names occur in df1$articles or df1$tags.
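One possible way to express this check in base R, as a sketch rather than a definitive solution: for every article, test whether any company name appears as a literal substring of the article text or of the tags. The data frames and the helper `contains_any` below are hypothetical stand-ins, not the real data.

```r
# Hypothetical stand-ins for the real data frames
df1 <- data.frame(
  articles = c("Philips opens a new factory",
               "Nederlanders kunnen weer vanaf 1 maart"),
  tags     = c("Philips", "Belasting"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(names = c("Philips", "Shell"), stringsAsFactors = FALSE)

# TRUE for each text that contains at least one of the patterns.
# fixed = TRUE treats the names as literal strings, not regular expressions.
contains_any <- function(text, patterns) {
  hits <- sapply(patterns, function(p) grepl(p, text, fixed = TRUE))
  # force a texts-by-patterns matrix, even when there is only one text
  hits <- matrix(hits, nrow = length(text))
  rowSums(hits) > 0
}

df1$in_article <- contains_any(df1$articles, df2$names)
df1$in_tags    <- contains_any(df1$tags, df2$names)
df1$match      <- as.integer(df1$in_article | df1$in_tags)  # 0/1 variable

print(df1[, c("tags", "in_article", "in_tags", "match")])
```

The search is case-sensitive and matches substrings, so a short name could also hit inside a longer word; whether that is acceptable depends on the data.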

[photo of df1] [photo of df2]

Final dput example data:

    df1 <- structure(list(id = 1:2, body = structure(1:2, .Label = c("Dinsdag werd bekend dat de Euroland door de",
    "Ieder jaar verandert er wel iets in de belastingaangifte"), class = "factor"),
        tags = structure(1:2, .Label = c("Belastingaangifte", "Euroland"
        ), class = "factor")), .Names = c("id", "body", "tags"), row.names = c(NA,
    -2L), class = "data.frame")

    df2 <- structure(list(id = 1:2, names = structure(c(1L, 1L), .Label = "Belastingaangifte", class = "factor"),
        names1 = structure(c(1L, 1L), class = "factor", .Label = "Belastingaangifte")), .Names = c("id",
    "names", "names1"), row.names = c(NA, -2L), class = "data.frame")
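
Applied to the dput data above, a hedged sketch: the columns are factors, so the example rebuilds them as character, and the body text spells "belastingaangifte" in lower case, so a case-sensitive search would miss it. Lower-casing both sides is an assumption about what should count as a match, and the column names `tag_match` and `body_match` are made up for illustration.

```r
# The example data from the dput output, rebuilt with character columns
df1 <- data.frame(
  id   = 1:2,
  body = c("Dinsdag werd bekend dat de Euroland door de",
           "Ieder jaar verandert er wel iets in de belastingaangifte"),
  tags = c("Belastingaangifte", "Euroland"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  id    = 1:2,
  names = c("Belastingaangifte", "Belastingaangifte"),
  stringsAsFactors = FALSE
)

# Unique company names, lower-cased so matching ignores case (an assumption)
company <- unique(tolower(df2$names))

# Exact match on the tag
df1$tag_match <- tolower(df1$tags) %in% company

# Substring match in the body: TRUE if any company name occurs in the text
df1$body_match <- vapply(
  tolower(df1$body),
  function(txt) any(vapply(company, grepl, logical(1), x = txt, fixed = TRUE)),
  logical(1),
  USE.NAMES = FALSE
)

print(df1[, c("id", "tag_match", "body_match")])
```

Here row 1 matches on its tag and row 2 matches only in the body, which is the kind of case an exact comparison on tags alone would not find.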
Valentino
  • Can you give some example data and the desired output? – Edwin May 02 '17 at 11:55
  • Do not use the [rstudio] tag for general R questions. – Spacedman May 02 '17 at 12:03
  • @Edwin I tried to include an example, hope this works – Valentino May 02 '17 at 12:17
  • Follow these two links: [Minimal, Complete and Verifiable Example](https://stackoverflow.com/help/mcve) and [Great R reproducible Example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Sotos May 02 '17 at 12:21
  • Your example is not reproducible. You need to provide an extract of your data (e.g. with `dput`). See [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Maybe `grep` is what you need. – Gilles San Martin May 02 '17 at 12:21
  • @Gilles I am figuring out how to do that, but in the meantime I have also included 2 photos of the two data frames and their structures. – Valentino May 02 '17 at 12:39
  • I used dput; I think this is now reproducible. – Valentino May 02 '17 at 13:07
  • With the data provided this will work: `df1$tags %in% df2$names`, but this will probably not work with the rest of your data (as seen from the pictures). – Gilles San Martin May 02 '17 at 13:28
  • @Gilles thank you, I tried that earlier, before the for loops, and ran into the same problem you were thinking of. It doesn't run through the whole data frame and gives me FALSE for every row, whereas some should be TRUE. – Valentino May 02 '17 at 20:20

0 Answers