
I would like to omit a couple of stopwords from the 'English' list of stopwords in R.

Can anyone help me?

Thank you.

  • `x[!x %in% not_wanted]` – IRTFM Mar 28 '21 at 23:20
  • Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including example data in a plain text format. In particular, "list" has a specific meaning in R, so it would help to see what your data looks like. – neilfws Mar 28 '21 at 23:49

3 Answers


easy!


good.stopwords <- setdiff( bad.stop.words, c("omit", "these", "words") )

This removes "omit", "these", and "words" from your list or vector.
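
One behaviour worth knowing here: `setdiff()` treats its inputs as sets, so it also drops duplicated entries from the result. A minimal sketch with made-up vectors (the names `words` and `drop` are illustrative and not taken from the question):

```r
# Hypothetical stop-word vector containing a duplicate entry
words <- c("omit", "the", "these", "the", "words", "and")
drop  <- c("omit", "these", "words")

# setdiff() removes the unwanted words and also de-duplicates the result
via_setdiff <- setdiff(words, drop)

# Logical subsetting with %in% keeps duplicates and the original order
via_in <- words[!words %in% drop]

print(via_setdiff)
print(via_in)
```

If duplicates matter for your data, the `%in%` form is probably the safer choice.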

Sirius

You can use %in%.

my_vector_of_stop_words <- c("across", "actually", "after", "afterwards", "again")

my_words_to_omit <- c("after", "again")

my_vector_of_stop_words[!my_vector_of_stop_words %in% my_words_to_omit]

Result:

[1] "across"     "actually"   "afterwards"
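
Note that `%in%` compares strings exactly, so "After" is not matched by "after". If capitalization may differ, one option is to compare lowercased copies while subsetting the original values. The vectors below are made up for illustration:

```r
# Hypothetical input with mixed capitalization
stop_words <- c("Across", "actually", "After", "afterwards", "AGAIN")
to_omit    <- c("after", "again")

# Compare lowercased copies, but subset the original vector
kept <- stop_words[!tolower(stop_words) %in% tolower(to_omit)]
print(kept)
```
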
neilfws
  • I haven't used the vector, so couldn't use this solution. Thanks a lot anyway! – novice_programmer Mar 28 '21 at 23:34
  • So what have you used? Show us example data in your question. A column of a data frame is also a vector, if that's the issue. Or is it a list as in an R list? – neilfws Mar 28 '21 at 23:38

If your stopwords are in a vector, we can just subset the vector.

Here I use stopwords() from the stopwords package to build an example.

library(stopwords)
my_stopwords <- stopwords()[1:10]
my_stopwords
#>  [1] "i"         "me"        "my"        "myself"    "we"        "our"       "ours"
#>  [8] "ourselves" "you"       "your"

my_drop_words <- c("myself", "ourselves")
my_stopwords[!(my_stopwords %in% my_drop_words)]
#> [1] "i"    "me"   "my"   "we"   "our"  "ours" "you"  "your"

If your stopwords are in a different format, or you want the output in a different form, you need to provide more details in your question.
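
As an illustration of other formats, two shapes the data might take are an R list and a data frame column. A rough sketch in base R, assuming hypothetical objects `stop_list` and `stop_df` (neither comes from the question):

```r
drop_words <- c("myself", "ourselves")

# Case 1: stop words stored as an R list of single strings (hypothetical)
stop_list <- list("i", "me", "myself", "we", "ourselves")
# Flatten to a character vector, then subset as usual
stop_vec <- unlist(stop_list)
kept_vec <- stop_vec[!stop_vec %in% drop_words]

# Case 2: stop words stored in a data frame column named `word` (hypothetical)
stop_df <- data.frame(word = c("i", "me", "myself", "we", "ourselves"),
                      stringsAsFactors = FALSE)
# Keep only the rows whose word is not in the drop list
kept_df <- stop_df[!stop_df$word %in% drop_words, , drop = FALSE]

print(kept_vec)
print(kept_df$word)
```

In both cases the filtering step is the same `%in%` test; only the way the words are pulled out of the container changes.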

Ben Norris