How to omit a specific stopword from a list of stopwords in R?

Question

I want to omit a couple of stopwords from the English stopword list in R.

Can anyone help me?

Thank you.

Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including example data in a plain text format. In particular, "list" has a specific meaning in R, so it would help to see what your data looks like. — neilfws, Mar 28 '21 at 23:49

score 1 · Answer 1 · answered Mar 28 '21 at 23:20

Easy!


good.stopwords <- setdiff(bad.stop.words, c("omit", "these", "words"))

This will remove "omit", "these", and "words" from your list or vector.
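As a minimal runnable sketch of the same idea: the contents of bad.stop.words below are made up for illustration, since the question does not show the actual data.

```r
# Hypothetical starting vector; real data might come from a stopword package.
bad.stop.words <- c("a", "omit", "the", "these", "of", "words")

# setdiff() keeps the elements of the first argument that do not appear in the second.
good.stopwords <- setdiff(bad.stop.words, c("omit", "these", "words"))
good.stopwords
```

Note that setdiff() also drops duplicate entries from its result, which is usually harmless for a stopword vector.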

Sirius


Thanks so much. It was much easier. – novice_programmer Mar 28 '21 at 23:33

score 1 · Answer 2 · answered Mar 28 '21 at 23:21

You can use %in%.

my_vector_of_stop_words <- c("across", "actually", "after", "afterwards", "again")

my_words_to_omit <- c("after", "again")

my_vector_of_stop_words[!my_vector_of_stop_words %in% my_words_to_omit]

Result:

[1] "across"     "actually"   "afterwards"

neilfws


I haven't used the vector, so couldn't use this solution. Thanks a lot anyway! – novice_programmer Mar 28 '21 at 23:34
So what have you used? Show us example data in your question. A column of a data frame is also a vector, if that's the issue. Or is it a list as in an R list? – neilfws Mar 28 '21 at 23:38

score 1 · Answer 3 · answered Mar 28 '21 at 23:24

If your stopwords are in a vector, we can just subset the vector.

Here I use stopwords() from the stopwords package for the purposes of this example.

library(stopwords)
my_stopwords <- stopwords()[1:10]
my_stopwords
 [1] "i"         "me"        "my"        "myself"    "we"        "our"       "ours"      
 [8] "ourselves" "you"       "your" 

my_drop_words <- c("myself", "ourselves")
my_stopwords[!(my_stopwords %in% my_drop_words)]
[1] "i"    "me"   "my"   "we"   "our"  "ours" "you"  "your"

If you have your stopwords in a different format, or you desire your output to be different, you need to provide more details in your question.
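For instance, if the stopwords are stored in an R list or in a data frame column rather than a plain vector, the same %in% subsetting still applies. The sketch below uses base R only; the object names (stopword_list, stopword_df, drop_words) and their contents are hypothetical, since the question does not show its data.

```r
# Hypothetical data: the same stopwords stored as an R list and as a data frame column.
stopword_list <- list("i", "me", "myself", "we")
stopword_df <- data.frame(word = c("i", "me", "myself", "we"),
                          stringsAsFactors = FALSE)
drop_words <- c("myself")

# A list of single strings can be flattened to a character vector first.
stopword_vec <- unlist(stopword_list)
kept_from_list <- stopword_vec[!(stopword_vec %in% drop_words)]

# For a data frame, keep only the rows whose word is not in drop_words.
kept_df <- stopword_df[!(stopword_df$word %in% drop_words), , drop = FALSE]
```

Here drop = FALSE keeps the result a data frame even though it has a single column.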

I haven't used the vector, so couldn't use this solution. Thanks a lot anyway! — novice_programmer, Mar 28 '21 at 23:34
