I want to remove a couple of words from the English stopword list in R.
Can anyone help me?
Thank you.
Easy! Use setdiff():
good.stopwords <- setdiff( bad.stop.words, c("omit", "these", "words") )
This will remove "omit", "these", and "words" from your list or vector.
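For anyone who wants to try this without a stopword list at hand, here is a minimal sketch. The contents of bad.stop.words are made up purely for illustration; substitute your own vector. One thing to be aware of: setdiff() also drops duplicate entries from the result, which normally does not matter for a stopword list.

```r
# Hypothetical stopword vector, used only for illustration
bad.stop.words <- c("a", "omit", "the", "these", "words", "of")

# setdiff() keeps the elements of the first vector that are not in the second
good.stopwords <- setdiff(bad.stop.words, c("omit", "these", "words"))
good.stopwords
# [1] "a"   "the" "of"
```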
You can use %in%.
my_vector_of_stop_words <- c("across", "actually", "after", "afterwards", "again")
my_words_to_omit <- c("after", "again")
my_vector_of_stop_words[!my_vector_of_stop_words %in% my_words_to_omit]
Result:
[1] "across" "actually" "afterwards"
If your stopwords are in a vector, we can just subset the vector.
Here I use stopwords() from the stopwords package for the purpose of examples:
library(stopwords)
my_stopwords <- stopwords()[1:10]
my_stopwords
[1] "i" "me" "my" "myself" "we" "our" "ours"
[8] "ourselves" "you" "your"
my_drop_words <- c("myself", "ourselves")
my_stopwords[!(my_stopwords %in% my_drop_words)]
[1] "i" "me" "my" "we" "our" "ours" "you" "your"
If you have your stopwords in a different format, or you desire your output to be different, you need to provide more details in your question.
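As a guess at one such different format: if the stopwords sit in a column of a data frame, the same %in% idea should carry over to the rows. This is only a sketch; the data frame stopword_df and its column name word are assumptions of mine, not something from the question.

```r
# Hypothetical data frame of stopwords; the column name `word` is an assumption
stopword_df <- data.frame(
  word = c("i", "me", "my", "myself", "we"),
  stringsAsFactors = FALSE
)
my_drop_words <- c("myself", "me")

# Keep only the rows whose word is not in the drop list;
# drop = FALSE keeps the result as a data frame even with a single column
kept_df <- stopword_df[!(stopword_df$word %in% my_drop_words), , drop = FALSE]
kept_df$word
# [1] "i"  "my" "we"
```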