I'm looking for an easy way to add custom stopwords to spaCy. The stopwords should be determined from how frequently the words occur across the whole corpus. For example, in my domain-specific texts the term "patient" should be treated as a stopword, because it occurs in 70% of all documents.
My first idea was to implement this with pandas apply, but that would require writing my own tokenizing function. Is there a way to customize spaCy's stopword list instead?
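To make the goal concrete, here is a rough sketch of what I have in mind. The helper names (frequent_terms, add_stop_words, build_pipeline_with_corpus_stop_words) and the 0.7 threshold are my own placeholders, not part of spaCy. I am assuming that nlp.Defaults.stop_words and the is_stop flag on vocabulary entries are the right places to hook in, and that a blank English pipeline is enough for tokenizing.

```python
from collections import Counter


def frequent_terms(tokenized_docs, min_doc_ratio=0.7):
    """Return the terms that occur in at least `min_doc_ratio` of the documents."""
    tokenized_docs = list(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        # set() so a term is counted once per document (document frequency),
        # no matter how often it repeats inside that document.
        doc_freq.update(set(tokens))
    n_docs = len(tokenized_docs)
    return {term for term, df in doc_freq.items() if df / n_docs >= min_doc_ratio}


def add_stop_words(nlp, words):
    """Mark `words` as stop words on an existing spaCy pipeline (assumed approach)."""
    for word in words:
        nlp.Defaults.stop_words.add(word)
        nlp.vocab[word].is_stop = True


def build_pipeline_with_corpus_stop_words(texts, min_doc_ratio=0.7):
    """Tokenize `texts` with spaCy and register the frequent terms as stop words."""
    import spacy  # imported here so the counting helper works without spaCy

    nlp = spacy.blank("en")  # tokenizer only, no trained model needed
    tokenized = [
        [token.lower_ for token in doc if token.is_alpha]
        for doc in nlp.pipe(texts)
    ]
    add_stop_words(nlp, frequent_terms(tokenized, min_doc_ratio))
    return nlp
```

With this, I would call build_pipeline_with_corpus_stop_words(df["text"]) once on the corpus column (the column name is just an example) and then filter tokens via token.is_stop. I am not sure whether capitalized variants such as "Patient" would also be covered, since the terms are lowercased before counting.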
Thank you for any advice.