I am using spaCy's en_core_web_lg model to compare some texts for similarity, but I am not getting the results I expect.
I suspect the issue is that my texts are mostly religious, for example: "Thus hath it been decreed by Him Who is the Source of Divine inspiration." "He, verily, is the Expounder, the Wise." "Whoso layeth claim to a Revelation direct from God, ere the expiration of a full thousand years, such a man is assuredly a lying impostor."
My question is: is there a way to check spaCy's "dictionary" (its vocabulary)? Does it include words like "whoso", "layeth", "decreed", or "verily"?
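
For reference, here is a minimal sketch of the kind of check I have in mind. It assumes that `Vocab.has_vector` is a reasonable way to test whether the loaded model has a word vector for a given string; the helper name `vocab_coverage` is made up for illustration and is not part of spaCy.

```python
def vocab_coverage(nlp, words):
    """Return {word: True/False} for whether the pipeline has a vector for it.

    `nlp` is assumed to be a loaded spaCy pipeline (anything exposing
    `nlp.vocab.has_vector(word)` works). The lookup uses each string
    exactly as given, with no lowercasing or lemmatization.
    """
    return {word: bool(nlp.vocab.has_vector(word)) for word in words}
```

I would call it as `vocab_coverage(spacy.load("en_core_web_lg"), ["whoso", "layeth", "decreed", "verily"])` after `import spacy`, and treat any word that maps to `False` as one the model has no vector for.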