I am trying to stem the words in a text column of a dataframe.

data is a dataframe, KARMA is its text column, and zargan is a dict that maps each word to its root.

for a in range(1,100000):
    for j in data.KARMA[a].split():
        pattern = r'\b'+j+r'\b' 
        data.KARMA[a] = re.sub(pattern, str(zargan.get(j,j)),data.KARMA[a]) 
print(data.KARMA[1])

I want to replace each word in the texts with its root.

N.K

1 Answer

It looks like j contains a regular expression special character such as *. If you want it to be interpreted as literal text, you can write

    pattern = r'\b'+re.escape(j)+r'\b'

and possibly the same for r if it should similarly be coerced into a literal string.
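
As a fuller illustration, here is a minimal sketch of the same idea applied to one text at a time. The helper name replace_with_roots and the sample dictionary are made up for this example and are not from the question. It assumes the replacement should be inserted as literal text, so it passes a function to re.sub instead of a replacement string; that way backslashes in the root are not read as escape sequences.

```python
import re

def replace_with_roots(text, roots):
    """Replace each whitespace-separated token in text with its root, if known.

    Hypothetical helper for illustration; roots plays the role of zargan.
    """
    for token in set(text.split()):
        root = str(roots.get(token, token))
        if root == token:
            continue  # nothing to change for this token
        # re.escape makes characters such as * or ( match literally.
        pattern = r'\b' + re.escape(token) + r'\b'
        # A function replacement is inserted as-is, with no escape processing.
        text = re.sub(pattern, lambda m: root, text)
    return text

# Sample data, made up for this sketch.
zargan = {"running": "run", "co*op": "coop"}
print(replace_with_roots("running co*op cop", zargan))  # run coop cop
```

Without re.escape, the token co*op would be read as a pattern that also matches cop. Note that \b only matches next to a word character, so a token that starts or ends with punctuation will not be matched by this pattern at all.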

tripleee