I have a pandas DataFrame with a long-text column called description. The data comes from a Jira web instance. I've been trying to strip the markup from the text using several different methods, but none of them removes the \r\n\xa0 sequences.
Here's what I have so far:
df['description'] = df['description'].replace(r'http\S+', '', regex=True).replace(r'www\S+', '', regex=True)
df['description'] = df['description'].replace(r'[^\x00-\x7F]+', ' ', regex=True)
df['description'] = df['description'].replace(r'\[(.+)\]\([^\)]+\)', r'\1', regex=True).replace(r'\*\*([^*]+)\*\*', r'\1', regex=True)
df['description'] = df['description'].replace(r'\*([^*]+)\*', r'\1', regex=True)
df['description'] = df['description'].astype(str).str.strip()
Any ideas what I can do here? Here is a sample of the text:
We analyzed found the issue in Garbage Collection which crashed the JVM.\r\n\r\n\xa0\r\n\r\n\xa0\r\n\r\n_Stack: [0x00007f0b58ff1000,0x00007f0b590f1000],\xa0 sp=0x00007f0b590ef120,\xa0 free space=1016k_\r\n\r\n_Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)_\r\n\r\n_V\xa0 [libjvm.so+0x8b9e4f]\xa0 MethodData::clean_extra_data(BoolObjectClosure)+0x1cf_\r\n\r\n_V\xa0 [libjvm.so+0x63c582]\xa0
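
For reference, a minimal sketch of the kind of whitespace cleanup being asked about. It is not taken from the attempts above, and it rests on an assumption: it is not clear from the sample whether the column holds real carriage-return, newline, and non-breaking-space characters or the literal backslash text `\r`, `\n`, `\xa0`, so the sketch handles both. The helper name `normalize_whitespace` and the two shortened demo rows are hypothetical, made up for illustration.

```python
import pandas as pd


def normalize_whitespace(series: pd.Series) -> pd.Series:
    """Collapse CR, LF and non-breaking spaces (real or escaped) into single spaces."""
    return (
        series.astype(str)
        # Case 1 (assumption): the text contains literal backslash sequences,
        # i.e. a backslash followed by "r", "n" or "xa0".
        .str.replace(r"(?:\\r|\\n|\\xa0)+", " ", regex=True)
        # Case 2: real whitespace characters. \xa0 is listed explicitly in case
        # the regex engine in use treats \s as ASCII-only.
        .str.replace(r"[\s\xa0]+", " ", regex=True)
        .str.strip()
    )


# Hypothetical demo rows, shortened from the sample text in the question.
demo = pd.DataFrame({"description": [
    "crashed the JVM.\r\n\r\n\xa0\r\n\r\n_Stack: [0x1],\xa0 sp=0x2_",   # real characters
    r"crashed the JVM.\r\n\r\n\xa0\r\n\r\n_Stack: [0x1],\xa0 sp=0x2_",  # literal escapes
]})
demo["description"] = normalize_whitespace(demo["description"])
print(demo["description"].tolist())
```

If only one of the two cases applies to the real data, the other `str.replace` call is a no-op and could be dropped. Checking `repr(df['description'].iloc[0])` should show which case it is: real characters print with a single backslash, literal escapes with a doubled one.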