0

I have some strings like:

\i{}Agrostis\i0{} <L.>

I would like to get rid of the '\i{}' and '\i0{}' markers, so that I get just:

Agrostis <L.>

I've tried the following code (adapted from here):

m = re.search('\i{}(.+?)\i0', item_name)
if m:
   name = m.group(1).strip('\\')
else:
   name = item_name

It works in part, because when I run it I get just:

Agrostis

without the

<L.>

part (which I want to keep).

Any hints?

Thanks in advance for any assistance you can provide!

styvane
maurobio

5 Answers

2

Use s.replace(r'\i{}', '') and then .replace(r'\i0{}', '') on the result. str.replace returns a new string, so the two calls can be chained.

Moses Koledoye
Julien Goupy
1

You can do this in different ways.

The simplest one is to use str.replace:

# Raw strings keep the backslashes literal and avoid invalid-escape warnings
s = r'\i{}Agrostis\i0{} <L.>'
s2 = s.replace(r'\i{}', '').replace(r'\i0{}', '')

Another way is to use re.sub()
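
A minimal sketch of the re.sub() route, assuming the only markers to strip are the two literal strings from the question. The names MARKERS, PATTERN and strip_markers are made up for illustration. re.escape is used so that the backslash and braces are matched literally rather than interpreted as regex syntax:

```python
import re

# The two literal markers from the question, written as raw strings.
MARKERS = (r'\i0{}', r'\i{}')

# Build one alternation pattern; re.escape neutralises the backslash and braces.
PATTERN = re.compile('|'.join(re.escape(m) for m in MARKERS))

def strip_markers(text):
    # Replace every occurrence of either marker with the empty string.
    return PATTERN.sub('', text)

print(strip_markers(r'\i{}Agrostis\i0{} <L.>'))  # Agrostis <L.>
```

Compared with chained str.replace calls, this scans the string once and is easy to extend by adding entries to MARKERS.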

dimm
1

You need to use the re.sub function.

In [34]: import re

In [35]: s = r"\i{}Agrostis\i0{} <L.>"

In [36]: re.sub(r'\\i\d*{}', '', s)
Out[36]: 'Agrostis <L.>'
styvane
1

You could use a character class along with re.sub():

import re
# Backslash, the letter i, then any run of digits or braces
regex = r'\\i[\d{}]+'
string = r"\i{}Agrostis\i0{} <L.>"

string = re.sub(regex, '', string)
print(string)

See a demo on ideone.com.
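
One thing that may be worth checking before choosing between the two regex answers is how each pattern behaves when digits directly follow a marker. The sketch below uses a made-up input (the trailing 1753 is not from the question) to show the difference. The names strict, loose and sample are illustrative only:

```python
import re

# Backslash, i, optional digits, then a literal {} pair.
strict = re.compile(r'\\i\d*\{\}')
# Backslash, i, then any run of digits or braces (character class).
loose = re.compile(r'\\i[\d{}]+')

# Hypothetical input with digits immediately after the closing marker.
sample = r'\i{}Agrostis\i0{}1753'

print(strict.sub('', sample))  # Agrostis1753
print(loose.sub('', sample))   # Agrostis
```

For the string in the question both patterns give the same result; the character class only differs when digits or braces sit right next to a marker.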

Jan
0

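You can either use s.replace(r'\i{}', '') and s.replace(r'\i0{}', ''), as Julien said, or, continuing with the regex approach, add a second group that captures everything after the closing marker. Escape the backslashes and braces so they are matched literally:

re.search(r'\\i\{\}(.+?)\\i0\{\}(.*)', item_name)

And use m.group(1) + m.group(2) as the result. With the backslashes matched by the pattern itself, the strip('\\') calls are no longer needed.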
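
A runnable sketch of this two-group approach, with the backslashes and braces escaped so the pattern matches the markers literally. The names PATTERN and clean_name are made up for illustration, and the fallback mirrors the else branch in the question:

```python
import re

# Group 1: text between the markers; group 2: whatever follows the closing marker.
PATTERN = re.compile(r'\\i\{\}(.+?)\\i0\{\}(.*)')

def clean_name(item_name):
    m = PATTERN.search(item_name)
    if m:
        # Join the marked-up name with the trailing text.
        return m.group(1) + m.group(2)
    # No markers found: return the input unchanged.
    return item_name

print(clean_name(r'\i{}Agrostis\i0{} <L.>'))  # Agrostis <L.>
```

Note that any text before the opening marker is dropped by this version, so re.sub() is probably the simpler choice when markers can appear anywhere in the string.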

Amit Gold