5

I have a string similar to "dasdasdsafs[image : image name : image]vvfd gvdfvg dfvgd". From this string, I want to remove the part that starts at [image : and ends at : image]. I tried to find the substring using the following code:

result = re.search('%s(.*)%s' % (start, end), st).group(1)

but it doesn't give me the required result. What is the correct way to remove the substring from the string?

malhar
n.imp
  • Take a look at [removing a substring](https://stackoverflow.com/questions/8703017/remove-sub-string-by-using-python) or [substring in python](https://stackoverflow.com/questions/663171/is-there-a-way-to-substring-a-string-in-python?rq=1), then [find the index of a character in a string](https://stackoverflow.com/questions/2294493/how-to-get-the-position-of-a-character-in-python) – LinkBerest Aug 14 '15 at 17:23
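The approach hinted at in the comment above (find the index of each marker, then slice around it) can be sketched without regular expressions. This is only an illustration: `remove_span` is a hypothetical helper name, and it assumes that only the first occurrence should be removed and that the string is returned unchanged when either marker is missing.

```python
def remove_span(text, start, end):
    """Remove the first span that begins with `start` and finishes with `end`."""
    begin = text.find(start)
    if begin == -1:
        return text  # start marker not present: nothing to remove
    # Look for the end marker only after the start marker.
    stop = text.find(end, begin + len(start))
    if stop == -1:
        return text  # no closing marker after the start marker
    # Keep everything before the start marker and after the end marker.
    return text[:begin] + text[stop + len(end):]


st = "dasdasdsafs[image : image name : image]vvfd gvdfvg dfvgd"
print(remove_span(st, "[image :", ": image]"))  # dasdasdsafsvvfd gvdfvg dfvgd
```

Because the markers are plain strings here, no escaping is needed.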

4 Answers

8

You can use re.sub:

>>> import re
>>> s='dasdasdsafs[image : image name : image]vvfd gvdfvg dfvgd'
>>> re.sub(r'\[image.+image\]','',s)
'dasdasdsafsvvfd gvdfvg dfvgd'
Mazdak
  • May I suggest automatically fixing regex escaping, and preventing greedy matching, which could result in removing large quantities of text you don't want removed: `pattern = '%s(.*?)%s' % (re.escape(start), re.escape(end))` followed by `answer = re.sub(pattern, '', st)` – Kenny Ostrom Aug 14 '15 at 17:49
  • @KennyOstrom Yeah, good job! That's more general. – Mazdak Aug 14 '15 at 18:03
  • This solution doesn't work if there are multiple occurrences of the substring. For example, in "%name1% likes %name2%" the solution needs to return "name1" and "name2". Instead it returns "name1% likes %name2". – Alexei Masterov Nov 24 '20 at 17:01
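The difference the comments describe between greedy and non-greedy matching can be seen in a small sketch. The second input string `t` is made up for illustration; only the `%name1% likes %name2%` example comes from the comment above.

```python
import re

s = "%name1% likes %name2%"

# Greedy: .+ runs to the last '%', so the single match spans both names.
greedy = re.findall(r"%(.+)%", s)
# Non-greedy: .+? stops at the first closing '%', giving one match per name.
lazy = re.findall(r"%(.+?)%", s)

print(greedy)  # ['name1% likes %name2']
print(lazy)    # ['name1', 'name2']

# The same effect when removing tags (illustrative input with two tags).
t = "a[image : one : image]b[image : two : image]c"
removed_greedy = re.sub(r"\[image.+image\]", "", t)
removed_lazy = re.sub(r"\[image.+?image\]", "", t)

print(removed_greedy)  # ac   (the text between the two tags is lost)
print(removed_lazy)    # abc  (each tag is removed separately)
```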
5

The obvious problem is that you can't just plug an arbitrary string into the pattern, because it may contain characters that change how re interprets it. Instead, you want to escape your start and end strings. Of course, you could fix them manually by typing in the correct escapes this time, but it is better to have the Python library do it for you, so that it handles any values.

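import re

# re.escape backslash-escapes regex metacharacters such as '[' and ']'
start = re.escape("[image : ")
end   = re.escape(" : image]")
st = "dasdasdsafs[image : image name : image]vvfd gvdfvg dfvgd"
# group(1) is the text captured between the two markers
result = re.search('%s(.*)%s' % (start, end), st).group(1)
print(result)  # prints: image name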
Kenny Ostrom
  • Sorry, I misunderstood the question. He wants the text not in this match. (although escaping is still a good idea, even then) – Kenny Ostrom Aug 14 '15 at 17:51
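To remove the matched span rather than extract it, the same escaping idea can be combined with re.sub. The sketch below is one possible way to do it: `remove_between` is a hypothetical helper name, and passing `re.DOTALL` (so that a span may cross line breaks) is an assumption not stated in the thread.

```python
import re


def remove_between(text, start, end):
    """Remove every span from `start` through `end`, markers included."""
    # re.escape makes the markers literal; .*? keeps each match as short as possible.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    # Assumption: DOTALL lets '.' match newlines, so spans may cross lines.
    return re.sub(pattern, "", text, flags=re.DOTALL)


st = "dasdasdsafs[image : image name : image]vvfd gvdfvg dfvgd"
print(remove_between(st, "[image : ", " : image]"))  # dasdasdsafsvvfd gvdfvg dfvgd
```

Since the markers are escaped inside the helper, callers can pass any literal strings without worrying about regex metacharacters.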
2

You probably just need to escape the square brackets, since those are special characters in regex (i.e., start = r"\[image :" and end = r": image\]").

Randy
  • Could you please give a complete example? I tried this `result = re.search('%s(.*)%s' %(start, end),st).group(1)` and it returns the string between the start and end. But I need to remove everything from start to end. – n.imp Aug 14 '15 at 17:22
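A complete example along the lines requested might look like the following. It uses the hand-escaped `start` and `end` values from the answer; the variable names `inner` and `cleaned` and the non-greedy `.*?` are choices made for this sketch, not part of the original answer.

```python
import re

start = r"\[image :"   # '[' escaped by hand
end = r": image\]"     # ']' escaped by hand
st = "dasdasdsafs[image : image name : image]vvfd gvdfvg dfvgd"

# re.search(...).group(1) returns only the text *between* the markers.
inner = re.search("%s(.*?)%s" % (start, end), st).group(1)
# re.sub replaces the whole match, markers included, with an empty string.
cleaned = re.sub("%s.*?%s" % (start, end), "", st)

print(repr(inner))  # ' image name '
print(cleaned)      # dasdasdsafsvvfd gvdfvg dfvgd
```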
2

This will remove all occurrences in a string:

import re

s = "dasdasdsafs[image : image name : image]vvfd gvdfvg dfvgd"
s = re.sub(r'\[image :.*?: image\]', r'', s)
Cody Bouche