data = re.sub(r'.*"md5": ', '', data, 1)

I am trying to apply this substitution to the first occurrence of "md5": in a text file, but it removes everything up to the last occurrence, as if it were working backwards. So, when I then run:

r = re.match('(.*?)}{"id": "', data).group()

It raises AttributeError: 'NoneType' object has no attribute 'group', because id always appears before md5 and that part of the text has already been removed.

Is there a way to reverse this?
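
A minimal sketch of what seems to be happening, assuming input shaped like concatenated JSON objects (the sample `data` below is hypothetical, since the real file is not shown). The greedy `.*` consumes as much as it can and only backs off to the last `"md5": `, while the lazy `.*?` stops at the first one:

```python
import re

# Hypothetical sample data; the original input file is not shown in the question.
data = '{"id": "a1", "md5": "aaa"}{"id": "b2", "md5": "bbb"}'

# Greedy: `.*` grabs everything, then backtracks to the LAST '"md5": '.
greedy = re.sub(r'.*"md5": ', '', data, count=1)

# Lazy: `.*?` grows one character at a time and stops at the FIRST '"md5": '.
lazy = re.sub(r'.*?"md5": ', '', data, count=1)

print(greedy)  # "bbb"}
print(lazy)    # "aaa"}{"id": "b2", "md5": "bbb"}

# The follow-up match only succeeds on the lazy result,
# because the greedy one no longer contains '}{"id": "'.
print(re.match(r'(.*?)}{"id": "', greedy))           # None
print(re.match(r'(.*?)}{"id": "', lazy).group(1))    # "aaa"
```

Note that `.` does not match newlines by default, so on multi-line input the behaviour may differ unless `re.S` is passed.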

Eduardo Andrade
  • Looks like you are parsing JSON with regex. There is a json parsing module, you do not have to use a regex that is not safe for JSON parsing. – Wiktor Stribiżew Nov 02 '17 at 18:59
  • Thank you but probably I will need this in different files (not json). I am going to use json parsing right now but I would like to clarify this re.sub. – Eduardo Andrade Nov 02 '17 at 19:09
  • Could you explain what this question is about? It is rather unclear right now without a sample text and expected output. You cannot *revert* a `re.sub` operation. Nor change the direction the `re` engine matches in a string (though you may with `regex`). – Wiktor Stribiżew Nov 02 '17 at 19:21
  • If you need to match something that can be removed with the `re.sub`, run `re.search` before running `re.sub`. – Wiktor Stribiżew Nov 02 '17 at 19:28
  • Humm so there is no way to revert re.sub operation. Ok, I am going to try re.search. I was creating an example to show the expected output but I will try re.search before. – Eduardo Andrade Nov 02 '17 at 19:33
  • The point is that if you remove a part of a string that contained the expected match, you will not find that match - isn't that logical? – Wiktor Stribiżew Nov 02 '17 at 19:37
  • It is okay, before I have used "replace" with success but now I need something like "re.sub". The problem is in the order of "re.sub" operation. – Eduardo Andrade Nov 02 '17 at 19:48
  • Done! `data = data.replace(re.search(r'(.*?)"md5": "', data).group(), "", 1)` solved my problem. – Eduardo Andrade Nov 02 '17 at 20:49
  • You seem to need all text after the first occurrence of a `"md5": "` substring. Why not use [`re.search(r'"md5":\s*"(.*)', data, re.S)`](https://ideone.com/sVGPHA) and grab `m.group(1)` value? – Wiktor Stribiżew Nov 02 '17 at 21:06
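
The `re.search` suggestion from the last comment can be sketched as follows. This is only an illustration on hypothetical sample data (the real file contents never appear in the thread), and the `None` guard is an addition, not something the commenters wrote:

```python
import re

# Hypothetical sample data; the real file contents are not shown in the thread.
data = '{"id": "a1", "md5": "aaa"}\n{"id": "b2", "md5": "bbb"}'

# re.search scans left to right, so it finds the first '"md5":' occurrence.
# re.S lets `.` match newlines, so group(1) runs to the end of the data.
m = re.search(r'"md5":\s*"(.*)', data, re.S)

# Guard against a missing match instead of calling .group() on None.
rest = m.group(1) if m else None
print(rest)
```

Because nothing is removed from `data`, any later search for the `id` field still sees the full text.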

0 Answers