
I'm on Python 2.4.4 (yeah, long story) and want to parse this fragment with `re`:

    "comment":"#2 Surely, (this) can't be any [more] complicated a reg-ex?",

That is, the comment can contain letters (upper or lower case), digits, hash, parentheses, square brackets, single quotes, and commas, and the fragment always ends with a double quote followed by a comma.

I've gotten this far with the expression:

    r'\"comment\":\"(?P<COMMENT>[a-zA-Z0-9\s]+)\",'

But, of course, it only matches when none of the metacharacters are in the comment. The final `\",` works as the termination criterion. I've tried all kinds of escapes, double escapes ...

Could a kind 're geek' please enlighten me? I want to access the entire comment as `match.group("COMMENT")`.

  1. Corrected the pattern to what I was actually using when I asked; my bad cut-and-paste.
  2. Until this was marked with all the "DUPLICATES", I couldn't spell JSON. But I DID specify I had to do this with re.
  3. Even with all the JSON responses and code fragments, the json module wasn't introduced until 2.6, and I did specify I'm still using 2.4.4.

Thanks to those responding with the regex-based solutions. Now working for me :)

  • Your current regex seems to imply that you're looking for alphabetical characters, followed by spaces, followed by digits, which does not reflect your sample input at all? Are you looking for a particular pattern there, or just whatever's inside the `"` delimiters? (in which case, use `[^"]`, or if the input can contain `\"`, repeat a group that alternates between `\"` and `[^"]`) – CertainPerformance Nov 03 '18 at 03:09
  • Something like `re.sub(r'[()"\"",#:?",]', ' ', stringe)` – Karn Kumar Nov 03 '18 at 04:12

1 Answer

Use a non-greedy `.*?` to match everything up to the first `",`, assuming that sequence marks the end of the comment:

    import re

    s = '''"comment":"#2 Surely, (this) can't be any [more] complicated a reg-ex?",'''

    match = re.search(r'"comment":"(?P<comment>.*?)",', s)
    print(match.group('comment'))

    # #2 Surely, (this) can't be any [more] complicated a reg-ex?

You can name a captured group with `(?P<group_name>…)` and retrieve it with `match.group('group_name')`.
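
If you would rather not rely on `.*?`, here is a rough sketch of two stricter alternatives, in the spirit of the `[^"]` suggestion from the comments. The names `explicit` and `permissive` are made up for illustration, and the exact set of allowed characters is my assumption based on the sample (I added `?` and `-` because the sample contains them). Inside a character class most metacharacters are already literal, so only `]`, the backslash, and the single quote (because of the string delimiter) need escaping here.

```python
import re

s = '''"comment":"#2 Surely, (this) can't be any [more] complicated a reg-ex?",'''

# Option 1 (assumed whitelist): list the allowed characters explicitly.
# '-' is placed last so it is read as a literal hyphen, not as a range.
explicit = re.compile(r'"comment":"(?P<COMMENT>[a-zA-Z0-9\s#()\[\]\',?-]+)",')

# Option 2: accept anything that is not a double quote or a backslash,
# or a backslash followed by any character (so an escaped \" stays inside).
permissive = re.compile(r'"comment":"(?P<COMMENT>(?:[^"\\]|\\.)*)",')

for pattern in (explicit, permissive):
    match = pattern.search(s)
    if match is not None:  # search() returns None when nothing matches
        print(match.group("COMMENT"))
```

The second pattern should also cope with a comment that contains `\",` in the middle, where the non-greedy version would stop early. Both patterns use only syntax that I believe is available in 2.4, but I have not run them on that version.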

Austin