
I am trying to remove a repeated pattern or text using regex. The input text is:

RuleSet:[text], Data:[{text}{text}...] RuleSet:[text], Data:[{text},{text},....] SomeText RuleSet:[{text}...], Data:[{text}...]

Here, text can be any alphanumeric word and may contain special characters as well as spaces. I am trying to remove every occurrence of the following:

  • RuleSet:[text],
  • Data:[{text}{text}...]

I'd like to retain the following: SomeText

I have tried many ways of doing it, but I can't seem to get the desired result.

Ro Yo Mi
  • 14,790
  • 5
  • 35
  • 43

1 Answer


Description

\s?(?:RuleSet|Data):\[[^]]*](?:,?\s|$)

Replace with: nothing


This regular expression will do the following:

  • find substrings that look like RuleSet:[text], or Data:[{text}{text}...]
  • allow you to replace these with anything, or in this case nothing

Example

Live Demo

https://regex101.com/r/eJ2aB5/1

Sample text

RuleSet:[text], Data:[{text}{text}...] RuleSet:[text], Data:[{text},{text},....] SomeText RuleSet:[{text}...], Data:[{text}...]

After Replace

SomeText
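
For readers who want to try the replacement outside regex101, here is a minimal Python sketch. The function name `strip_rules` is hypothetical, and the closing bracket inside the character class is written as `[^\]]*` rather than `[^]]*`, an assumption made for portability because some engines (JavaScript, for example) read `[^]` differently.

```python
import re

# Same pattern as in the answer; "]" is escaped inside the class for portability.
PATTERN = re.compile(r"\s?(?:RuleSet|Data):\[[^\]]*\](?:,?\s|$)")


def strip_rules(text: str) -> str:
    """Remove every RuleSet:[...] and Data:[...] block from text (hypothetical helper)."""
    return PATTERN.sub("", text)


sample = (
    "RuleSet:[text], Data:[{text}{text}...] RuleSet:[text], "
    "Data:[{text},{text},....] SomeText RuleSet:[{text}...], Data:[{text}...]"
)
print(strip_rules(sample))  # SomeText
```

Note that the pattern consumes the whitespace on both sides of a removed block, so a block sitting between two words you want to keep would leave them joined.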

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
  \s?                      whitespace (\n, \r, \t, \f, and " ")
                           (optional (matching the most amount
                           possible))
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    RuleSet                  'RuleSet'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    Data                     'Data'
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
  :                        ':'
----------------------------------------------------------------------
  \[                       '['
----------------------------------------------------------------------
  [^]]*                    any character except: ']' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  ]                        ']'
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    ,?                       ',' (optional (matching the most amount
                             possible))
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    $                        before an optional \n, and the end of a
                             "line"
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
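
The node-by-node breakdown above can also be kept next to the pattern itself. The following is only a sketch, assuming Python and its `re.VERBOSE` flag; the name `VERBOSE_PATTERN` and the short input string are made up for illustration.

```python
import re

# The answer's pattern, spread out with one comment per node.
VERBOSE_PATTERN = re.compile(
    r"""
    \s?                 # optional leading whitespace
    (?:RuleSet|Data)    # either keyword, not captured
    :                   # literal ':'
    \[                  # literal '['
    [^\]]*              # any run of characters except ']'
    \]                  # literal ']'
    (?:,?\s|$)          # optional ',' plus whitespace, or end of input
    """,
    re.VERBOSE,
)

cleaned = VERBOSE_PATTERN.sub("", "Data:[{a}] Keep RuleSet:[b], Data:[{c}]")
print(cleaned)  # Keep
```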
Ro Yo Mi
  • Thanks @Ro Yo Mi, It did work for me and with better response time. I did get the expected results using "RuleSet\:\[(\s*?.*?)*?\]\,\s|Data\:\[(\s*?.*?)*?\]|" – S Madichetti May 28 '16 at 03:08
  • I don't like using the lazy `.*?` quantifier when working with values wrapped in characters like brackets, parentheses, or quotes, as it tends to lead to unexpected results. But if it works for you then that's awesome. – Ro Yo Mi May 28 '16 at 03:18
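
The concern raised in the last comment can be seen in a small Python sketch. The input string and the two trailing-comma patterns are invented for illustration and are not taken from the question; they only show how a lazy `.*?` may run past a closing bracket when the rest of the pattern fails, while a negated character class cannot.

```python
import re

text = "Data:[a] Keep Data:[b],"

# Lazy dot: keeps expanding until it finds a "]" that is followed by ",".
lazy = re.compile(r"Data:\[.*?\],")
# Negated class: can never cross a "]", so each match stays inside one bracket pair.
negated = re.compile(r"Data:\[[^\]]*\],")

lazy_result = lazy.sub("", text)
negated_result = negated.sub("", text)

print(repr(lazy_result))     # ''
print(repr(negated_result))  # 'Data:[a] Keep '
```

With the lazy version the first `Data:[a]` has no comma after it, so the match stretches across `Keep` to the next `],` and the text that should be retained is lost.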