3

I am struggling with the following issue: say there is a regex 1, and there is a regex 2 that should match everything regex 1 does not.

Let regex 1 be /\$\d+/ (i.e. a dollar sign followed by one or more digits).

Given a string like foo$12___bar___$34wilma buzz, it matches $12 and $34.

What should regex 2 look like in order to match the remaining parts of that string, i.e. foo, ___bar___ and wilma buzz? In other words, it should pick up all the remaining chunks of the source string.

varnie

3 Answers

4

You may use String#split to split on the given regex and get the remaining substrings in an array:

String[] arr = str.split( "\\$\\d+" );

//=> ["foo", "___bar___", "wilma buzz"]
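For completeness, here is a minimal runnable sketch of the split approach. The class name SplitRemainder and the helper remainder are made up for illustration and are not part of the answer. One detail worth knowing: when the input starts with a match, String#split returns an empty leading element, so the sketch filters empty strings out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SplitRemainder {
    // Hypothetical helper: returns the chunks of `input` that `regex` does not match.
    static List<String> remainder(String input, String regex) {
        return Arrays.stream(input.split(regex))
                // split() yields "" before a match at index 0; drop such entries
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [foo, ___bar___, wilma buzz]
        System.out.println(remainder("foo$12___bar___$34wilma buzz", "\\$\\d+"));
        // prints [foo]
        System.out.println(remainder("$12foo", "\\$\\d+"));
    }
}
```

Whether to drop the empty chunks depends on what the caller needs; remove the filter if empty chunks are meaningful.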

RegEx Demo

anubhava
  • Thanks, but I am specifically interested in a regex because I am dealing with a kind of "declarative" algorithm, so I cannot use this approach. – varnie Aug 22 '19 at 18:37
  • This is very much a regex approach itself. – anubhava Aug 22 '19 at 18:50
0

It was tricky to get this working, but this regex will match everything besides \$\d+ for you. EDIT: no longer erroneously matches $44$444 or similar.

(?!\$\d+)(.+?)\$\d+|\$\d+|(?!\$\d+)(.+)

Breakdown


(?!\$\d+)(.+?)\$\d+
(?!     )                   negative lookahead: assert that what follows does not match
   \$\d+                    your pattern - can be replaced with another pattern
         (.+?)              capture at least one character, as few as possible
              \$\d+         match your pattern without capturing it

OR

\$\d+                       match one instance of your pattern without capturing it

OR

(?!\$\d+)(.+)
(?!\$\d+)                   negative lookahead: do not start on your pattern
         (.+)               capture at least one character, as many as possible

GENERIC FORM
(?!<pattern>)(.+?)<pattern>|<pattern>|(?!<pattern>)(.+)

By replacing <pattern>, you can match anything that doesn't match your pattern. Here's one that matches your pattern, and here's an example of arbitrary pattern (un)matching.
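Since the title of this thread is about Java, here is a rough sketch of how the capturing pattern above could be consumed there. The whole match still includes the \$\d+ parts, so the "remaining" chunks have to be read from capture groups 1 and 2. The class name CaptureRemainder and the helper remainder are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureRemainder {
    // The regex from this answer, escaped for a Java string literal.
    private static final Pattern REMAINDER = Pattern.compile(
            "(?!\\$\\d+)(.+?)\\$\\d+|\\$\\d+|(?!\\$\\d+)(.+)");

    // Hypothetical helper: collects whatever landed in group 1 or group 2.
    static List<String> remainder(String input) {
        List<String> chunks = new ArrayList<>();
        Matcher m = REMAINDER.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                chunks.add(m.group(1));   // text in front of a $<digits> match
            } else if (m.group(2) != null) {
                chunks.add(m.group(2));   // trailing text after the last match
            }
            // the middle alternative (a bare $<digits>) captures nothing
        }
        return chunks;
    }

    public static void main(String[] args) {
        // prints [foo, ___bar___, wilma buzz]
        System.out.println(remainder("foo$12___bar___$34wilma buzz"));
        // prints []
        System.out.println(remainder("$44$444"));
    }
}
```

Keep in mind that . does not match line terminators by default, so multi-line input would need Pattern.DOTALL or a different character class.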

Good luck!

Nick Reed
  • Actually, it is a [wrong approach](https://regex101.com/r/gOY1wF/1). The digits get matched in the consecutive `\$\d+` patterns. – Wiktor Stribiżew Aug 22 '19 at 19:36
  • Good catch - I'll modify the regex. – Nick Reed Aug 22 '19 at 19:53
  • You can't do any better in Java than what [Thefourthbird suggested](https://stackoverflow.com/questions/57615221/regex-match-everything-the-other-regex-left/57615989?noredirect=1#comment101685568_57615221). Actually, in Java, you can't write a regex that matches any text other than some multicharacter string. With capturing, something can be done, but it is inefficient, I would not recommend this. – Wiktor Stribiżew Aug 22 '19 at 19:55
  • I was playing around with it and arrived at The fourth bird's comment suggestion myself. Looks like there's no better way to do this. – Nick Reed Aug 22 '19 at 19:59
  • @WiktorStribiżew I take it back, I got one that's *barely* more efficient than Thefourthbird's - posting it now. – Nick Reed Aug 22 '19 at 20:07
  • Ok, so the capturing approach. Note you do not need all the non-capturing groups here, you may remove them. The pattern is very verbose though, and note that `.` does not match newline by default. Once the text between two matching patterns is very large, the Java regex engine may halt, and you will need to unroll the lazy dot. So, as I say, it is nice from an educational point of view, but rather impractical and fragile. – Wiktor Stribiżew Aug 22 '19 at 20:28
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/198333/discussion-between-nick-reed-and-wiktor-stribizew). – Nick Reed Aug 22 '19 at 20:33
-2

Try this one:

[a-zA-Z_]+

Or even better:

[^\$\d]+

With the ^ symbol at the start of a character class you can negate the class, similar to ! ("not") in Java.
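A small sketch to show what the negated class does in Java; the class name NegatedClassDemo and the helper findAll are made up for illustration. It gives the expected chunks for the sample string from the question, but note that a character class excludes every single $ and every single digit, not the multi-character pattern \$\d+, so a digit that is not preceded by $ also splits a chunk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassDemo {
    // Hypothetical helper: returns every match of `regex` in `input`.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // prints [foo, ___bar___, wilma buzz]
        System.out.println(findAll("[^\\$\\d]+", "foo$12___bar___$34wilma buzz"));
        // prints [a, b, c]: the lone digit 1 breaks "a1b" apart,
        // whereas splitting on \$\d+ would keep "a1b" together
        System.out.println(findAll("[^\\$\\d]+", "a1b$2c"));
    }
}
```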
Lieven Keersmaekers
s.3.valkov