I receive a large string and need to process it with a regular expression in Java, splitting long lines according to the following rules:

  • If a line (the text after a \n) has more than 1000 characters, check whether the 1000th character is inside a quoted string, that is, whether it comes after an odd number of ' characters.
  • If so, insert the concatenation string '\n||' between the 1000th and 1001st characters.
  • If the 1000th and 1001st characters are '' (the escaped quote in PL/SQL), insert it between the 1001st and 1002nd characters instead.
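
For comparison, the three rules above can also be applied with a plain character scan instead of a regular expression. This is only a minimal sketch under some assumptions: `LongLineSplitter`, `splitLongLines`, `splitLine` and the `limit`/`separator` parameters are hypothetical names that do not come from the post, each long line is split at most once, and '' is treated as an escaped quote only while inside a literal.

```java
final class LongLineSplitter {
    // Assumed separator: close the literal, start a new line, concatenate, reopen the literal.
    static final String SEPARATOR = "'\n||'";

    /** Applies splitLine to every \n-separated line of the text. */
    static String splitLongLines(String text, int limit, String separator) {
        StringBuilder out = new StringBuilder();
        String[] lines = text.split("\n", -1); // -1 keeps trailing empty lines
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            out.append(splitLine(lines[i], limit, separator));
        }
        return out.toString();
    }

    /** Splits one line after `limit` characters if that position lies inside a quoted literal. */
    static String splitLine(String line, int limit, String separator) {
        if (line.length() <= limit) {
            return line; // short enough, nothing to do
        }
        boolean inLiteral = false;
        int i = 0;
        while (i < limit) {
            if (line.charAt(i) == '\'') {
                boolean escapedPair = inLiteral
                        && i + 1 < line.length()
                        && line.charAt(i + 1) == '\'';
                if (escapedPair) {
                    i += 2; // skip both quotes; this may step one past the limit
                    continue;
                }
                inLiteral = !inLiteral; // opening or closing quote
            }
            i++;
        }
        if (!inLiteral || i >= line.length()) {
            return line; // cut point is outside a literal, or nothing follows it
        }
        // i is `limit`, or `limit + 1` when an escaped '' straddles the boundary
        return line.substring(0, i) + separator + line.substring(i);
    }
}
```

With the real data, the call would presumably be `LongLineSplitter.splitLongLines(text, 1000, LongLineSplitter.SEPARATOR)`. A scan like this keeps track of quote state directly, so it does not need a look-behind at all.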

I came up with this regular expression:

"\n(?<kiloCharacters>[^\n]{1000})(?<=(?<newLine>\n)(?<pairsAndText>[^'\n]{0,1001}|[^\n']{0,1001}'[^\n']{0,1001}'[^\n']{0,1001}){0,1001}(?<oddComa>')(?<text>[^\n']{0,1001}))(?(?<=')(?!'))"

Let me explain it:

"\n(?<kiloCharacters>[^\n]{1000}) --> Newline and 1000 characters
(?<= --> Let's look behind to check if we have an odd number of '
  (?<newLine>\n) --> Start from new line
  (?<pairsAndText> --> All pairs of '
    [^'\n]{0,1001} --> Either no '
    | --> or
    [^\n']{0,1001}'[^\n']{0,1001}'[^\n']{0,1001}){0,1001} --> (text* ' text* ' text* )*
  (?<oddComa>') --> The last, unpaired '
  (?<text>[^\n']{0,1001}) --> Text after that '
) --> End of actual looking behind
(?(?<=')(?!'))" --> This part checks whether we are in the middle of an escaped quote '', since we cannot insert the concatenation string there

However, I get the following error:

Exception in thread "main" java.util.regex.PatternSyntaxException: Look-behind group does not have an obvious maximum length near index 161

  (?<kiloCharacters>[^
  ]{1000})(?<=(?<newLine>
  )(?<pairsAndText>[^'
  ]{0,1001}|[^
  ']{0,1001}'[^
  ']{0,1001}'[^
  ']{0,1001}){0,1001}(?<oddComa>')(?<text>[^
  ']{0,1001}))(?(?<=')(?!'))
                                                                                                                                                                   ^
      at java.util.regex.Pattern.error(Unknown Source)
      at java.util.regex.Pattern.group0(Unknown Source)
      at java.util.regex.Pattern.sequence(Unknown Source)
      at java.util.regex.Pattern.expr(Unknown Source)
      at java.util.regex.Pattern.compile(Unknown Source)
      at java.util.regex.Pattern.<init>(Unknown Source)
      at java.util.regex.Pattern.compile(Unknown Source)
      at java.lang.String.replaceAll(Unknown Source)

Why does this happen? Didn't I bound the length by using {0,1001} instead of *?

Peter Mortensen
user2586356

1 Answer

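Java's regex engine only supports look-behind whose maximum length it can determine when the pattern is compiled; it does not support look-behind of arbitrary variable length. When it cannot establish such a bound, it throws this exception. Using {0,1001} instead of * is not always enough: your look-behind applies {0,1001} to a whole group that itself contains an alternation and further quantifiers, and the engine apparently does not compute a maximum length for that, so you get this exception.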

See also the related question: Java regex error - Look-behind group does not have an obvious maximum length
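
To see which look-behind shapes a particular Java runtime accepts, a small probe along these lines may help. It is only a sketch: `LookBehindProbe` and `tryCompile` are hypothetical names, the two patterns are reduced stand-ins rather than the pattern from the question, and the result for the second one may depend on the Java version, so no expected output is given for it.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class LookBehindProbe {
    /** Returns "ok" if the regex compiles, otherwise the description from the exception. */
    static String tryCompile(String regex) {
        try {
            Pattern.compile(regex);
            return "ok";
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // A bounded quantifier applied directly to a single character.
        System.out.println(tryCompile("(?<=a{1,3})b"));
        // A bounded quantifier applied to a group containing an alternation,
        // similar in shape to pairsAndText in the question.
        System.out.println(tryCompile("(?<=(a|bc){0,3})d"));
    }
}
```

Running the probe on the target JVM shows directly whether a simplified look-behind is accepted before building the full pattern around it.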

Community
Lodewijk Bogaards