0

from this -> contractor:"Hi, this is \"Paul\", how are you?" client:"Hi ...." <-

I want to get just -> Hi, this is \"Paul\", how are you? <-

I need a regular expression in Java to do that. I have tried, but I am struggling with the inner escaped quotation marks (\"), which are driving me mad.

Thanks for any hint.

Arvind Kumar Avinash
Xenione
  • Do you have to use regex? You could just strip out the redundant text around the keywords by using `.indexOf()` and `.substring()`. – Spectric Oct 29 '20 at 15:11
  • So, briefly: you need to get the text inside a pair of double quotes, `"the text to be matched"`, and this text may contain other double quotes preceded by a backslash (`\"`)? – Nowhere Man Oct 29 '20 at 15:20
  • @AlexRudenko Yes, Alex. I edited the question because the editor doesn't skip the inner quotation marks. – Xenione Oct 29 '20 at 15:23
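
For reference, the `.indexOf()` / `.substring()` idea from the first comment could look roughly like the sketch below. The class name `SubstringDemo` and the helper `between` are made up for illustration, and the approach assumes the text is always framed by the literal markers `contractor:"` and `" client:`.

```java
class SubstringDemo {
    // Hypothetical helper: returns the text between the first startMarker
    // and the next endMarker after it, or null if either marker is missing.
    static String between(String input, String startMarker, String endMarker) {
        int start = input.indexOf(startMarker);
        if (start < 0) {
            return null;
        }
        start += startMarker.length();
        int end = input.indexOf(endMarker, start);
        if (end < 0) {
            return null;
        }
        return input.substring(start, end);
    }

    public static void main(String[] args) {
        String input = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
        // Assumed markers: contractor:" on the left and " client: on the right
        System.out.println(between(input, "contractor:\"", "\" client:"));
    }
}
```

For the sample input this prints Hi, this is \"Paul\", how are you? and it returns null when either marker is missing.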

4 Answers

1

Java supports lookbehinds, so you can lazily match up to the first double quote that is not preceded by a backslash. The plain regex:

"(.*?(?<!\\))"

Inside a Java string (see https://stackoverflow.com/a/37329801/1225328):

\"(.*?(?<!\\\\))\"

The actual text will be contained inside the first group of each match.

Demo: https://regex101.com/r/8OXujX/2


For example, in Java:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String regex = "\"(.*?(?<!\\\\))\"";
        String input = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);
        // Use while (matcher.find()) instead to iterate through all the matches
        if (matcher.find()) {
            System.out.println(matcher.group(1)); // group 1 is the text between the quotes
        } else {
            System.out.println("No matches");
        }
    }
}

Prints:

Hi, this is \"Paul\", how are you?
sp00m
1

The regexp should be like this: "(?:\\.|[^"\\])*"

Online demo

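It uses a non-capturing group (?:...) that matches either an escaped character (a backslash followed by any character, \\.) or a single character that is neither a double quote nor a backslash ([^"\\]). The group is repeated any number of times between the opening and closing double quotes.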
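
As a rough sketch of how this pattern could be used from Java: the version below wraps the repeated part in a capturing group so that the text between the quotes can be read with `group(1)`. The capturing group, the class name `QuotedTextDemo` and the helper `quotedTexts` are illustrative additions, not part of the pattern above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTextDemo {
    // Plain regex: "((?:\\.|[^"\\])*)"
    // Each backslash and double quote is escaped again for the Java string literal.
    private static final Pattern QUOTED =
            Pattern.compile("\"((?:\\\\.|[^\"\\\\])*)\"");

    // Hypothetical helper: collects the contents of every double-quoted section.
    static List<String> quotedTexts(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1)); // text between the quotes, escapes kept as-is
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
        // The first quoted section is the contractor's text
        System.out.println(quotedTexts(input).get(0));
    }
}
```

With the sample input, the first element of the list is Hi, this is \"Paul\", how are you? and the second is Hi ....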

Nowhere Man
0
import java.util.regex.Pattern;

var text1 = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\" <-";

// Capture the quoted text including its surrounding quotes
var regExWithQuotation = "contractor:(.+\".+\".+) client:";
Pattern p = Pattern.compile(regExWithQuotation);
var m = p.matcher(text1);
if (m.find()) {
    var res = m.group(1);
    System.out.println(res);
}

// Capture the quoted text without its surrounding quotes
var regExWithoutQuotation = "contractor:\"(.+\".+\".+)?\" client:";
p = Pattern.compile(regExWithoutQuotation);
m = p.matcher(text1);
if (m.find()) {
    var res = m.group(1);
    System.out.println(res);
}

Output is:

"Hi, this is \"Paul\", how are you?"

Hi, this is \"Paul\", how are you?

Bassem Adas
0

You can use the regex (?<=contractor:\").*(?=\" client:)

Description of the regex:

  1. (?<=contractor:\") specifies a positive lookbehind for contractor:\"
  2. .* matches any sequence of characters (zero or more, as many as possible)
  3. (?=\" client:) specifies a positive lookahead for \" client:

In short, it matches anything preceded by contractor:\" and followed by \" client:

Demo:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String str = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
        String regex = "(?<=contractor:\").*(?=\" client:)";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}

Output:

Hi, this is \"Paul\", how are you?
Arvind Kumar Avinash