Regular Expression: Why do I get no match found

Question

I am trying to parse a document that consists of many sections.

Each section begins with :[]: followed by a blank space, followed by one or more characters (any characters), followed by a :, a blank space, and one or more characters (any characters).

Here's an example:

:[]: Abet1, Abetted34: Find the usage in table under section 1-CB-45: Or more info from the related section starting with PARTIE-DU-CORPS.
:[]: Ou est-ce que tu a mal: Tu as mal aux jambes: Find usage in section 145-TT-LA-TETE.

The token of interest from each section is everything from :[]: to the first occurrence of :. For example, in the first section, I am only interested in extracting: :[]: Abet1, Abetted34:

At first, I used the following pattern finder to extract the token from each section of the document but this extracted everything from the first occurrence of : to the last occurrence of : in the section:

"\\B:\\[\\]:.*:\\B"

If I change the pattern finder to the following to extract the token from :[]: to the first occurrence of :, I get no match:

"\\B:\\[\\]:\\s*.:{1}"

How would the regular expression that extracts what I want look like?

When you say that `:[]: _` (where the underscore stands for a space) should be followed by *any* character until the first `:`, you're contradicting yourself. Clearly, *any* character won't do, since `:` is also a character. – Janez Kuhar, Oct 09 '20 at 15:27
That's correct, ':' also counts as "any character", but I have tried many variations and I'm not sure how to exclude ':' from the characters that may match. – Darvin, Oct 09 '20 at 15:39

VietDD · Accepted Answer · 2020-10-09T15:59:21.033

3

Is this what you want?

See more : https://regex101.com/r/jOmnSb/2

Or

See more : https://regex101.com/r/jOmnSb/3

UPDATE :

You can convert regex to Java regex here : https://www.regexplanet.com/advanced/java/index.html
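For Java specifically, here is a minimal sketch of one way to stop at the first colon, using a reluctant quantifier (.+?). It is only an illustration and not necessarily the pattern behind the links above; the class name ReluctantMatchSketch and the helper firstToken are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReluctantMatchSketch {
    // The regex :\[\]:\s.+?: written as a Java string literal (each backslash doubled).
    // .+? is reluctant: it takes as few characters as possible before the next colon.
    static final Pattern TOKEN = Pattern.compile(":\\[\\]:\\s.+?:");

    // Hypothetical helper: returns the first token in the section, or null if none is found.
    static String firstToken(String section) {
        Matcher m = TOKEN.matcher(section);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String section =
            ":[]: Abet1, Abetted34: Find the usage in table under section 1-CB-45: Or more info.";
        System.out.println(firstToken(section)); // prints ":[]: Abet1, Abetted34:"
    }
}
```

The reluctant form and a negated character class such as [^:]+ give the same result on this input; the negated class states the intent ("anything except a colon") more directly.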

edited Oct 09 '20 at 15:59

answered Oct 09 '20 at 15:42


Java regexes need extra escaping, though – m0skit0 Oct 09 '20 at 15:56
yeah, I found a site to convert regex to java regex : https://www.regexplanet.com/advanced/java/index.html – VietDD Oct 09 '20 at 15:58
Thanks, the 2nd pattern is the one I want. The first one drops the ":" after abetted – Darvin Oct 09 '20 at 16:24

Janez Kuhar · Answer 2 · 2020-10-18T11:52:32.847

So you want to match a string consisting of:

:[]:_ (where _ is a space character),
followed by one or more characters that are not a : (refer to this question),
closed by a : character.

The regex for that would be:

:\[\]: [^:]+:

When you write the pattern as a Java string literal, each \ must itself be escaped as \\. You could do something like:

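import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchTest {
    public static void main(String[] args) {
        // Regex :\[\]: [^:]+: as a Java string literal; [^:]+ means "one or more non-colon characters".
        // No flags are needed: the pattern contains no letters, so case sensitivity is irrelevant.
        Pattern pattern = Pattern.compile(":\\[\\]: [^:]+:");
        Matcher matcher =
            pattern.matcher(
                ":[]: Abet1, Abetted34: Find the usage in table under section 1-CB-45: Or more info from the related section starting with PARTIE-DU-CORPS.\n"
              + ":[]: Ou est-ce que tu a mal: Tu as mal aux jambes: Find usage in section 145-TT-LA-TETE."
            );
        // find() advances through the input, so this prints one token per section.
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}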
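To see why the two patterns from the question behave as described, the following sketch runs them next to the negated-class pattern on the first sample section. The class name PatternComparison and the helper firstMatch are hypothetical, introduced only for this comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternComparison {
    // Hypothetical helper: returns the first match of regex in input, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String section =
            ":[]: Abet1, Abetted34: Find the usage in table under section 1-CB-45: "
          + "Or more info from the related section starting with PARTIE-DU-CORPS.";

        // Greedy .* consumes as much as it can, so the match ends at the LAST colon.
        System.out.println(firstMatch("\\B:\\[\\]:.*:\\B", section));

        // A lone . allows exactly one character before the closing colon, so nothing matches.
        System.out.println(firstMatch("\\B:\\[\\]:\\s*.:{1}", section));

        // [^:]+ cannot cross a colon, so the match ends at the FIRST colon after the marker.
        System.out.println(firstMatch(":\\[\\]: [^:]+:", section));
    }
}
```

The second pattern fails because . without a quantifier stands for a single character, and :{1} is just a single : again; the text between the marker and the first colon is longer than one character.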

Thanks, this is what I wanted. When I tried this yesterday, instead of [^:]*:", I was using [^:].*:", and didn't know that '*' can be used alone without the '.' — Darvin, Oct 09 '20 at 16:12
