
I am trying to create a Checkstyle rule that prevents the use of "Company.INSTANCE.getProduct", as in the line below.

private final Customer customerObj = Company.
                INSTANCE.getProduct();

I added the following module to my Checkstyle XML configuration.

<module name="RegexpMultiline">
    <property name="format" value="Company[\s\n\r\R]*\.[\s\n\r\R]*INSTANCE[\s\n\r\R]*\.[\s\n\r\R]*getProduct"/>
    <property name="message" value="Do not use Company Instance."/>
</module>

However, it does not work for multiline statements as in the above example. What am I doing wrong here? My regex works as tested in regex101.com

Zack
  • Try turning on multiline flag in the regexp `value="(?m)Company[\s\n\r\R]*\.[\s\n\r\R]*INSTANCE[\s\n\r\R]*\.[\s\n\r\R]*getProduct"` – LMC Sep 18 '18 at 22:09

2 Answers


What am I doing wrong here?

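Since you are using Java, the linebreak matcher \R (where R is uppercase) is not accepted inside a character class [...], so the expression fails to compile. Escaping the backslash in each instance makes it compile: inside the brackets, \\R then stands for a literal backslash or the letter R, and the line breaks themselves are matched by \s, which already covers \n and \r.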

So try using this regular expression instead:

Company[\s\n\r\\R]*\.[\s\n\r\\R]*INSTANCE[\s\n\r\\R]*\.[\s\n\r\\R]*getProduct

My regex works as tested in regex101.com

The regex101 web site does not support Java:

The website does not support JAVA as a flavour. The Code generator only takes your regex and puts it into a code template. It does not validate the regex for you. 

You must have been testing your regex with a different flavor, such as PHP or JavaScript, which masked the problem. However, there are plenty of other web sites that do support testing regular expressions with Java, such as freeformatter and regexplanet.

If you run the regex you were providing to Checkstyle in a tester that supports Java, you will get an "Illegal/unsupported escape sequence" error (a PatternSyntaxException).

Prefixing an additional backslash to each instance of the linebreak matcher fixes this problem.

Rather than using a web site, you can also verify your regex yourself in a trivial Java program:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String regex = "Company[\\s\\n\\r\\\\R]*\\.[\\s\\n\\r\\\\R]*INSTANCE[\\s\\n\\r\\\\R]*\\.[\\s\\n\\r\\\\R]*getProduct";
    String text = "private final Customer customerObj = Company.\n"
            + "INSTANCE.getProduct();";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(text);
    System.out.println("find? " + matcher.find());       // true: the pattern occurs somewhere in the text
    System.out.println("matches? " + matcher.matches()); // false: the pattern does not cover the whole text

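Note that in a Java string literal you need four backslashes before the R: the compiler reduces them to two, which the regex engine then reads as one literal backslash. See the question "Why String.replaceAll() in java requires 4 slashes “\\” in regex to actually replace “\”?" for some great explanations of why that is required.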
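To see the effect of the extra backslash in isolation, here is a minimal sketch that uses only java.util.regex and does not run Checkstyle itself. The class name LinebreakClassCheck and the helpers compiles and found are made up for illustration. It compares the original expression, the escaped one, and a plain \s* variant, which should be sufficient because \s already covers \n and \r.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, not part of Checkstyle.
public class LinebreakClassCheck {

    // Returns true if java.util.regex accepts the expression.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Returns true if the expression is found anywhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "private final Customer customerObj = Company.\n"
                + "                INSTANCE.getProduct();";

        // The expression from the question: \R sits inside a character class.
        String original = "Company[\\s\\n\\r\\R]*\\.[\\s\\n\\r\\R]*INSTANCE[\\s\\n\\r\\R]*\\.[\\s\\n\\r\\R]*getProduct";
        // The escaped form: the class now contains a literal backslash and the letter R.
        String escaped = "Company[\\s\\n\\r\\\\R]*\\.[\\s\\n\\r\\\\R]*INSTANCE[\\s\\n\\r\\\\R]*\\.[\\s\\n\\r\\\\R]*getProduct";
        // A simpler form that relies on \s alone to match the line break.
        String simple = "Company\\s*\\.\\s*INSTANCE\\s*\\.\\s*getProduct";

        System.out.println("original compiles? " + compiles(original));
        System.out.println("escaped finds?     " + found(escaped, text));
        System.out.println("simple finds?      " + found(simple, text));
    }
}
```

One side effect of the escaped form is that the character class also accepts a literal backslash or the letter R between the tokens, so text such as CompanyR.INSTANCE.getProduct would match as well. The \s* variant avoids that.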

skomisa

I find RegexpMultiline to be hard to use because it often has problems like yours. Instead, use the Regexp check, which allows for a simpler regex, and can ignore commented-out code:

<module name="Regexp">
    <property name="format" value="\bCompany\s*\.\s*INSTANCE\s*\.\s*getProduct\b"/>
    <property name="illegalPattern" value="true"/>
    <property name="ignoreComments" value="true"/>
    <message key="illegal.regexp" value="Do not use Company Instance."/>
</module>

Note the \b markers to prevent it from matching FooCompany and such. Note also that this check goes under the TreeWalker module.
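The effect of the \b markers can be checked outside Checkstyle with a small sketch. It exercises only java.util.regex, so it says nothing about ignoreComments or how the check is wired into TreeWalker. The class name WordBoundaryCheck, the method flagged, and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Checkstyle.
public class WordBoundaryCheck {

    // Same expression as the "format" property above, written as a Java string literal.
    static final Pattern FORMAT =
            Pattern.compile("\\bCompany\\s*\\.\\s*INSTANCE\\s*\\.\\s*getProduct\\b");

    // Returns true if the expression occurs anywhere in the given source text.
    static boolean flagged(String source) {
        return FORMAT.matcher(source).find();
    }

    public static void main(String[] args) {
        // The statement from the question, split across two lines.
        String multiLine = "private final Customer customerObj = Company.\n"
                + "                INSTANCE.getProduct();";
        // A different class whose name merely ends in "Company".
        String otherClass = "private final Customer customerObj = FooCompany.INSTANCE.getProduct();";
        // A different method whose name merely starts with "getProduct".
        String otherMethod = "Company.INSTANCE.getProductName();";

        System.out.println("multi-line call flagged? " + flagged(multiLine));
        System.out.println("FooCompany flagged?      " + flagged(otherClass));
        System.out.println("getProductName flagged?  " + flagged(otherMethod));
    }
}
```

Only the first sample should be reported. The leading \b rejects FooCompany and the trailing \b rejects getProductName.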

barfuin