I am using java.util.regex to match a regular expression against a string. The string is basically an HTML string.

Within that string I have two lines:

    <style>templates/style/color.css</style>

    <style>templates/style/style.css</style>

My requirement is to get the content inside the style tag (<style>). I am using this pattern:

String stylePattern = "<style>(.+?)</style>";

When I try to get the result using:

Pattern styleRegex = Pattern.compile(stylePattern);
Matcher matcher = styleRegex.matcher(html);
System.out.println("Matcher count : "+matcher.groupCount()+ " and "+matcher.find());                 //output 1

if(matcher.find()) {

        System.out.println("Inside find");
        for (int i = 0; i < matcher.groupCount(); i++) {
            String matchSegment = matcher.group(i);
            System.out.println(matchSegment);          //output 2
        }
    }

The result I get from output 1 is:

Matcher count : 1 and true

And from output 2:

<style>templates/style/style.css</style>

After a lot of trying, I am lost: how do I get both lines? I tried many other suggestions on Stack Overflow itself, but none worked.

I think I am making some conceptual mistake.

Any help would be much appreciated. Thanks in advance.

EDIT

I have changed the code to:

Matcher matcher = styleRegex.matcher(html);
    //System.out.println("find : "+matcher.find() + "Groupcount = " +matcher.groupCount());
    //matcher.reset();
    int i = 0;
    while(matcher.find()) {

        System.out.println(matcher.group(i));
        i++;
    }

Now the result is:

    <style>templates/style/color.css</style>
    templates/style/style.css

Why is one line printed with the style tag and the other without it?

KOUSIK MANDAL
  • To see all *captures*, use `for (int i = 1; i <= matcher.groupCount(); i++)`. See https://stackoverflow.com/questions/12413974/java-regex-matcher-groupcount-returns-0 and http://stackoverflow.com/questions/836704/print-regex--in-java – Wiktor Stribiżew Apr 01 '18 at 15:37
  • Didn't work for me. It does not even go inside the loop with int i=1 – KOUSIK MANDAL Apr 01 '18 at 15:40
  • You cannot call `matcher.groupCount()` before `.find()`. Also, to match multiple occurrences, you need `while`, not `if`. Remove your first `System.out.println(...)` line – Wiktor Stribiżew Apr 01 '18 at 15:43
  • I have changed it as you said, @WiktorStribiżew. Now can you please explain the result I added in my edit to the question? – KOUSIK MANDAL Apr 01 '18 at 15:51
  • Your regex has only one capturing group, so you need no variable i; just use group(1). group(0) is a special one which gives you the whole match regardless of any brackets, which is why you see the style tags. – Heiner Westphal Apr 01 '18 at 16:02
  • **Caution:** Parsing XML or HTML with regular expressions *will* cause problems. See https://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg. Solution: Use a dedicated HTML or XML parser instead. – VGR Apr 01 '18 at 17:13
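
The comments about the stray `find()` call can be illustrated with a small sketch. Each call to `Matcher.find()` moves on to the next match, so the `find()` inside the first `System.out.println` in the question already consumed the first `<style>` line, and the later `if (matcher.find())` only saw the second one. The class and method names below are hypothetical and exist only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAdvancesDemo {

    // Hypothetical helper: calls find() once up front (as the first println in the
    // question did), then returns what the next find() matches, or null if none.
    static String matchAfterExtraFind(String html) {
        Matcher matcher = Pattern.compile("<style>(.+?)</style>").matcher(html);
        matcher.find();                 // consumes the first match
        return matcher.find() ? matcher.group() : null;
    }

    // Same as above, but reset() rewinds the matcher before searching again.
    static String matchAfterReset(String html) {
        Matcher matcher = Pattern.compile("<style>(.+?)</style>").matcher(html);
        matcher.find();                 // consumes the first match
        matcher.reset();                // back to the start of the input
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String html = "<style>templates/style/color.css</style>\n"
                + "<style>templates/style/style.css</style>";
        System.out.println(matchAfterExtraFind(html));
        System.out.println(matchAfterReset(html));
    }
}
```

With the two-line input from the question, the first call returns the `style.css` tag and the second returns the `color.css` tag. Dropping the extra `find()` altogether, as suggested in the comments, is simpler than calling `reset()`.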

2 Answers

This will find all occurrences in your string.

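   // regex is the pattern string, e.g. "<style>(.+?)</style>"; string is the HTML input
   final Pattern pattern = Pattern.compile(regex);
   final Matcher matcher = pattern.matcher(string);

   // Each find() call advances to the next match
   while (matcher.find()) {
       // group() is the whole match, tags included
       System.out.println("Full match: " + matcher.group());
   }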
SharadxDutta

You can try this:

String text = "<style>templates/style/color.css</style>\n" +
            "<style>templates/style/style.css</style>";

Pattern pattern = Pattern.compile("<style>(.+?)</style>");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println(text.substring(matcher.start(), matcher.end()));
}

Or:

Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
  System.out.println(matcher.group());
}
hoan
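
Both answers print the whole match, tags included. The question asks for the content inside the tag, which, as Heiner Westphal's comment notes, is capturing group 1 rather than group 0. A minimal sketch of that variant follows; the class name `StyleContents` and the method `extract` are made up for this example, and it assumes each `<style>` element sits on a single line, as in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StyleContents {

    // Same pattern as in the question; the parentheses form capturing group 1.
    private static final Pattern STYLE = Pattern.compile("<style>(.+?)</style>");

    // Hypothetical helper: collects the text between every <style>...</style> pair.
    static List<String> extract(String html) {
        List<String> contents = new ArrayList<>();
        Matcher matcher = STYLE.matcher(html);
        while (matcher.find()) {
            // group(0) would be the full match with tags; group(1) is only the inner text
            contents.add(matcher.group(1));
        }
        return contents;
    }

    public static void main(String[] args) {
        String html = "<style>templates/style/color.css</style>\n"
                + "<style>templates/style/style.css</style>";
        System.out.println(extract(html));
    }
}
```

For the two lines from the question this prints `[templates/style/color.css, templates/style/style.css]`. As the caution in the comments says, a dedicated HTML parser is the more robust choice for real pages.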