0

I have a string like the one below:

<a name="1.F"></a>

I'm trying to extract only the value of the name attribute. The regex I'm using to do this is:

Search string

<a name=\"(\d)+(\.)(\w)+\"?>

Replace String

$1$2$3

Result: the part that the search matches is shown between the ** markers below.

Search result:

**<a name="1.F">**</a>

Current Replace Result:

1.F</a>

Expected Replace Result:

1.F

Please let me know how I can get the expected result.

Thanks

user3872094
  • 1
    Use `[^"]` instead of `\w`? Also, you might want the quantifiers inside the groups. – Andy Turner Nov 23 '16 at 09:19
  • Your `(\w)+` is matching till the end. Use `[^"]` – Anshul Rai Nov 23 '16 at 09:20
  • 1
    Is the string always in this format? Do you really need to pre-validate it with `\d+\.\w+`? Note that all you need to extract a string between double quotes is a couple of string methods. Splitting with `"` and getting the second item could also be an option. – Wiktor Stribiżew Nov 23 '16 at 09:22
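The suggestion above to move the quantifiers inside the groups can be sketched as follows. This is only an illustration: the input `12.AB` is a hypothetical multi-character value (the question only shows `1.F`), and the class and method names are made up for the example. With `(\d)+` a group keeps only its last repetition, while `(\d+)` captures the whole run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierPlacementDemo {

    // Quantifier OUTSIDE the group: each group keeps only its last repetition.
    static final Pattern OUTSIDE = Pattern.compile("<a name=\"(\\d)+(\\.)(\\w)+\">");

    // Quantifier INSIDE the group: each group captures the whole run.
    static final Pattern INSIDE = Pattern.compile("<a name=\"(\\d+)(\\.)(\\w+)\">");

    // Concatenates groups 1..3 of the first match, or returns null if nothing matches.
    static String joinGroups(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) + m.group(2) + m.group(3) : null;
    }

    public static void main(String[] args) {
        String input = "<a name=\"12.AB\"></a>"; // hypothetical input, not from the question
        System.out.println(joinGroups(OUTSIDE, input)); // 2.B
        System.out.println(joinGroups(INSIDE, input));  // 12.AB
    }
}
```

For the single-character value in the question both patterns give the same result, which is why the difference is easy to miss.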

2 Answers

3

Given your example, why not go for a simple solution? Your value is enclosed in double quotes, so just use that knowledge and capture everything between those two quotes.

import java.util.regex.*;

public class RegexTest {

  public static void main(String[] args) {
    String s = "<a name=\"1.F\"></a>";
    // Capture everything between the double quotes that follow name=
    Pattern p = Pattern.compile("<a name=\"(.*)\"");
    Matcher m = p.matcher(s);
    if (m.find()) {
      // Group 1 holds the text between the quotes
      System.out.println(m.group(1));
    }
  }
}

prints:

1.F

You see, you wouldn't even need a regex for that; you could use String.indexOf() to quickly find the positions of the two quotation marks and take the substring between them.
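A minimal sketch of that regex-free approach, assuming the value of interest sits between the first pair of double quotes in the string. The class and method names are hypothetical.

```java
class QuoteExtractor {

    // Returns the text between the first pair of double quotes,
    // or null if the string does not contain two quotes.
    static String betweenFirstQuotes(String s) {
        int start = s.indexOf('"');
        if (start < 0) {
            return null;
        }
        int end = s.indexOf('"', start + 1);
        if (end < 0) {
            return null;
        }
        return s.substring(start + 1, end);
    }

    public static void main(String[] args) {
        System.out.println(betweenFirstQuotes("<a name=\"1.F\"></a>")); // 1.F
    }
}
```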

If, on the other hand, your example was simplified and your real input is much more complicated (as in real HTML or XML), then simply forget about using regexes in the first place (see here for why that is).

GhostCat
  • Hi @GhostCat, this doesn't seem to work, since the `.*` will keep capturing until ``; moreover, I posted this after verifying it. :) – user3872094 Nov 23 '16 at 09:33
  • 1
    @GhostCat, your regex would capture everything between `"` so it was correct. The result was in the first capture group. – AxelH Nov 23 '16 at 09:40
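To illustrate the point debated in these comments: on the question's input the greedy `.*` does yield 1.F, because the string contains only two double quotes. The sketch below assumes a hypothetical input with a second attribute (not from the question) to show where greedy matching would overshoot and how the `[^"]` class suggested in the question's comments avoids that. All names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsNegatedClass {

    // Greedy: .* runs to the LAST double quote in the input.
    static final Pattern GREEDY = Pattern.compile("<a name=\"(.*)\"");

    // Negated class: [^"]* stops at the FIRST closing double quote.
    static final Pattern NEGATED = Pattern.compile("<a name=\"([^\"]*)\"");

    // Returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String simple = "<a name=\"1.F\"></a>";           // input from the question
        String twoAttrs = "<a name=\"1.F\" id=\"x\"></a>"; // hypothetical input

        System.out.println(firstGroup(GREEDY, simple));    // 1.F
        System.out.println(firstGroup(GREEDY, twoAttrs));  // 1.F" id="x
        System.out.println(firstGroup(NEGATED, twoAttrs)); // 1.F
    }
}
```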
1

UPDATED

@user3872094 I have just tested my code with your pattern <a name=\"(\d)+(\.)(\w)+\"?> and it outputs 1.F, so I suspect the bug is in your Java code.

    String s = "**<a name=\"1.F\">**</a>";
    Pattern p = Pattern.compile("<a name=\"(\\d)+(\\.)(\\w)+\"?>");
    Matcher m = p.matcher(s);
    while (m.find()) {
        System.out.println(m.group(1) + m.group(2) + m.group(3));
    }

=>> 1.F
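One possible explanation for the `1.F</a>` result in the question, assuming the replacement was done with something like `String.replaceAll` (the question does not show that code): a replace substitutes only the matched portion, and the pattern stops at `>`, so the trailing `</a>` is left untouched. The sketch below reproduces that and shows one way to get the expected output by extending the pattern to consume the closing tag. The class, constant, and method names are hypothetical.

```java
class ReplaceDemo {

    // Pattern from the question: the match ends at '>' and does not include "</a>".
    static final String ORIGINAL_PATTERN = "<a name=\"(\\d)+(\\.)(\\w)+\"?>";

    // Extended pattern: one group for the value, and the match also consumes "</a>".
    static final String EXTENDED_PATTERN = "<a name=\"(\\d+\\.\\w+)\"></a>";

    static String replaceWith(String input, String regex, String replacement) {
        return input.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        String s = "<a name=\"1.F\"></a>";
        System.out.println(replaceWith(s, ORIGINAL_PATTERN, "$1$2$3")); // 1.F</a>
        System.out.println(replaceWith(s, EXTENDED_PATTERN, "$1"));     // 1.F
    }
}
```

Printing the capture groups, as in the snippet above, sidesteps the issue because nothing outside the match is carried over into the output.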

Antony Dao