
I have this piece of HTML code. I want to get the links contained in three separate constructs (href, src, and url(...)). This is what I've tried so far:

    String texto2 = "url(\"primeiro url\")\n" +
    "url('2 url')\n" +
    "href=\"1 href\"\n" +
    "src=\"1 src\"\n" +
    "src='2 src'\n" +
    "url('3 url')\n" +
    "\n" +
    ".camera_target_content .camera_link {\n" +
    "   background: url(../images/blank.gif);\n" +
    "   display: block;\n" +
    "   height: 100%;\n" +
    "   text-decoration: none;\n" +
    "}";

    // Expression to capture the links from src, href, and url(...)
    String exp = "(?:href|src)=[\"'](.+)[\"']+|(?:url)\\([\"']*(.*)[\"']*\\)";
    Pattern pattern = Pattern.compile(exp);

    // Prepare the matcher
    Matcher matcher = pattern.matcher(texto2);

    // Find each match and print the whole matched text
    while (matcher.find()) {
        System.out.println(texto2.substring(matcher.start(), matcher.end()));
    }

So far, so good: it works with find, but I need to get just the clean link, something like this:

    img/image.gif

and not:

    href="img/image.gif"
    src="img/image.gif"
    url(img/image.gif)

I want to capture just the link in a variable; this is what I've tried so far:

    String texto2 = "url(\"primeiro url\")\n" +
    "url('2 url')\n" +
    "href=\"1 href\"\n" +
    "src=\"1 src\"\n" +
    "src='2 src'\n" +
    "url('3 url')\n" +
    "\n" +
    ".camera_target_content .camera_link {\n" +
    "   background: url(../images/blank.gif);\n" +
    "   display: block;\n" +
    "   height: 100%;\n" +
    "   text-decoration: none;\n" +
    "}";

    // Expression to capture the links from src, href, and url(...)
    String exp = "(?:href|src)=[\"'](.+)[\"']+|(?:url)\\([\"']*(.*)[\"']*\\)";
    Pattern pattern = Pattern.compile(exp);

    // Prepare the matcher
    Matcher matcher = pattern.matcher(texto2);

    // Find each match and print only the second capture group
    while (matcher.find()) {
        String s = matcher.group(2);
        System.out.println(s);
    }

It turns out that this version does not work as expected, even though it grabs the url values perfectly. Can someone help me spot the problem?

OnoSendai
deFreitas

1 Answer

Use jsoup. Parse the HTML string into a DOM and you can then use CSS selectors to pull out the values as you would with jQuery in JavaScript. Note that this will only work if you're actually working with HTML; the string at the top of your example is not HTML.
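For input like the string in the question, which mixes CSS and loose attribute fragments and so is not parseable HTML, a plain java.util.regex approach may be closer to what was asked. The following is only a sketch under a few assumptions: the class name LinkExtractor and the method extract are made up for illustration, and the pattern is a reworked version of the one in the question, not a general HTML or CSS parser. It illustrates two points: in an alternation only one capture group takes part in each match (the other returns null, which is why group(2) alone yields nothing for href and src), and lazy quantifiers (.+? and .*?) stop at the first closing quote or parenthesis instead of swallowing it.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    final class LinkExtractor {

        // Alternative 1: href="..." or src='...'  -> link lands in group 1.
        // Alternative 2: url(...), quotes optional -> link lands in group 2.
        // Both groups are lazy so they stop before the closing quote or ')'.
        private static final Pattern LINK = Pattern.compile(
                "(?:href|src)\\s*=\\s*[\"'](.+?)[\"']"
                + "|url\\(\\s*[\"']?(.*?)[\"']?\\s*\\)");

        // Hypothetical helper: returns every link found, in order of appearance.
        static List<String> extract(String text) {
            List<String> links = new ArrayList<>();
            Matcher m = LINK.matcher(text);
            while (m.find()) {
                // The group belonging to the alternative that did not match is null.
                String link = m.group(1) != null ? m.group(1) : m.group(2);
                links.add(link);
            }
            return links;
        }

        public static void main(String[] args) {
            String sample = "url(\"primeiro url\")\n"
                    + "href=\"1 href\"\n"
                    + "src='2 src'\n"
                    + "background: url(../images/blank.gif);";
            // Prints one link per line, without quotes or wrappers.
            extract(sample).forEach(System.out::println);
        }
    }

Run on the sample above, this should print primeiro url, 1 href, 2 src and ../images/blank.gif, one per line. For real HTML documents, a parser such as jsoup remains the more robust choice, since the pattern does not check that the opening and closing quotes are the same character and knows nothing about comments or escaping.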

asthasr