1

Given the sample HTML below, how can I get the image link `http://lis.deped.gov.ph/uis/assets/rev/2630813/images/deped-logo.gif`? If the image link starts with `//`, like `//uis/assets/rev/2630813/images/deped-logo.gif`, I will just add a string before it.

How can I do this with a regex? I do not want to use an HTTP library.

 <div class="navbar-header"><button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#deped-uis-nav-collapse"><span class="sr-only">Toggle navigation</span><span class="icon-bar"></span><span class="icon-bar"></span><span class="icon-bar"></span></button><span class="navbar-brand"><img class="logo" src="https://i.stack.imgur.com/P7HKA.gif" alt="DepEd" style="height: 20px; margin-top: -2px"></span></div>

2 Answers

3

You can use jsoup for this.

See the code below. You need to add the jsoup library to your project for it to work.

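    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String html = "<html>your html code goes here</html>";

    // Pass a base URI (assumed here from the URL in the question) so absUrl()
    // can resolve relative src values; without one it returns an empty string.
    Document doc = Jsoup.parse(html, "http://lis.deped.gov.ph/");
    Elements images = doc.getElementsByTag("img");

    for (Element el : images) {
        String src = el.absUrl("src");
        System.out.println("src attribute is: " + src);
    }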
-1

There is no need for a library here. Use a regex, because this is a simple operation and your program should be as lightweight as possible.

Something like:

    src="//(.*?)"

Then use Java's `String.replaceAll(String regex, String replacement)` to substitute your rebuilt string.
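If you want the link itself rather than a replaced string, here is a minimal sketch using `java.util.regex`. The class name `ImgSrcExtractor`, the method `extractImageLinks`, and the `prefix` parameter are made up for illustration, and the pattern assumes the `src` value is wrapped in double quotes as in the sample HTML. It is not a general HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImgSrcExtractor {
    // Matches src="..." inside an <img ...> tag; group 1 is the raw attribute value.
    // Assumption: the attribute value is double-quoted.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every img src found; values starting with "//" get the
    // caller-supplied prefix (a hypothetical parameter) prepended.
    static List<String> extractImageLinks(String html, String prefix) {
        List<String> links = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            String src = m.group(1);
            links.add(src.startsWith("//") ? prefix + src : src);
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<span class=\"navbar-brand\"><img class=\"logo\" "
                + "src=\"//uis/assets/rev/2630813/images/deped-logo.gif\" alt=\"DepEd\"></span>";
        for (String link : extractImageLinks(html, "http:")) {
            System.out.println(link);
        }
    }
}
```

Links that are already absolute are returned unchanged, so the same call works whether or not the page uses the `//` form.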
