I need to parse an HTML page with Java to extract some data.
For example, given incoming.html:
<html>
<head>
<title>TITLE</title>
<meta name="some name" content="some content" />
<link type=".." title=".." rel=".." href="link" />
<script type="text/javascript">..</script>
</head>
<body>
<!--googleoff:all-->
<img src="image.jpg"/>
<div class="div1"></div>
<div class="Logo"><a href="/"><img src="logo.png"/></a></div>
<div class="div2"></div>
<ul>
<li class=".."><a href="/">a</a></li>
<li class=".."><a href="/">b</a></li>
</ul>
<div class="div1"></div>
<div class="Logo"><a href="/"><img src="other.png"/></a></div>
<div class="div2"></div>
<ul>
<li class=".."><a href="/">a</a></li>
<li class=".."><a href="/">b</a></li>
</ul>
<!--googleon:all-->
</body>
</html>
I need to produce outcoming.html:
<html>
<head>
<title>TITLE</title>
<meta name="some name" content="some content" />
<link type=".." title=".." rel=".." href="link" />
<script type="text/javascript">..</script>
</head>
<body>
<div class="Logo"><a href="/"><img src="other.png"/></a></div>
<div class="div2"></div>
</body>
</html>
The core of the question:
How do I choose between two otherwise identical tags that differ only in their contents?
In my case I have two tags:
<div class="Logo"><a href="/"><img src="logo.png"/></a></div>
and
<div class="Logo"><a href="/"><img src="other.png"/></a></div>
but I only need the tag where src="other.png".
What do you think is the best way to do this?
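To make the goal concrete, here is a minimal sketch of the kind of selection I mean, using only the JDK's built-in XML parser and XPath. It assumes the page is well-formed XML (XHTML), which real-world HTML often is not, and the names `LogoSelector` and `selectLogoDiv` are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; assumes the input is well-formed XHTML.
public class LogoSelector {

    // Returns the serialized <div class="Logo"> that contains an <img>
    // with the given src attribute, or null if no such div exists.
    static String selectLogoDiv(String xhtml, String imgSrc) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Pass the src value as an XPath variable instead of concatenating
        // it into the expression, so quotes in the value cannot break it.
        xpath.setXPathVariableResolver(name -> imgSrc);

        // The second predicate keeps only the div whose descendant <img>
        // has the requested src; the class alone matches both divs.
        Node div = (Node) xpath.evaluate(
                "//div[@class='Logo'][.//img[@src=$src]]",
                doc, XPathConstants.NODE);
        if (div == null) {
            return null;
        }

        // Serialize just the matched element back to markup.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(div), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Cut-down version of incoming.html (single-quoted attributes
        // are used only to keep the Java string readable).
        String xhtml =
              "<html><body>"
            + "<div class='Logo'><a href='/'><img src='logo.png'/></a></div>"
            + "<div class='div2'></div>"
            + "<div class='Logo'><a href='/'><img src='other.png'/></a></div>"
            + "<div class='div2'></div>"
            + "</body></html>";
        System.out.println(selectLogoDiv(xhtml, "other.png"));
    }
}
```

The point is the predicate on the nested `img`: the class alone cannot tell the two divs apart, so the match has to look at the contents. For HTML that is not well-formed, a lenient parser such as jsoup would probably be a better fit; as far as I can tell its selector `div.Logo:has(img[src=other.png])` expresses the same condition.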