I have some HTML in which several divs share the same id. Can I extract the content of the second one?

HTML code

<div id="test>example </div>
<div id ="test">example11</div>

I need to extract example11.

This works: (?s)<div id="test>.*<div id ="test">(.*?)</div> but I have a lot of divs with the same id, so it won't scale. Can anyone tell me whether there is another way to extract the content?

I know regex is not good for HTML parsing, but I have no choice.

Kathick
  • It's a very bad idea to use the same id twice... – Nitram76 May 02 '13 at 09:52
  • Why did you say "I have no choice"? Regexp is the right choice for the lexical layer, not for the grammar layer. – Aubin May 02 '13 at 09:53
  • I know, but the HTML content is not mine; I just need to parse it. – Kathick May 02 '13 at 09:53
  • @Aubin For HTML we have a lot of parsers like Jsoup, etc., and it would be very easy to parse with them, but I can only parse using regex. That's why I said I have no choice. – Kathick May 02 '13 at 09:54
  • Don't parse HTML or XML with regex or [Cthulhu will claim your soul](http://stackoverflow.com/a/1732454/342852). Use a parser like JSoup instead! – Sean Patrick Floyd May 02 '13 at 09:57

1 Answer

Try this:

(?s)<div[^>]*>.*?</div>\s*<div[^>]*>(.*?)</div>

Now you can select the first capture group, and it's done ;)

A dirty solution for a div further down would be to repeat the pattern:

<div.*>.*</div><div.*>.*</div><div.*>.*</div><div.*>.*</div><div.*>.*</div><div.*>.*</div><div.*>.*</div><div.*>.*</div><div.*>.*</div><div.*>.*</div><div.*>(.*)</div>

Hehe, but I'm not proud of this one, of course... I'll think about it.
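
The copy-pasted pattern above can probably be shortened with a counted repetition that skips the first n-1 divs and captures the n-th. The following is only a rough sketch in Python (the question does not say which regex engine is available): the helper name `nth_div_pattern`, the third sample div, and the corrected `id="test"` quoting on the first div are assumptions made for illustration. It also assumes the divs are not nested, since a lazy `.*?` stops at the first `</div>` it sees.

```python
import re

# Sample input modelled on the question. The well-formed id attribute on the
# first div and the third div are assumptions added for illustration.
html = '''<div id="test">example </div>
<div id ="test">example11</div>
<div id="test">example12</div>'''


def nth_div_pattern(n):
    """Build a regex whose group 1 is the body of the n-th (1-based) div."""
    # A div that is skipped: lazy body, no capture group.
    skipped = r'<div\s+id\s*=\s*"test"\s*>.*?</div>'
    # The div we want: same shape, but the body is captured.
    wanted = r'<div\s+id\s*=\s*"test"\s*>(.*?)</div>'
    # (?s) lets "." match newlines; {n-1} repeats the skipped div plus
    # whatever text separates it from the next one.
    return r'(?s)(?:%s.*?){%d}%s' % (skipped, n - 1, wanted)


match = re.search(nth_div_pattern(2), html)
print(match.group(1))  # example11
```

With this shape, asking for the 13th div only changes the repetition count to `{12}` rather than adding twelve more copies of the pattern. It remains fragile against nested divs, attributes in a different order, or single-quoted ids, which is why the comments recommend a real parser where one is allowed.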

Dennis Anderson
  • Yes, I can, but if I have some 14 divs with the same id or class and I need to get the 13th one, what should I do? Even this works fine: (.*?), which is already in the question. – Kathick May 02 '13 at 09:57
  • How do you know you need the 13th and not the 11th? What's the discriminant? – Aubin May 02 '13 at 09:59