
I have the following HTML:

<table id="needle"><tr><td>X</td></tr><table>...</table><table>...</table></table>

I have found the position of X, and from there the position of the < that opens the #needle table. How do I find the position of the last >, the one that ends the closing tag matching #needle?

CodeCaster
Vyacheslav Loginov
  • *(related)* [Best Methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html/3577662#3577662) – Gordon Nov 15 '11 at 09:22
  • *(reference)* http://php.net/manual/en/domdocument.getelementbyid.php – Gordon Nov 15 '11 at 09:24
  • possible duplicate of [read XML tag id from php](http://stackoverflow.com/questions/3035310/read-xml-tag-id-from-php) – Gordon Nov 15 '11 at 09:25
  • @Gordon It's not really a duplicate. He's looking for the end tag, not the content. – Madara's Ghost Nov 15 '11 at 09:29
  • @Truth To do what? What is the purpose of looking for the end tag if it's not for doing something with the content afterwards? Even if the OP really just wants to know the position, it is much easier to get that information from a parser. – Gordon Nov 15 '11 at 09:47
  • To know where the tag ends, for instance? To add an element after it ends but before the rest of the document continues? There are a million things. Even if he does intend to use the content, the question is not an ***exact*** duplicate, as the answers to it will be entirely different. – Madara's Ghost Nov 15 '11 at 09:50
  • @Truth All that implies that the OP wants to work with the DOM tree, so using a parser is the better choice. Also, if we only close on *exact duplicates* we won't close anything at all because of the many subtle differences. One guy wants to parse divs, the other links, etc. The linked duplicate is *exact enough*. – Gordon Nov 15 '11 at 09:53
  • @VyacheslavLoginov please clarify the question. What is it you want to do? Your current approach sounds like you are doing it in a more difficult way than it has to be. – Gordon Nov 15 '11 at 10:01
  • I have already parsed my page with simple_html_dom; now I want to cut out the excess tags – Vyacheslav Loginov Nov 15 '11 at 10:10
  • What do you mean by excess tags? Please provide an example with input and output so we can see what you are trying to do. It sounds like you are doing it wrong right now. Do you want to get the innerHTML (in SimpleHTMLDOM: innerText attribute of #needle element)? – Gordon Nov 15 '11 at 10:11
  • It's post-parsing manipulation; I want to remove some parts of the parsed content – Vyacheslav Loginov Nov 15 '11 at 10:14
  • That is as clear as mud. Please give an input and output example. And also show us some of your code. – Gordon Nov 15 '11 at 10:17

1 Answer


You could use html5lib. Less recommended, but it also works: YQL by Yahoo.
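
As a rough illustration of the parser route, here is a minimal sketch. It does not use html5lib; it swaps in PHP's built-in DOMDocument and DOMXPath classes, in the spirit of the getElementById link in the comments. The helper name insertAfterId(), the sample markup, and the inserted `<p>` element are all assumptions made up for this example. With a parser, "what comes right after the end tag" becomes "the next sibling", so no character offsets are needed.

```php
<?php
// Hypothetical helper (not part of any library): insert a <p> element
// directly after the element whose id attribute equals $id.
function insertAfterId(string $html, string $id, string $text): string
{
    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Look the element up by its id attribute.
    $xpath  = new DOMXPath($doc);
    $needle = $xpath->query(sprintf('//*[@id="%s"]', $id))->item(0);
    if ($needle === null) {
        return $html;                   // id not found: return the input unchanged
    }

    // "Right after the end tag" is simply "before the next sibling".
    // A null next sibling makes insertBefore() append to the parent.
    $marker = $doc->createElement('p', $text);
    $needle->parentNode->insertBefore($marker, $needle->nextSibling);

    // Note: saveHTML() serializes the whole document, including the
    // doctype, <html> and <body> wrappers that loadHTML() adds.
    return $doc->saveHTML();
}

// Sample input (an assumption): nested tables placed inside a cell.
$html = '<div><table id="needle"><tr><td>X</td></tr>'
      . '<tr><td><table><tr><td>inner</td></tr></table></td></tr>'
      . '</table><p>rest</p></div>';

echo insertAfterId($html, 'needle', 'inserted');
```

The nested tables sit inside a `<td>` here because the parser may rearrange a table placed directly inside another table, which would make the result harder to predict.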

EDIT: I removed the regex suggestion because of the comments. Probably the best overview is the one Gordon linked in his comment: How do you parse and process HTML/XML in PHP?
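
If the raw character offset really is what is needed, one regex-free possibility is to count nesting depth while scanning for opening and closing tags of the same name. The sketch below is only that, a sketch: findClosingTagEnd() is a hypothetical helper, and it assumes the markup contains no HTML comments, scripts, or attribute values that include text like `<table` or `>`, and no other tag whose name merely starts with "table". A parser remains the more robust choice.

```php
<?php
// Hypothetical helper: given the offset of the "<" that opens a tag,
// return the offset of the ">" that ends its matching closing tag,
// or null when the markup is unbalanced.
function findClosingTagEnd(string $html, int $openPos, string $tag = 'table'): ?int
{
    $open  = '<' . $tag;    // naive: would also match a longer name such as <tablet>
    $close = '</' . $tag;
    $depth = 0;
    $pos   = $openPos;

    while (true) {
        $nextOpen  = stripos($html, $open, $pos);
        $nextClose = stripos($html, $close, $pos);
        if ($nextClose === false) {
            return null;                        // ran out of closing tags
        }
        if ($nextOpen !== false && $nextOpen < $nextClose) {
            $depth++;                           // entering a (nested) tag
            $pos = $nextOpen + strlen($open);
        } else {
            $depth--;                           // leaving a tag
            if ($depth < 0) {
                return null;                    // closing tag with no opener
            }
            $end = strpos($html, '>', $nextClose);
            if ($end === false) {
                return null;                    // truncated closing tag
            }
            if ($depth === 0) {
                return $end;                    // matching close found
            }
            $pos = $end + 1;
        }
    }
}

// The snippet from the question, wrapped in some surrounding text.
$html  = 'aa<table id="needle"><tr><td>X</td></tr>'
       . '<table>...</table><table>...</table></table>bb';
$start = strpos($html, '<table id="needle"');
$end   = findClosingTagEnd($html, $start);

echo substr($html, $end + 1), "\n";             // prints "bb"
```

With $start and $end known, substr($html, $start, $end - $start + 1) yields the whole #needle table, inner tables included.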

Community
endo.anaconda