I have an HTML string in which some elements have ids that contain colons, such as `id="detailMainForm:PCTBiblio"`. How can I parse such elements?

Subhash
  • Are there no quotes around the id, like so: `id="detailMainForm:PCTBiblio"`? If not, you could just parse everything between `id=` and the next whitespace you encounter? – domsson Feb 22 '17 at 11:37
  • @domdom The id is the same as `id="detailMainForm:PCTBiblio"`. Sorry for the typo. – Subhash Feb 22 '17 at 11:42
  • So what problem does the colon pose? Just parse everything within the quotes, including the colon. – domsson Feb 22 '17 at 11:46
  • Subhash, what came of this? Did this, in fact, turn out to be a bug in the parser project you linked? Did you file a bug report or come up with a workaround? – domsson Jul 04 '17 at 11:04

1 Answer

I'll try to answer this based on what little information you supplied.

First, let me point out that the colon should be of no relevance to PHP. Parsing should be no different for ids with or without a colon. Also note that a colon is a valid character in the id attribute in both HTML4 and HTML5 (see this question).

Now, to actually parse the ids, you can either use regular expressions or PHP's string functions. In the latter case, you could search the source for `id="` and then read everything from there until the next `"`.
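
Both approaches could look roughly like the sketch below. The sample markup and the helper name `firstId` are made up for illustration; only the id itself comes from the question. Note that the plain string search is naive: it would also match attributes whose names merely end in `id`, such as `data-id="..."`, and neither variant handles single-quoted or unquoted attribute values.

```php
<?php
// Hypothetical sample markup; only the id value is taken from the question.
$html = '<div id="detailMainForm:PCTBiblio"><span id="plain">text</span></div>';

// Regular expression: capture everything between id=" and the next ".
// The colon needs no special treatment inside the character class.
preg_match_all('/\bid="([^"]*)"/', $html, $matches);
$idsFromRegex = $matches[1];

// String functions: find id=" and read up to the closing quote.
// firstId is a hypothetical helper, not a built-in function.
function firstId(string $source): ?string
{
    $needle = 'id="';
    $start = strpos($source, $needle);
    if ($start === false) {
        return null; // no id attribute found
    }
    $start += strlen($needle);
    $end = strpos($source, '"', $start);
    if ($end === false) {
        return null; // unterminated attribute value
    }
    return substr($source, $start, $end - $start);
}

$firstId = firstId($html);

print_r($idsFromRegex);
echo $firstId, PHP_EOL;
```

In this sketch the id with the colon is extracted exactly like the one without.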

Depending on what you're doing exactly, the DOMDocument class could also be useful to you.
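
As a minimal sketch of the DOMDocument route, assuming the `dom` extension is available and using made-up sample markup, the element can be looked up by its id through an XPath attribute test. The colon sits inside a quoted XPath string literal, so it is not interpreted as a namespace separator.

```php
<?php
// Hypothetical sample markup; only the id value is taken from the question.
$html = '<div id="detailMainForm:PCTBiblio"><span id="plain">text</span></div>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Select any element whose id attribute equals the colon-containing value.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//*[@id="detailMainForm:PCTBiblio"]');
$element = $nodes->item(0);

if ($element !== null) {
    echo $element->nodeName, ': ', $element->textContent, PHP_EOL;
}
```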

domsson
  • I was using https://github.com/paquettg/php-html-parser to parse my HTML string, but I could not get the elements whose ids contain a colon. So I first replaced the ids containing colons with ids without colons, and then found the elements. Thank you for your answers. – Subhash Feb 23 '17 at 07:35
  • @Subhash maybe you should file a bug report with the project then, as, from what I understand, an HTML parser should not fail because of colons within an id. – domsson Feb 23 '17 at 10:08