
I have a string that mixes plain text with HTML markup, and I would like to parse it and handle each part accordingly. The HTML markup will include references to record IDs that I can use later when compiling the text and mention segments for a post.

For the most part, I understand how to split out the individual segments, but I don't know how to get them back in the order in which they originally appeared.

An example string:
Hi <span contenteditable="false" data-mention="@005i0000003KteOAAS">First Name</span>

I can parse into 'Hi ' and '005i0000003KteOAAS' individually, but how do I get them back in the original order?

I'm using a regex like this currently:
<(?i).*?<\\/.*?>

Alix Ohrt
  • Can you clarify what you want? What do you mean get them back in the original order? – miken32 Dec 04 '15 at 22:33
  • It looks like you're trying to parse some kind of html/xml/something that uses a `` format. Regex is infamously awful for parsing this kind of thing; I suggest looking into parsing tools for whatever language you are using – R Nar Dec 04 '15 at 22:49
  • Your array of matches will come back in the order that they were found. I think perhaps you are asking how to tell which element is just a string and which is a piece of data that you want? If there is no comparison you can do that identifies data then you'll need to construct your regex so it returns "data-mention=....." and then you can key off of return values that start with "data-mention", or else you will have to use a regex in a loop and do each, one at a time – Scott Dec 04 '15 at 22:51
  • You could use a regex to split out the html tags. Some split functions allow capturing the delimiters if you put them into capture groups. If you can do this, you will know which pieces are tags. Then you can search the pieces that are tags for data, such as attribute values. –  Dec 05 '15 at 00:43
  • I'll just leave this here: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Barton Chittenden Dec 05 '15 at 00:51
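
The "regex in a loop" idea from the comments can be sketched roughly as follows. This is only an illustration in Python, since the question does not say which language is in use; the pattern `MENTION` and the helper `segments` are hypothetical names, and the pattern assumes mentions always appear as a non-nested `<span ... data-mention="@ID">...</span>`. Instead of splitting text and markup separately, it walks the matches in order and uses each match's start and end offsets to recover the plain text in between, so the original order is preserved.

```python
import re

# Hypothetical pattern: matches one whole <span ... data-mention="@ID">...</span>
# element and captures only the record ID. Assumes spans are not nested.
MENTION = re.compile(
    r'<span\b[^>]*\bdata-mention="@([^"]+)"[^>]*>.*?</span>',
    re.IGNORECASE | re.DOTALL,
)

def segments(body):
    """Return ('text', str) and ('mention', id) tuples in source order."""
    result = []
    pos = 0  # offset just past the previous match
    for m in MENTION.finditer(body):
        if m.start() > pos:
            # Plain text sitting between the previous match and this one.
            result.append(("text", body[pos:m.start()]))
        result.append(("mention", m.group(1)))
        pos = m.end()
    if pos < len(body):
        # Trailing plain text after the last mention.
        result.append(("text", body[pos:]))
    return result

example = ('Hi <span contenteditable="false" '
           'data-mention="@005i0000003KteOAAS">First Name</span>')
print(segments(example))
# [('text', 'Hi '), ('mention', '005i0000003KteOAAS')]
```

Each tuple carries its own type tag, so the later step that builds the post can tell text segments from mention segments without re-parsing. For markup more varied than this one span shape, a real HTML parser is likely to be more robust than any regex.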
