This question may have been asked before in a different form; if so, please point me to it. I just couldn't find it in my search results.
I would like to parse text for mark-ups, like those here on SO.
- e.g. `* some string` for a bullet list
- e.g. `*some string*` for italic text
- e.g. `&some string&` for a URL
- e.g. `&some string&specific url&` for a URL that differs from the displayed string
- etc.
I can think of two ways to go about processing a string to find out special mark-up sequences:
a. I could proceed in a mark-up-centric way, i.e. scan the whole string for the first sequence above, then again for the second, and so on. That seems inefficient, however, because the string has to be scanned once per kind of mark-up.
b. It seems better to process the string character by character and keep a memory of the special characters seen so far and their positions. Whenever the memory matches one of the special sequences above, the special characters are replaced by HTML in the string. I'm not sure that this is actually a better idea, though, nor do I know how one should implement it.
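To make option b more concrete, here is a rough sketch of what I picture. It is only an illustration under my own assumptions: it handles just the italic marker and the plain `&url&` form, it does no HTML escaping, and the names `RENDERERS` and `convert_inline` are made up for this example.

```python
# Hypothetical table: marker character -> function that renders the enclosed text.
RENDERERS = {
    "*": lambda inner: f"<em>{inner}</em>",
    "&": lambda inner: f'<a href="{inner}">{inner}</a>',
}


def convert_inline(text):
    """Single pass over `text`, remembering where unmatched markers were seen."""
    out = []       # output fragments built so far
    pending = {}   # marker character -> index in `out` of its unmatched opener
    for ch in text:
        if ch in RENDERERS and ch in pending:
            # Closing marker: replace everything since the opener with HTML.
            start = pending.pop(ch)
            inner = "".join(out[start + 1:])
            del out[start:]
            out.append(RENDERERS[ch](inner))
            # Forget markers that were opened inside the span just closed.
            pending = {m: i for m, i in pending.items() if i < start}
        elif ch in RENDERERS:
            # Opening marker: keep it as a literal until a match turns up.
            pending[ch] = len(out)
            out.append(ch)
        else:
            out.append(ch)
    return "".join(out)


print(convert_inline("some *emphasised* text"))  # some <em>emphasised</em> text
```

An unmatched marker, as in `2 * 3`, is simply left in the output as a literal character. What I don't see is how to extend this cleanly to the `&some string&specific url&` form or to bullet lists.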
What is the best way to go about this? What about regular expressions? Do they follow pattern a or pattern b? Is there a third option?
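For reference, my naive regex attempt would be one `re.sub` pass per kind of mark-up, which looks like pattern a to me. The patterns below are my own guesses at the syntax (for instance, I assume a URL contains no whitespace), and `RULES` and `convert_with_regex` are names I made up for this sketch.

```python
import re

# (pattern, replacement) pairs, applied in order. The two-part link has to run
# before the one-part link, otherwise the shorter pattern would match first.
RULES = [
    # &text&url&  ->  link whose target differs from the displayed text
    (re.compile(r"&([^&\n]+)&([^&\s]+)&"), r'<a href="\2">\1</a>'),
    # &url&  ->  link whose text is the URL itself
    (re.compile(r"&([^&\s]+)&"), r'<a href="\1">\1</a>'),
    # *text*  ->  italics; the text must not start or end with whitespace,
    # so a bullet marker ("* item") is not mistaken for an opening asterisk
    (re.compile(r"\*([^*\s](?:[^*\n]*[^*\s])?)\*"), r"<em>\1</em>"),
    # "* item" at the start of a line  ->  list item (no surrounding <ul> here)
    (re.compile(r"^\* (.+)$", re.MULTILINE), r"<li>\1</li>"),
]


def convert_with_regex(text):
    """Apply every rule in turn; each re.sub call scans the whole string again."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(convert_with_regex("*hi*"))  # <em>hi</em>
```

This works on my small examples, but the string is traversed once per rule, and the rules can interfere with one another depending on their order.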
P.S. I am using Python, so a Python example would be most appreciated.