8

This is not a duplicate of this question.

I want to define my own rules for a Markdown parser, like the one used here on Stack Overflow.

That means converting *italic* into <span style="font-style:italic">italic</span>.

I know there are a lot of parsers out there, but I don't understand them. The previously mentioned question doesn't give me much to go on: it just links to more parsers and doesn't explain how they work.

So I would like to know the basics, or the logic, of creating a whole Markdown parser. If you think explaining it to me is not a pleasant task, then don't. Thanks for understanding :)

undefined

2 Answers

8

A common way of doing this is to pass a regular expression to the string's replace method.

This is one way you could do this:

"*This is italic*"
      .replace(/\*(.*?)\*/gi, '<span style="font-style: italic">$1</span>');

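What is happening here is that you search for any sequence of characters between two asterisks and capture those characters, so that you can then put them between the HTML tags. The `?` makes the match lazy, so each pair of asterisks is matched separately instead of one match spanning from the first asterisk to the last.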
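To extend this to several rules, one option is to keep the patterns in a list and apply them in order. The following is only a minimal sketch of that idea, not a complete Markdown implementation: the names `rules` and `renderInline` are made up for illustration, and the bold rule is an assumption added to show why ordering matters.

```javascript
// Hypothetical rule table: each entry pairs a pattern with its HTML replacement.
// Order matters: bold (**) must run before italic (*), otherwise the italic
// rule would consume the inner asterisks of a bold span first.
const rules = [
  [/\*\*(.+?)\*\*/g, '<span style="font-weight: bold">$1</span>'],
  [/\*(.+?)\*/g, '<span style="font-style: italic">$1</span>'],
];

// Apply every rule in sequence to the input string.
function renderInline(text) {
  return rules.reduce((out, [pattern, html]) => out.replace(pattern, html), text);
}

console.log(renderInline("**bold** and *italic*"));
```

Each rule sees the output of the previous one, which is why this approach gets fragile as the rule set grows: a later pattern can accidentally match HTML produced by an earlier one.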

Afonso Matos
  • 1
    Thanks! But how would you go about doing the `[Google](http://www.google.com)` one for example? Is there a better alternative to RegExp's? :) – undefined Dec 30 '14 at 23:39
  • 1
    You should learn RegExp before this. `"[Google](http://google.com)".replace(/\[(.*?)\]\((.*?)\)/gi, '<a href="$2">$1</a>');` Be sure to vote the question and mark it if it helped you. – Afonso Matos Dec 31 '14 at 11:07
  • 1
    Thanks man, I'm a little skeptical about RegExps because some people told me I shouldn't parse with them, anyway.. – undefined Dec 31 '14 at 16:30
  • 1
    @Rou It's true that you should only use RegExp when really needed. I think this is the case. – Afonso Matos Dec 31 '14 at 21:44
  • Note that you will not be able to parse all of Markdown with regexes without making a mess of your code, like Gruber's original markdown implementation. You should learn about writing proper parsers first. – mb21 Jan 01 '15 at 11:26
  • 6
    Where can one learn about writing proper parsers? Came here looking for the answer to that question. – Captain Delano Oct 16 '20 at 16:46
3

Using regular expressions will work for writing a Markdown parser, given its very simple syntax. However, as others have mentioned, regex won't work for all possible flavors of Markdown, and it won't teach you some of the core concepts of how "real" code compilers work—if that's ultimately what you're after.

If you're interested in writing a more fully fledged Markdown compiler and learning some core concepts along the way, I recommend this three-part guide: Writing a Markdown Compiler. It uses Ruby for its code references, but I think the main concepts and code will be understandable to any intermediate programmer. What I like about it is that it breaks down the concepts of tokenizing and parsing, which are key concepts for any proper parser, not just a Markdown parser. The guide should give you a baseline for creating any code compiler. However, the learning curve only gets much steeper once you move beyond Markdown, so not for the faint of heart!
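To give a rough feel for the tokenizing and parsing split, here is a small sketch in JavaScript that handles only *italic* spans. It is not code from the guide: the function names (`tokenize`, `parse`, `render`) and the token and node shapes are assumptions chosen for illustration.

```javascript
// Stage 1, tokenizing: split the raw string into STAR and TEXT tokens.
function tokenize(input) {
  const tokens = [];
  let buffer = "";
  for (const ch of input) {
    if (ch === "*") {
      if (buffer) tokens.push({ type: "TEXT", value: buffer });
      buffer = "";
      tokens.push({ type: "STAR", value: "*" });
    } else {
      buffer += ch;
    }
  }
  if (buffer) tokens.push({ type: "TEXT", value: buffer });
  return tokens;
}

// Stage 2, parsing: group tokens into nodes.
// STAR TEXT STAR becomes an emphasis node; anything else stays plain text.
function parse(tokens) {
  const nodes = [];
  let i = 0;
  while (i < tokens.length) {
    const [a, b, c] = [tokens[i], tokens[i + 1], tokens[i + 2]];
    if (a.type === "STAR" && b && b.type === "TEXT" && c && c.type === "STAR") {
      nodes.push({ type: "emphasis", value: b.value });
      i += 3;
    } else {
      nodes.push({ type: "text", value: a.value });
      i += 1;
    }
  }
  return nodes;
}

// Stage 3, output: turn the node list into HTML.
function render(nodes) {
  return nodes
    .map((n) =>
      n.type === "emphasis"
        ? `<span style="font-style: italic">${n.value}</span>`
        : n.value
    )
    .join("");
}

console.log(render(parse(tokenize("This is *italic* text"))));
```

Because the parser only accepts the exact sequence STAR TEXT STAR, a stray asterisk such as the one in `2 * 3` falls through as plain text. Adding a new construct means adding a token type and a parse rule rather than reworking one large regular expression.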

Adam Stevenson