5

I'm writing a very basic Markdown to HTML converter in C#.

I managed to write regular expressions to convert bold and italic text, but I'm struggling to come up with a regex that transforms a Markdown link into an HTML link tag.

For example:

This is a [link](/url) 

should become

This is a <a href='/url'>link</a>

This is my code so far:

var bold = new Regex(@"(\*\*|__) (?=\S) (.+?[*_]*) (?<=\S) \1", // Regex for bold text
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline | RegexOptions.Compiled);
var italic = new Regex(@"(\*|_) (?=\S) (.+?) (?<=\S) \1", // Regex for italic text
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline | RegexOptions.Compiled);
var anchor = new Regex(@"??????????", // Regex for hyperlink text
        RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled);

content = bold.Replace(content, @"<b>$2</b>");
content = italic.Replace(content, @"<i>$2</i>");
content = anchor.Replace(content, @"<a href='$3'>$2</a>");

What kind of regular expression can accomplish this?

jwpfox
tocqueville

2 Answers

10

In Markdown, an inline link can take two forms:

[sample link](http://example.com/)
[sample link](http://example.com/ "with title")

The regex from Addison's answer works only with the first form, and only with URLs that start with /. For example, [link name](http://stackoverflow.com/questions/40177342/regex-convert-a-markdown-inline-link-into-an-html-link-with-c-sharp "link to this question") won't match.

Here is a regex that works with both forms:

\[([^]]*)\]\(([^\s^\)]*)[\s\)]

https://regex101.com/r/kZbw7g/1
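For the C# code in the question, note that the pattern above stops at the first whitespace after the URL, so the title and the closing parenthesis would be left in the output after a replace. A variant along the following lines consumes the optional title as well. This is only a sketch: the class name `MarkdownLinks`, the method `ConvertLinks`, and the group names are made up for illustration, and it does not handle nested brackets, escaped characters, or HTML encoding.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkdownLinks
{
    // Hypothetical helper for illustration; not part of the original answer.
    private static readonly Regex Anchor = new Regex(
        @"\[(?<text>[^\]]*)\]          # link text inside square brackets
          \(\s*(?<url>[^\s)]+)         # URL: runs up to whitespace or a closing parenthesis
          (?:\s+""(?<title>[^""]*)"")? # optional title in double quotes
          \s*\)",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled);

    public static string ConvertLinks(string content)
    {
        return Anchor.Replace(content, m =>
        {
            // Emit a title attribute only when the Markdown link carried a title.
            var title = m.Groups["title"].Success
                ? " title='" + m.Groups["title"].Value + "'"
                : "";
            return "<a href='" + m.Groups["url"].Value + "'" + title + ">"
                 + m.Groups["text"].Value + "</a>";
        });
    }
}
```

Calling `MarkdownLinks.ConvertLinks(content)` would take the place of the `anchor.Replace(...)` line in the question. A match evaluator is used instead of a replacement string because the `title` attribute should only appear when a title was present.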

Misiakw
  • Yes, correct. In my specific case, though, I don't care about supporting all the variants, I'll support just a small subset of the Markdown syntax. – tocqueville Oct 21 '16 at 13:49
  • OK, but after some years of experience I prefer to spend a bit more time on a complete solution rather than much more time later working out why the software crashes. And because I'm rather scatterbrained, I've made a habit of making things ideal. – Misiakw Oct 21 '16 at 13:56
5

Try replacing this

\[(.+)\]\((\/.+)\)

With this:

<a href='\2'>\1</a>

Example: https://regex101.com/r/ur35s8/2
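In C#, the replacement string refers to groups as `$1` and `$2` rather than `\1` and `\2`. Also, because `.+` is greedy, the pattern above treats two links on the same line as one match. A minimal sketch that keeps the same idea (only URLs starting with `/`) but uses negated character classes might look like this; the class name `SiteLinks` and the method `ConvertLinks` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SiteLinks
{
    // Group 1: link text (anything but ']'); group 2: URL starting with '/'.
    private static readonly Regex Anchor = new Regex(
        @"\[([^\]]+)\]\((/[^)\s]*)\)",
        RegexOptions.Compiled);

    public static string ConvertLinks(string content)
    {
        // .NET replacement strings use $1 and $2, not \1 and \2.
        return Anchor.Replace(content, "<a href='$2'>$1</a>");
    }
}
```

With this pattern, `[a](/x) and [b](/y)` is converted as two separate links instead of one.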

Addison
  • This solution won't work with markup like this: `[link name](http://stackoverflow.com/questions/40177342/regex-convert-a-markdown-inline-link-into-an-html-link-with-c-sharp "link to this question")` – Misiakw Oct 21 '16 at 13:45
  • This is true, but OP specified that he wanted `/url` to be parsed. Possibly he doesn't want links pointing away from his site? – Addison Oct 21 '16 at 15:20