3

I'm managing a blog where a select few people can submit their own articles and entries. I want them to be able to embed video via HTML (and to bold, italicize, etc. text as they choose). How do I do this while maintaining site security?

If I don't HTML-escape the article body itself, a single unclosed HTML comment will ruin my site. Is there a way to selectively escape only some combinations of characters?

Edit: hopefully without writing my own parser. I just want simple tags like <b>, <i>, etc. left unescaped, as well as video and link embedding.

hakre
gator
  • You don't say what language you're using, but chances are that you're not going to have to write your own parser. HTML parsing is a solved problem and packages abound to do it for you. – Andy Lester Jan 13 '13 at 15:26
  • I've tagged the question as PHP and HTML. – gator Jan 13 '13 at 15:28
  • Then start here: http://stackoverflow.com/questions/292926/robust-mature-html-parser-for-php – Andy Lester Jan 13 '13 at 15:54

3 Answers

3

I use what SO uses. It is open source and has parsers for many languages.

The name is WMD and the question "Where's the WMD editor open source project?" has some QA material outlining this editor.

The question "running showdown.js serverside to conver Markdown to HTML (in PHP)" has some QA material outlining some Markdown libraries in PHP.

Community
Itay Moav -Malimovka
  • 1
    Take care: To run this in a safe manner, you need server-side execution of the JavaScript code, e.g. via node.js, *or* you must have something compatible in PHP, like a Markdown library that works similarly to the JavaScript code of that extension. It might be useful to add this second part to the answer as well. – hakre Jan 13 '13 at 15:03
  • @hakre - Markdown is common enough that all languages have parsers for it, so finding one is fairly easy. I will add the link to the PHP one to the answer. – Itay Moav -Malimovka Jan 13 '13 at 18:46
0

The simplest approach, used by many sites (such as SO), is to introduce your own special markup, which is then translated into the features you want.

For example, SO uses single asterisks (*) for italics and double asterisks (**) for bold (Edit: alongside the HTML tags <b></b> themselves; see the source of this answer).

Other sites use [b] and [i] tags. You could have a [video=http://myvideo.com] tag, which your PHP then translates into the appropriate HTML element.
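
A minimal sketch of that translation step, for illustration only. The function name `render_markup`, the choice of a `<video>` element, and the restriction to http(s) URLs are all assumptions, not something a particular site is known to do. The idea is to escape the whole text first and only then expand a small whitelist of bracket tags, so no raw HTML typed by an author survives:

```php
<?php
// Hypothetical helper: escape everything, then expand a few whitelisted bracket tags.
function render_markup(string $text): string
{
    // 1. Escape all HTML metacharacters so author-supplied tags become inert text.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // 2. Expand the paired formatting tags (non-greedy, may span lines).
    $html = preg_replace('#\[b\](.*?)\[/b\]#s', '<b>$1</b>', $html);
    $html = preg_replace('#\[i\](.*?)\[/i\]#s', '<i>$1</i>', $html);

    // 3. Expand [video=URL]; only http(s) URLs without whitespace or brackets match.
    //    Quotes and angle brackets were already turned into entities in step 1.
    $html = preg_replace(
        '#\[video=(https?://[^\s\[\]]+)\]#i',
        '<video controls src="$1"></video>',
        $html
    );

    return $html;
}

echo render_markup('[b]Hello[/b] <script>alert(1)</script> [video=http://myvideo.com]');
```

Anything that does not match one of the three patterns, such as `[video=javascript:alert(1)]`, is left as escaped plain text. A real embed would probably need a stricter URL check (for example, a list of allowed hosts).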

Chris Hayes
  • 1
    SO has no special markup, it *allows* HTML tags (as per the Markdown specification btw.) So you are not answering any of the posed questions, instead I suggest for learning more, continue here: http://www.codinghorror.com/blog/2008/10/programming-is-hard-lets-go-shopping.html – hakre Jan 13 '13 at 14:49
  • 1
    It absolutely has special markup. The fact that HTML is supported alongside it does not negate this. – Chris Hayes Jan 13 '13 at 15:01
  • Well, it does not negate this; however, it means your answer isn't really one. You sidestepped the problem by pointing to special markup which, in the given example, allows the very HTML tags the author *is* concerned about in the question. – hakre Jan 13 '13 at 15:33
  • I don't understand your point. The fact that markup exists which does allow these tags doesn't mean that the author has to allow them as well. Regardless, he updated his question saying that he doesn't want to write a parser, so it's moot now. – Chris Hayes Jan 13 '13 at 16:29
  • Consider your stackoverflow example. Now imagine there is a user making use of stackoverflow only entering the special HTML markup. – hakre Jan 13 '13 at 16:41
0

You can safely HTML-escape everything. URLs for your videos will be unaffected by whatever escaping you do.
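
As a rough illustration, assuming PHP's built-in `htmlspecialchars` is the escaping function in question and using a made-up article body:

```php
<?php
// Hypothetical article body containing a tag and a URL with a query string.
$body = 'Watch <b>this</b>: http://myvideo.com/watch?v=1&t=2';

// Escape every HTML metacharacter, including both kinds of quotes.
$escaped = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n";
// Prints: Watch &lt;b&gt;this&lt;/b&gt;: http://myvideo.com/watch?v=1&amp;t=2

// Decoding the entities (as a browser does when rendering) restores the original text.
$roundTrip = html_entity_decode($escaped, ENT_QUOTES, 'UTF-8');
```

The only change inside the URL is `&` becoming `&amp;`, which a browser decodes back when it renders the page. Note that the `<b>` tag is shown literally rather than applied, so escaping alone does not give authors any formatting.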

Sanchit