I'm receiving data in Django from the Quill editor, formatted as HTML.

Is it possible to encode/clean the data when I save it to the database, and convert it back to HTML when I retrieve it? If yes, how?

Also, I use only paragraphs, lists and <br> (these are passed by the editor), but I want to check that the user hasn't added anything else to the code.

For example:

I get from the editor:

<li>fdsafdsafdsa</li><li>fdsafdafsdafds</li>

In the database I want to save it as (currently I save it as HTML):

&lt;li&gt;fdsafdsa&lt;/li&gt;&lt;li&gt;fdsafdsa&lt;/li

When I send it back to the page, I want to serve:

<li>fdsafdsafdsa</li><li>fdsafdafsdafds</li>
user3541631

2 Answers

You could save the HTML in your database in a text field.

from django.db import models

class UserGeneratedHtml(models.Model):
    html = models.TextField()

Then, before saving this data, make sure that it actually is HTML. You could do this using an HTML parser like BeautifulSoup:

from bs4 import BeautifulSoup

html = """<html>
<head><title>I'm title</title></head>
</html>"""
non_html = "This is not an html"

# find() returns the first tag, or None if the string contains no tags
bool(BeautifulSoup(html, "html.parser").find())      # True
bool(BeautifulSoup(non_html, "html.parser").find())  # False

This code snippet checks whether there is any HTML element inside the string. (Related answer to the snippet above.)

Of course, saving and serving user-generated HTML is always tricky and potentially dangerous, so you should always make sure that it does not contain anything harmful. You could use BeautifulSoup to parse the submitted HTML and reject it if it contains anything other than paragraphs and lists.
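As a rough illustration of that whitelist check, here is a minimal sketch. It uses Python's built-in html.parser module instead of BeautifulSoup so that it runs without extra packages; the idea is the same. The names ALLOWED_TAGS, TagCollector and find_disallowed_tags are hypothetical, and the tag list is an assumption based on the question. It only looks at tag names, so attributes would need a separate check.

```python
from html.parser import HTMLParser

# Assumed whitelist, based on the tags the question says the editor emits.
ALLOWED_TAGS = {"p", "ul", "ol", "li", "br"}


class TagCollector(HTMLParser):
    """Records the name of every start tag encountered (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        # Attributes (attrs) are ignored in this sketch.
        self.tags.add(tag)


def find_disallowed_tags(html_text):
    """Return the set of tag names in html_text that are not whitelisted."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tags - ALLOWED_TAGS
```

An empty result means the input uses only whitelisted tags; anything else could be rejected before saving.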

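If you want to render the user-generated HTML in the template, you could simply render it like this. Note that the safe filter turns off Django's auto-escaping for that value, so use it only on HTML you have already checked: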

{{ html|safe }}
matyas
  • This approach is solid, but sadly even with "|safe" the user could write some XML requests in JavaScript and cause trouble on your page. `

    lalalallala

    ` could also be injected, or an endless for loop could be started, which would freeze the browser tab...
    – hansTheFranz Feb 01 '19 at 11:34
  • I updated my question with an example. I need to show HTML to the user, but I don't want HTML characters in the database, and I also don't want to allow non-compliant or disallowed tags – user3541631 Feb 01 '19 at 12:01

I finally decided to use the bleach package from Mozilla, like this:

import bleach

value = bleach.clean(value, tags=['p', 'ul', 'ol', 'li', 'br'])
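
Note that bleach.clean only sanitizes the markup; the value is still stored as HTML. If the entity-escaped form from the question is wanted in the database, one possible sketch uses the standard library's html.escape and html.unescape. The function names encode_for_storage and decode_for_display are hypothetical, not part of Django or bleach.

```python
import html


def encode_for_storage(markup):
    """Escape <, >, & and quotes so no raw HTML characters are stored."""
    return html.escape(markup, quote=True)


def decode_for_display(stored):
    """Turn the stored entities back into the original markup."""
    return html.unescape(stored)
```

The decoded value should still be sanitized (for example with bleach, as above) before it is rendered, since unescaping restores whatever markup was submitted.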
user3541631