5

The TipTap editor and its progeny quasar-tiptap can export user created content in the browser in both HTML and JSON formats. If I plan on allowing round trips of the user data to my server what are the pros and cons of using either format for storage.

I would assume HTML has a greater likelihood of XSS attacks, and indeed such a vulnerability has been found (and rectified) in the past. And using JSON would be easier for backend parsing should it ever be required.

Beyond this, are there any major benefits of using either format? Preserving fidelity of user input is important. Size is important (any differences in image storage?). Editor performance is important. Scripting attack vulnerability is extremely important.

Which to choose?

Jay Borseth
  • 1,894
  • 20
  • 29

1 Answers1

2

I am sure at 12 months later you have made your choice. The usual caveats that this question is not really answerable and can only really elicit opinions. However, the best I can offer (and I am definately no expert):

  1. User entered text is a risk (cross-site scripting, SQL injection attack, potentially creating a false alert dialog with disinformation for the user reading another user's content) regardless of how the rich text data is sent to the server and/or stored. This is because eventually the content will be a) very likely stored in a database b) sent to and interpreted by a browser in HTML format. Whether it goes through an intermediary JSON format is largely irrelevant. Stripping out all unwanted tags, attributes and SQL commands is going to be an extremely important server responsibility (known as sanitising), whatever the format. Many more well tested sanitising libraries exist for HTML in many more languages/server side technologies than for the relatively niche ProseMirror JSON format used by TipTap.
  2. The TipTap documentation talks about the advantages of the 2 formats here
  3. The ProseMirror JSON format is reasonably verbose, and inspecting both the JSON and HTML side by side, the JSON format seems to require more characters. Obviously it will depend on the rich text itself - how many formatting marks etc, but I would not worry about this in terms of server storage. Given you are likely using Vue or React with TipTap, it is likely you can set up an editor and have 2 seperate divs displaying the output side by side. You can see the 2 formats for the same text by scrolling to the bottom of this page
  4. Editor performance - if you store the data as HTML you can pass it straight to the client. If you store it as JSON you will have to parse it and create the HTML. If you do this client side, the client will have to download and execute a library to perform this task - on NPM this is currently 14kB, so not a big issue.
  5. Both formats will almost certainly send image data to the server as Base64 strings, so no bandwidth will be saved with either format for image files. Base64 is less efficient for DB storage as compared to saving as binary objects, particularly if compression is used, so you could strip these out at the server, but this will have a cost in time spent developing and testing the backend which may well be better spent on other things
Brent
  • 4,611
  • 4
  • 38
  • 55