In Python 3.6+, is there a succinct way to encode output for JavaScript contexts?
This means, if I start with any unsanitized input string, encode it properly, then replace VALUE
below with it, all XSS attacks in the webpage will be prevented. The input won't be able to break out of the JavaScript string, nor the HTML.
<!DOCTYPE html>
<html>
<head>
<script>
var a = 'VALUE';
</script>
</head>
</html>
The link I provided above is the official OWASP cheatsheet for XSS prevention, which states that all non-alphanumeric characters must be hex escaped. They provide a Java implementation in the article, but I have not been able to find a Python implementation except this one which has not been updated since 2010. So I wrote my own:
import curses.ascii
def as_js_in_html(value):
result = ''
for char in value:
if curses.ascii.isalnum(char):
result += char
else:
char_hex = format(ord(char), 'x')
if len(char_hex) <= 2:
result += '\\x' + char_hex.rjust(2, '0')
elif len(char_hex) <= 4:
result += '\\u' + char_hex.rjust(4, '0')
else:
result += '\\U' + char_hex.rjust(8, '0')
return result
Is there a better way?