I am trying to secure, as best as possible, a comment form in a non-CMS environment with no user authentication.
The form should be secure against both browser and curl/postman type requests.
Environment
Backend - Node.js, MongoDB Atlas and Azure web app.
Frontend - jQuery.
Below is a detailed, but hopefully not too overwhelming, overview of my current working implementation.
Following that are my questions about the implementation.
Related Libraries Used
Helmet - helps secure Express apps by setting various HTTP headers, including Content Security Policy
reCaptcha v3 - protects against spam and other types of automated abuse
DOMPurify - an XSS sanitizer
validator.js - a library of string validators and sanitizers
he - an HTML entity encoder/decoder
The general flow of data is:
/*
on click event:
- get sanitized data
- perform some validations
- html encode the values
- get recaptcha v3 token from google
- send all data, including token, to server
- send token to google to verify
- if the response 'score' is 0.5 or above, add the submission to the database
- return the entry to the client and populate the DOM with the submission
*/
POST request - browser
// test input:
// <script>alert("hi!")</script><h1>hello there!</h1> <a href="">link</a>
// sanitize the input
var sanitized_input_1_text = DOMPurify.sanitize($input_1.val().trim(), { SAFE_FOR_JQUERY: true });
var sanitized_input_2_text = DOMPurify.sanitize($input_2.val().trim(), { SAFE_FOR_JQUERY: true });
// validation - make sure input is between 1 and 140 characters
var input_1_text_valid_length = validator.isLength(sanitized_input_1_text, { min: 1, max: 140 });
var input_2_text_valid_length = validator.isLength(sanitized_input_2_text, { min: 1, max: 140 });
// if validations pass
if (input_1_text_valid_length === true && input_2_text_valid_length === true) {
/*
encode the sanitized input
not sure if I should encode BEFORE adding to MongoDB
or just add to database "as is" and encode BEFORE displaying in the DOM with $("#output").html(html_content);
*/
var sanitized_encoded_input_1_text = he.encode(sanitized_input_1_text);
var sanitized_encoded_input_2_text = he.encode(sanitized_input_2_text);
// define parameters to send to database
var parameters = {};
parameters.input_1_text = sanitized_encoded_input_1_text;
parameters.input_2_text = sanitized_encoded_input_2_text;
// get token from google and send token and input to my api
// see: https://developers.google.com/recaptcha/docs/v3#programmatically_invoke_the_challenge
grecaptcha.ready(function() {
grecaptcha.execute('site-key-here', { action: 'submit' }).then(function(token) {
parameters.token = token;
jquery_ajax_call_to_my_api(parameters);
});
});
}
POST request - server
var secret_key = process.env.RECAPTCHA_SECRET_SITE_KEY;
var token = req.body.token;
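// verify recaptcha token with google
// (the siteverify docs specify a POST request; sending the values in the body also avoids unencoded query parameters)
var response = await fetch('https://www.google.com/recaptcha/api/siteverify', {
  method: 'POST',
  body: new URLSearchParams({ secret: secret_key, response: token })
});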
var response_json = await response.json();
var score = response_json.score;
var document = {};
/*
if google's response 'score' is 0.5 or above,
add submission to the database and populate client DOM with $("#output").prepend(html);
see: https://developers.google.com/recaptcha/docs/v3#interpreting_the_score
*/
if (score >= 0.5) {
// add submission to database
// return submission to client to update the DOM
// DOM will just display this text: <h1>hello there!</h1> <a href="">link</a>
}
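A minimal sketch of a stricter check on the verification response, assuming the reCAPTCHA v3 response shape documented by Google (`success`, `score`, `action`). The helper name `isAcceptableVerification`, the expected action `"submit"` and the 0.5 threshold are illustrative choices, not part of any library.

```javascript
// Hypothetical helper: decide whether a siteverify response should be trusted.
// It checks `success` and `action` in addition to `score`.
function isAcceptableVerification(result, expectedAction = "submit", minScore = 0.5) {
  if (!result || result.success !== true) return false; // missing, invalid or expired token
  if (result.action !== expectedAction) return false;   // token was issued for a different action
  return typeof result.score === "number" && result.score >= minScore;
}

// A good token for the expected action passes.
console.log(isAcceptableVerification({ success: true, action: "submit", score: 0.9 })); // true
// A request without a valid token (e.g. from curl) fails verification.
console.log(isAcceptableVerification({ success: false, "error-codes": ["missing-input-response"] })); // false
```

In the handler above, this would replace the bare `score >= 0.5` comparison, so a failed verification is rejected explicitly rather than through an `undefined` score.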
GET request on page load
Logic/Assumptions:
- Get all submissions, return them to the client and add them to the DOM with $("#output").html(html_content);
- Don't need to encode values before populating the DOM because the values are already encoded in the database?
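For comparison, a minimal sketch of the alternative: values are stored unencoded and escaped only at the moment they are turned into markup. The `escapeHtml` function is a hand-rolled stand-in for an encoder such as `he`, and `renderSubmission` is a hypothetical helper; neither is taken from the implementation above.

```javascript
// Hand-rolled stand-in for an HTML encoder, for illustration only.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#x27;" };

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Hypothetical renderer: builds the markup for one stored (raw) submission.
function renderSubmission(doc) {
  return `<li>${escapeHtml(doc.input_1_text)}: ${escapeHtml(doc.input_2_text)}</li>`;
}

const html = renderSubmission({ input_1_text: "<h1>hello</h1>", input_2_text: 'a "quote" & more' });
console.log(html);
// <li>&lt;h1&gt;hello&lt;/h1&gt;: a &quot;quote&quot; &amp; more</li>
```

The resulting string could then be passed to `$("#output").html(...)`; building the element with jQuery's `.text()` would achieve a similar effect without a separate encoder.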
POST request from curl, postman etc
Logic/Assumptions:
- They don't have a Google token, so the server can't verify one, and therefore they can't add entries to the database?
Helmet configuration on server
app.use(
helmet({
contentSecurityPolicy: {
directives: {
defaultSrc: ["'self'"],
scriptSrc: ["'self'", "https://somedomain.io", "https://maps.googleapis.com", "https://www.google.com", "https://www.gstatic.com"],
styleSrc: ["'self'", "fonts.googleapis.com", "'unsafe-inline'"],
fontSrc: ["'self'", "fonts.gstatic.com"],
imgSrc: ["'self'", "https://maps.gstatic.com", "https://maps.googleapis.com", "data:"],
frameSrc: ["'self'", "https://www.google.com"]
}
},
})
);
Questions
Should I add values to the MongoDB database as HTML encoded entities OR store them "as is" and just encode them before populating the DOM with them?
If the values were to be saved as HTML entities in MongoDB, would this make searching the database for content difficult? For example, searching for
<h1>hello there!</h1> <a href="">link</a>
wouldn't return any results because the value in the database was
&lt;h1&gt;hello there!&lt;/h1&gt; &lt;a href=&quot;&quot;&gt;link&lt;/a&gt;
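The mismatch can be shown with plain strings. This is only an illustration: the encoded value below uses named entities as an assumption, and the exact form depends on the encoder and its options.

```javascript
// What a user would type into a search box, and what would be stored
// if the same text had been entity-encoded before saving.
const rawQuery = '<h1>hello there!</h1> <a href="">link</a>';
const storedEncoded = "&lt;h1&gt;hello there!&lt;/h1&gt; &lt;a href=&quot;&quot;&gt;link&lt;/a&gt;";

// An equality or substring match on the raw query finds nothing...
console.log(storedEncoded === rawQuery);         // false
console.log(storedEncoded.includes(rawQuery));   // false
// ...while text without special characters still matches.
console.log(storedEncoded.includes("hello there!")); // true
```

A search would only line up if the query were run through the same encoder, with the same options, before being compared.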
In my reading about securing web forms, much has been said about client-side practices being fairly redundant, as anything can be changed in the DOM, JavaScript can be disabled, and requests can be made directly to the API endpoint using curl or postman, bypassing any client-side approach.
With that said should sanitization (DOMPurify), validation (validator.js) and encoding (he) be performed either: 1) client side only 2) client side and server side or 3) server side only?
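To make the server-side option concrete, here is a minimal sketch of a length check that does not rely on anything the client did. The function name `validateCommentField` and its return shape are hypothetical; on a real server, `validator.isLength` could do the length part.

```javascript
// Hypothetical server-side check for one form field.
function validateCommentField(value, { min = 1, max = 140 } = {}) {
  // Reject non-strings: a JSON body can carry objects such as { "$gt": "" }.
  if (typeof value !== "string") return { ok: false, reason: "not a string" };
  const trimmed = value.trim();
  if (trimmed.length < min || trimmed.length > max) return { ok: false, reason: "bad length" };
  return { ok: true, value: trimmed };
}

console.log(validateCommentField("  hello there!  ")); // { ok: true, value: 'hello there!' }
console.log(validateCommentField({ $gt: "" }).ok);     // false
console.log(validateCommentField("x".repeat(141)).ok); // false
```

The type check matters because a direct request can send any JSON structure in place of the expected string.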
For thoroughness, here is another related question:
Do any of the following components do any automatic escaping or HTML encoding when sending data from client to server? I ask because if they do, it may make some manual escaping or encoding unnecessary.
- jQuery ajax() requests
- Node.js
- Express
- Helmet
- bodyParser (node package)
- MongoDB native driver
- MongoDB
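As a rough way to reason about the transport layers, the sketch below round-trips a value through JSON and through form encoding using only built-in JavaScript APIs. It assumes, without confirming it for every component listed, that a JSON body is serialized and parsed in the standard way and that a form-encoded body is percent-encoded and decoded again; it says nothing about what the database driver does.

```javascript
const payload = { input_1_text: '<script>alert("hi!")</script>' };

// JSON transport: quotes are backslash-escaped for JSON, but nothing is HTML-encoded.
const wire = JSON.stringify(payload);
const received = JSON.parse(wire);
console.log(wire); // {"input_1_text":"<script>alert(\"hi!\")</script>"}
console.log(received.input_1_text === payload.input_1_text); // true

// Form transport: characters are percent-encoded in transit, then decoded back.
const form = new URLSearchParams({ input_1_text: payload.input_1_text }).toString();
const decoded = new URLSearchParams(form).get("input_1_text");
console.log(decoded === payload.input_1_text); // true
```

In both cases the value that arrives is the same string that was sent, so these encodings are about transport and would not replace HTML encoding at output time.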