
Inside a Rails Service Object I have this:

test = %x[node lib/test.js "LOOK AT ME!"]
puts test

In a JavaScript file, lib/test.js, I have this:

var argument = process.argv[2];
console.log(argument);

This returns LOOK AT ME! in the Rails server log as you'd expect.

If I run puts test.downcase it returns look at me! in the Rails server log as you'd expect.

Everything seems to be working until I try processing an HTML document.

In the Service Object I have this:

request = HTTParty.get("https://example.com/")
document = Nokogiri::HTML(request.body)

test = %x[node lib/test.js "#{document}"]
puts test

This is throwing an error in the Rails server log:

sh: 14: amp: not found
sh: 6: initial-scale=1.0>
<meta name=generator content=Jekyll: not found
sh: 30: row>
<img src=https://example.com/images/example.jpg alt=Example: not found
<!DOCTYPE html>
<html lang=en>
<head>
<meta http-equiv=Content-Type content=text/html

It looks like sh: 14: amp is referencing a & in the Google fonts call. Perhaps I need to do some kind of escaping?

How can I pass document to that Javascript file and have it return without error?

Update:

If I call the file using fs.readFile, as shown below, I get the expected return in the Rails server log.

var fs = require('fs');

fs.readFile('/path/to/document.html', 'utf8', function(err, data) {
  if (err) throw err;
  console.log(data);
});
Brad West

1 Answer


The %x[ ... ] syntax executes the provided string as a shell command. Here, interpolating an entire HTML document into the argument is equivalent to copy-pasting a bunch of HTML code straight into your terminal: unless properly escaped, the very first " character will terminate the argument and cause the rest of the HTML code to be interpreted as a shell command.
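
To see the failure in isolation, here is a minimal sketch that builds a command the same way the question does. The helper name `run_unescaped` is hypothetical, and `printf` stands in for `node lib/test.js` so the snippet runs without Node; it assumes a POSIX `sh` is available.

```ruby
require "open3"

# Hypothetical helper: interpolates the HTML inside double quotes, exactly
# like %x[node lib/test.js "#{document}"], and hands the string to sh.
# printf is only a stand-in for the Node script.
def run_unescaped(html)
  command = %(printf '%s\\n' "#{html}")
  # A single command string containing shell metacharacters is run via sh -c.
  Open3.capture3(command)
end

html = %q(<meta name="viewport" content="width=device-width; initial-scale=1.0">)
stdout, stderr, status = run_unescaped(html)

puts "stdout: #{stdout}"
puts "stderr: #{stderr}"
puts "exit status: #{status.exitstatus}"
```

The quotes inside the HTML close and reopen the shell's quoting, so the `;` ends the first command and the remainder (`initial-scale=1.0>`) is run as a command of its own, which is the same kind of "not found" message that appears in the question's log.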

A naive approach to fixing this would be escaping all " characters – but if you do that you'll quickly find out that there are a whole bunch of other characters that trigger special behavior in sh (&, ;, $, backticks and so on). Trying to escape them all is an uphill battle, and unless you do a perfect job you may also end up with a remote code execution vulnerability if any part of your HTML comes from an untrusted source.

There are a few possibilities to sidestep this issue:

1.   Use a temporary file

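require "tempfile"
require "shellwords"

test = Tempfile.create do |f|
  f.write document.to_html  # serialize the Nokogiri document to a string
  f.close                   # flush to disk before node reads the file
  # Only the file path reaches the shell; escape it in case it contains spaces
  %x[node lib/test.js #{Shellwords.escape(f.path)}]
end

In lib/test.js:
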
const fs = require('fs');

fs.readFile(process.argv[2], 'utf8', function(err, data) {
  if (err) throw err;
  console.log(data);
});

2.   Use the standard input

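require "open3"

# Passing the command as separate arguments bypasses the shell entirely;
# the HTML travels over stdin and is never parsed by sh.
output, status = Open3.capture2("node", "lib/test.js", stdin_data: document.to_html)

In lib/test.js:
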
const fs = require('fs');

fs.readFile(0, 'utf8', function(err, data) {
  if (err) throw err;
  console.log(data);
});
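
A slightly fuller sketch of the standard-input approach, with the exit status checked so a failing script does not go unnoticed. The helper name `pipe_through` is hypothetical, and `cat` is used only as a stand-in for `["node", "lib/test.js"]` so the example runs without Node.

```ruby
require "open3"

# Hypothetical helper: runs a command given as an array (so no shell is
# involved), feeds `input` to its stdin and returns its stdout.
def pipe_through(command, input)
  stdout, stderr, status = Open3.capture3(*command, stdin_data: input)
  unless status.success?
    raise "#{command.first} exited with #{status.exitstatus}: #{stderr}"
  end
  stdout
end

# Characters that would be dangerous inside %x[...] are harmless on stdin.
html = %q(<a href="x">Tom & Jerry; $(date) `id`</a>)
puts pipe_through(["cat"], html)
```

In the Rails service object this would presumably be called as `pipe_through(["node", "lib/test.js"], document.to_html)`, assuming the script reads from file descriptor 0 as in the JavaScript above.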

I have not tested these code snippets so they may require slight modifications.

Mate Solymosi
  • Thank you for this Máté. Those errors make perfect sense now. And thank you for the code options, these are very helpful. – Brad West Feb 17 '21 at 13:44