We have several PHP files that generate JavaScript files. I know this could have been done differently, but that decision was made at the start of the project.

The files look something like this:

<?php Header("content-type: application/x-javascript");
if(strlen($_GET['country']) != 2){ exit;} //avoids code injection 
include_once($_SERVER['DOCUMENT_ROOT'].'/countries/'.$_GET['country'].'.php');?>

/*Define GLOBAL Javascript variables*/
var COUNTRY = "<?php echo $GLOBALS["country"]; ?>";
/*Language code according to ISO_639-1 codes*/
var LANGUAGE = "<?php echo $lang_CT[$GLOBALS["country"]]; ?>";
...

What is the best way to minify that code a priori, i.e., not when the file is requested or echoed, but ahead of time on the server, using JavaScript minification rules?

Edit: thinking about it further, this may be hard to achieve. Consider this case, where the source is not valid JS on its own:

var str = <?php echo "'a string';"; ?>

which outputs valid JS code:

var str = 'a string';

Basically, though, I was wondering whether there is a basic minifying option that only removes double spaces, comments and line breaks, and therefore would not affect the generated JS code.

João Pimentel Ferreira
  • Is there a reason you want to minify the php code? Since it will be run on the server anyways, you don't need to save on data transfer (the main reason for minified code) – daniel Aug 13 '17 at 00:06
  • How does `if(strlen($_GET['country']) != 2){ exit;}` avoid code injection? – BeetleJuice Aug 13 '17 at 00:14
  • @DanLynch I believe OP wants to minify the Javascript that the php is generating. –  Aug 13 '17 at 00:14
  • @BeetleJuice ... ever tried to put some invalid pathname in two characters addressing a completely different file? – Thomas Urban Aug 13 '17 at 00:15
  • @terminus Opposing this: why referencing minify then? – Thomas Urban Aug 13 '17 at 00:15
  • Do you want to cache the minified content? Do you want to minify the content immediately before echoing it? –  Aug 13 '17 at 00:39
  • @BeetleJuice, in our case, the variable input from `$_GET` _must_ always have exactly two characters. One of the means to do code injection [in PHP](https://www.owasp.org/index.php/Code_Injection) is through `$_GET`. – João Pimentel Ferreira Aug 13 '17 at 10:32
  • @Terminus, no, because that makes page loading slower due to processing time, we want to minify the php file on the server, removing unnecessary Javascript information. – João Pimentel Ferreira Aug 13 '17 at 10:35
  • @cepharum, because minify minifies the file _directly_ on the server (creating another .min.js file) and not when the file is opened, called or echoed. – João Pimentel Ferreira Aug 13 '17 at 10:56
  • @JoãoPimentelFerreira I keep stumbling over your intention to minify _a priori_: what do you mean by that? If you don't want to minify on running the script due to some request you might use some server-side CLI to e.g. run this as part of an installation script. Still this would create files on server which is perfect in case of minify as it's creating files. If doing it in an install script isn't flexible enough you might want to use some server-side caching which is minifying the file unless it was minified in a previous request. This requires writing Javascript code to file again. – Thomas Urban Aug 13 '17 at 11:09
  • @cepharum, _a priori_ means the php file is minimized as much as possible _before_ it is echoed, to remove unnecessary data, such as double spaces and indentations. Yes, we have a deploying unix script, and as those .js.php files would be deployed to production, they would be 'minified' right away, before they would be echoed. That would improve server performance on echo. – João Pimentel Ferreira Aug 13 '17 at 18:28
  • @JoãoPimentelFerreira So I hope I'm getting you right by now: you want the PHP-scripts minified at least when it comes to some contained javascript code which is all but the embedded parts of PHP code ... I don't know any tool ready to use in this case but I'm not that much used to currently available tools. However, I'd stick with RegExp nonetheless, first one to separate PHP PIs from any outside code, next using set of RegExps provided below to minimize every non-PHP-PI sequence by e.g. reducing space and dropping comments as good as possible. – Thomas Urban Aug 13 '17 at 22:18

2 Answers

AFAIK PHP has no integrated feature for this, since the code you want to minify is exactly the part PHP ignores because it lies outside the processing instructions <?php and ?>. So you will need to process your PHP files with another tool. If that tool is to be written in PHP too, you might give the following code fragment a try:

// -1 means "no limit"; passing null for the limit is deprecated since PHP 8.1
$sequence = preg_split( '/(<\?php|<\?=|\?>)/', $phpFileCode, -1, PREG_SPLIT_DELIM_CAPTURE );

$isPHP = false;
foreach ( $sequence as &$segment ) {
  switch ( $segment ) {
    case '<?php' :
    case '<?=' :
      $isPHP = true;
      break;
    case '?>' :
      $isPHP = false;
      break;
    default :
      if ( !$isPHP ) {
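        // Patterns run in order: comments are dropped first, then whitespace is collapsed.
        // They do not recognize string literals, so "//" inside a string (e.g. a URL) would be cut off.
        $segment = preg_replace( array(
          '#\s*/\*.*?\*/\s*#s', // matching multi-line comments (/s lets "." span line breaks)
          '#\s*//.*$#m',       // matching single-line comments
          '#\s+#',             // matching arbitrary sequences of whitespace incl. multiple blank lines
        ), array( 
          ' ',
          ' ',
          ' ',
        ), $segment );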
      }
  }
}
unset( $segment ); // break the reference left behind by the by-reference foreach

$phpCodeMinified = implode( '', $sequence );

Note: This code is just a scribble ... I haven't tested it, but I think it's pointing towards some quite simple approach.
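
To run this ahead of time rather than per request, the fragment could be wrapped in a function and called once from a deploy script. The following is a minimal sketch under that assumption: the function name minifyOutsidePhp and the sample input are hypothetical, and the regexes still know nothing about JS string literals.

```php
<?php
// Hypothetical helper: minify everything that lies outside the PHP tags.
function minifyOutsidePhp(string $phpFileCode): string
{
    // Split into PHP tags and the text between them; the tags are kept as items.
    $sequence = preg_split('/(<\?php|<\?=|\?>)/', $phpFileCode, -1, PREG_SPLIT_DELIM_CAPTURE);
    $isPHP = false;
    foreach ($sequence as $i => $segment) {
        if ($segment === '<?php' || $segment === '<?=') {
            $isPHP = true;
        } elseif ($segment === '?>') {
            $isPHP = false;
        } elseif (!$isPHP) {
            // Comments first, then whitespace; each match becomes one space.
            $sequence[$i] = preg_replace(
                array('#\s*/\*.*?\*/\s*#s', '#\s*//.*$#m', '#\s+#'),
                ' ',
                $segment
            );
        }
    }
    return implode('', $sequence);
}

// Sample input modelled on the file in the question.
$source = <<<'SRC'
<?php header("content-type: application/x-javascript"); ?>

/*Define GLOBAL Javascript variables*/
var COUNTRY = "<?php echo $GLOBALS["country"]; ?>";
SRC;

$minified = minifyOutsidePhp($source);
echo $minified, "\n";
```

In a deploy script the same function would be fed with file_get_contents() of each .js.php file and the result written back with file_put_contents(), so nothing extra runs when the file is requested.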

A more elaborate approach might

  1. use this code to replace every PHP processing instruction with a special global variable name in JavaScript, rather than detecting the segments that do not belong to PHP. E.g. use names like window.__PHPreplacedXXXX, where XXXX is the numeric index into an array onto which the whole replaced PHP segment has been pushed for recovery in step 3 below. Because the names are global, they won't be mangled in the next step.

  2. mangle the resulting code using a JS minifier such as uglify or the minify tool referenced before.

  3. process the resulting minified JS file and revert the earlier replacement, putting the PHP sequences back in place of the global names.

This might enable more aggressive minification, but it might also result in broken JS code when the minifier optimizes and reorganizes code. The risk isn't that high, though, when sticking with "simple" minifiers.
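
The three steps could be sketched roughly as follows. This is only an illustration: protectPhp, restorePhp and naiveMinify are hypothetical names, and naiveMinify is a trivial stand-in for a real JS minifier such as uglify, not a replacement for one. The sketch also assumes every PHP block is closed and contains no closing tag inside a PHP string.

```php
<?php
// Step 1: swap every PHP block for a global-looking JS name and remember it.
function protectPhp(string $code, array &$blocks): string
{
    $blocks = array();
    return preg_replace_callback(
        '/<\?(?:php|=).*?\?>/s',
        function (array $m) use (&$blocks): string {
            $blocks[] = $m[0];
            return 'window.__PHPreplaced' . (count($blocks) - 1);
        },
        $code
    );
}

// Step 2 stand-in: a real build would call an actual JS minifier here.
function naiveMinify(string $js): string
{
    $js = preg_replace('#/\*.*?\*/#s', '', $js);   // drop block comments
    return trim(preg_replace('#\s+#', ' ', $js));  // collapse whitespace
}

// Step 3: put the original PHP blocks back in place of the placeholders.
function restorePhp(string $js, array $blocks): string
{
    return preg_replace_callback(
        '/window\.__PHPreplaced(\d+)/',
        function (array $m) use ($blocks): string {
            return $blocks[(int) $m[1]];
        },
        $js
    );
}

// Sample input taken from the question, including the "invalid JS" case.
$source = <<<'SRC'
/*Define GLOBAL Javascript variables*/
var COUNTRY = "<?php echo $GLOBALS["country"]; ?>";
var str = <?php echo "'a string';"; ?>
SRC;

$blocks = array();
$minified = restorePhp(naiveMinify(protectPhp($source, $blocks)), $blocks);
echo $minified, "\n";
```

Restoring through a callback matters here: the callback's return value is inserted literally, so a "$" in the saved PHP code is not misread as a backreference, and the (\d+) capture keeps index 1 from being confused with index 10.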

Thomas Urban

Here's what I would do: put all the JavaScript in a variable using heredoc, for example:

$js = <<<EOD
console.log('Valar Morgulis');

console.log('Valar Dohaeris');
EOD;

Then use the minify_js() function, which can be found on GitHub, or something similar. Here's the function:

function minify_js($input) {
  if (trim($input) === "") return $input;
  return preg_replace(
    array(
        // Remove comment(s)
        '#\s*("(?:[^"\\\]++|\\\.)*+"|\'(?:[^\'\\\\]++|\\\.)*+\')\s*|\s*\/\*(?!\!|@cc_on)(?>[\s\S]*?\*\/)\s*|\s*(?<![\:\=])\/\/.*(?=[\n\r]|$)|^\s*|\s*$#',
        // Remove white-space(s) outside the string and regex
        '#("(?:[^"\\\]++|\\\.)*+"|\'(?:[^\'\\\\]++|\\\.)*+\'|\/\*(?>.*?\*\/)|\/(?!\/)[^\n\r]*?\/(?=[\s.,;]|[gimuy]|$))|\s*([!%&*\(\)\-=+\[\]\{\}|;:,.<>?\/])\s*#s',
        // Remove the last semicolon
        '#;+\}#',
        // Minify object attribute(s) except JSON attribute(s). From `{'foo':'bar'}` to `{foo:'bar'}`
        '#([\{,])([\'])(\d+|[a-z_][a-z0-9_]*)\2(?=\:)#i',
        // --ibid. From `foo['bar']` to `foo.bar`
        '#([a-z0-9_\)\]])\[([\'"])([a-z_][a-z0-9_]*)\2\]#i'
    ),
    array(
        '$1',
        '$1$2',
        '}',
        '$1$3',
        '$1.$3'
    ),
    $input
  );
}

Calling minify_js($js) then returns the minified JS. You can also use variables inside a heredoc.
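
If minifying on every request is too costly, the result could be minified once and cached on disk, as suggested in the comments. The following is a minimal sketch of that idea: buildJs, cachedJs, the cache directory and minify_js_stub (a trivial stand-in so the sketch runs without the function above) are all hypothetical.

```php
<?php
// Hypothetical stand-in for minify_js() so that this sketch runs on its own.
function minify_js_stub(string $js): string
{
    return trim(preg_replace('#\s+#', ' ', $js));
}

// Build the JS for one country; variables are interpolated inside the heredoc.
function buildJs(string $country): string
{
    $js = <<<EOD
var COUNTRY = "{$country}";

console.log('Valar Morgulis');
EOD;
    return minify_js_stub($js);
}

// Minify only when no cached copy exists yet.
// $country must be validated by the caller because it becomes part of a path.
function cachedJs(string $country, string $cacheDir): string
{
    $file = $cacheDir . '/' . $country . '.min.js';
    if (!is_file($file)) {
        file_put_contents($file, buildJs($country));
    }
    return file_get_contents($file);
}

// The cache location is an assumption made for this example.
$cacheDir = sys_get_temp_dir() . '/js-cache-demo';
if (!is_dir($cacheDir)) {
    mkdir($cacheDir, 0777, true);
}
$output = cachedJs('pt', $cacheDir);
echo $output, "\n";
```

Only the first request for a country pays for the minification; later requests just read the cached file, which would have to be cleared whenever the PHP source changes.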

Mav
  • but then it minifies when the file is echoed, and not _a priori_ on the server – João Pimentel Ferreira Aug 13 '17 at 10:39
  • @JoãoPimentelFerreira are you running the JavaScript on the server? Usually JavaScript is used as a client side language, so minifying it on echo is what you want. – Mav Aug 13 '17 at 10:43
  • I know Javascript is client side and php is server side, but minifying as it is echoed takes time, and processing time is very important to us, so that is not an option. I was wondering if there is any basic minifying option, such as removing double blank lines, comments and line breaks. That would be possible to do _directly_ on the PHP file. – João Pimentel Ferreira Aug 13 '17 at 10:53
  • @JoãoPimentelFerreira so you mean you'd like to minify all the JavaScript inside your PHP files? Something like passing each PHP file generating js through an additional PHP script as part of production to minify? – Mav Aug 13 '17 at 10:56
  • no, we want a singleton `php` file. But in that file, one can easily, with no harm, for example, remove double spaces. – João Pimentel Ferreira Aug 13 '17 at 10:59
  • @JoãoPimentelFerreira I am not aware of any such simple option to minify Javascript except for using a bunch of carefully designed regular expressions to apply on your code such as given in this answer. But that's still part of your PHP script and gets run on every request unless you implement some server-side caching to minify on first request, only. Server-side caching of resources isn't integrated with PHP as a core feature, but you might stick with some extension e.g. as provided by composer. – Thomas Urban Aug 13 '17 at 11:11
  • Is there any simple minification to merely remove double spaces, comments and line breaks? That would already give a relevant compression without causing any harm to the generated js code, and that would solve the problem. – João Pimentel Ferreira Aug 13 '17 at 11:15
  • @JoãoPimentelFerreira If you're seriously squeezing to the point that running a basic regex minification is "too much", then you wouldn't be generating JS files with PHP in the first place, you would have pre-generated JS files such as `en.js`, `fr.js`... – Niet the Dark Absol Aug 13 '17 at 11:24
  • @NiettheDarkAbsol, you're right `:)`. That's a threshold we achieved along the project, we have a singleton php file for all countries (much easier to maintain), but we still want performance (as possible). – João Pimentel Ferreira Aug 13 '17 at 18:21