
I'm watching this Google I/O presentation from 2011 https://www.youtube.com/watch?v=M3uWx-fhjUc

At minute 39:31, Michael shows the output of the closure compiler, which looks like the code included below.

My question is: what exactly is this code doing, and how and why?

// Question #1 - floor & random? 2147483648?
Math.floor(Math.random() * 2147483648).toString(36);

var b = /&/g, 
    c = /</g,d=/>/g, 
    e = /\"/g, 
    f = /[&<>\"]/;

// Question #2 - sanitizing input, I get it... 
// but f.test(a) && ([replaces]) ?
function g(a) {
   a = String(a);

   f.test(a) && (
      a.indexOf("&") != -1 && (a = a.replace(b, "&amp;")), 
      a.indexOf("<") != -1 && (a = a.replace(c, "&lt;")), 
      a.indexOf(">") != -1 && (a = a.replace(d, "&gt;")),
      a.indexOf('"') != -1 && (a = a.replace(e, "&quot;"))
   );

   return a;
};

// Question #3 - void 0 ???    
var h = document.getElementById("submit-button"),
    i,
    j = {
       label: void 0,
       a: void 0
    };
i = '<button title="' + g(j.a) + '"><span>' + g(j.label) + "</span></button>";
h.innerHTML = i;

Edit

Thanks for the insightful answers. I'm still really curious about the reason why the compiler threw in that random string generation at the top of the script. Surely there must be a good reason for it. Anyone???

rodrigo-silveira
  • Having looked at the slide, have updated my answer; I strongly suspect the code snippet shown was cropped slightly to take out the left-hand-side of an assignment. – Adrian Wragg Aug 21 '13 at 22:55

4 Answers


When in doubt, check other bases.

2147483648 (base 10) = 0x80000000 (base 16) = 2^31. So it's just making a random number within the non-negative range of a 32-bit signed int. floor converts it to an actual integer, then toString(36) writes it in base 36, whose digits are 0-9 (10 characters) plus a-z (26 characters).

The end result of that first line is a string of random digits and letters. There will be at most 6 of them (36^6 = 2176782336, just above 2^31), but the first one won't be quite as random as the others. Edit: Adrian has worked this out properly in his answer; the first character can be any of the 36, but is less likely to be z. The other characters have a small bias towards lower values.
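The bias on the first character can be checked with plain arithmetic. The sketch below assumes only the 2^31 upper bound from the question; the names LIMIT, BLOCK and countWithLeadingDigit are made up for illustration.

```javascript
// Count how many six-character base-36 results start with a given digit,
// assuming the inputs are the integers 0 .. 2^31 - 1 (as in the question).
var LIMIT = 2147483648;       // 2^31, exclusive upper bound
var BLOCK = Math.pow(36, 5);  // number of values sharing one leading digit

function countWithLeadingDigit(d) {
  // Six-character strings with leading digit d cover [d * 36^5, (d + 1) * 36^5),
  // clipped to the values that can actually be generated.
  var lo = d * BLOCK;
  var hi = Math.min((d + 1) * BLOCK, LIMIT);
  return Math.max(hi - lo, 0);
}

var countY = countWithLeadingDigit(34); // leading 'y'
var countZ = countWithLeadingDigit(35); // leading 'z'
console.log(countY, countZ); // 60466176 31167488
```

So among six-character results, a leading z turns up roughly half as often as any other leading character.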

For question 2, if you mean this a = String(a); then yes, it is ensuring that a is a string. This is also a hint to the compiler so that it can make better optimisations if it's able to convert it to machine code (I don't know if they can for strings though).

Edit: OK you clarified the question. f.test(a) && (...) is a common trick which uses short-circuit evaluation. It's effectively saying if(f.test(a)){...}. Don't use it like that in real code because it makes it less readable (although in some cases it is more readable). If you're wondering about test, it's to do with regular expressions.
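For comparison, here is a sketch of how g might read with the short-circuit expressions written as ordinary if statements. The name escapeHtml and the regex variable names are invented for illustration; the uncompiled source is not shown in the question.

```javascript
// Hypothetical readable equivalent of the compiled function g.
var AMP_RE = /&/g, LT_RE = /</g, GT_RE = />/g, QUOT_RE = /"/g;
var NEEDS_ESCAPE_RE = /[&<>"]/;

function escapeHtml(value) {
  var s = String(value);
  // Same meaning as `f.test(a) && (...)`: skip all the work for clean strings.
  if (NEEDS_ESCAPE_RE.test(s)) {
    // '&' goes first so the entities added afterwards are not escaped again.
    if (s.indexOf('&') !== -1) { s = s.replace(AMP_RE, '&amp;'); }
    if (s.indexOf('<') !== -1) { s = s.replace(LT_RE, '&lt;'); }
    if (s.indexOf('>') !== -1) { s = s.replace(GT_RE, '&gt;'); }
    if (s.indexOf('"') !== -1) { s = s.replace(QUOT_RE, '&quot;'); }
  }
  return s;
}

console.log(escapeHtml('<a href="x">Q&A</a>'));
// &lt;a href=&quot;x&quot;&gt;Q&amp;A&lt;/a&gt;
```

The comma-separated `&&` chain in the compiled output does exactly these four checks in the same order.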

For question 3, it's new to me too! But see here: What does `void 0` mean? (quick google search. Turns out it's interesting, but weird)

Dave
  • About #2: I see. I get it now. So check if any of those characters in [&<>\"] are present, and if so, proceed to replace them. Thanks! – rodrigo-silveira Aug 21 '13 at 22:37
  • Using the commas and `&&` instead of actual lines of code makes the program hard to read but slightly smaller in size. Or so it has the reputation. Smaller loads faster and uses less bandwidth, which matters if you are serving this up millions of times. – Lee Meador Aug 21 '13 at 22:37
  • @LeeMeador that's what optimising minifiers are for! – Dave Aug 21 '13 at 22:43
  • Deleted the silly comment. You are right but we are looking at code generated by an optimiser. So it should already be optimized and my 1st comment was trying to tell the advantages of that form. Your answer only said what it was. Not why. I didn't bother with another answer because yours is pretty good. – Lee Meador Aug 21 '13 at 23:09

There's a number of different questions rolled into one, but considering the question title I'll just focus on the first here:

Math.floor(Math.random() * 2147483648).toString(36);

In actual fact, this doesn't do anything - as the value is discarded rather than assigned. However, the idea of this is to generate a number between 0 and 2 ^ 31 - 1 and return it in base 36.

Math.random() returns a number from 0 (inclusive) to 1 (exclusive). It is then multiplied by 2^31 and floored to produce the range mentioned. The .toString(36) then converts it to base 36, represented by the digits 0 to 9 followed by a to z (JavaScript emits lowercase letters).

The end result ranges from 0 to zik0zj, which is 2^31 - 1 written in base 36.
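The bounds are easy to check directly. In this sketch the names MAX and randomBase36 are illustrative; the expression inside randomBase36 is the line from the question.

```javascript
var MAX = 2147483648 - 1; // 2^31 - 1, the largest value Math.floor can yield here

console.log((0).toString(36));        // 0
console.log(MAX.toString(36));        // zik0zj
console.log(parseInt('zik0zj', 36));  // 2147483647

// The line from the question, wrapped so its result can be inspected.
function randomBase36() {
  return Math.floor(Math.random() * 2147483648).toString(36);
}

var sample = randomBase36();
console.log(sample); // one to six characters drawn from 0-9 and a-z
```

Note that the output is not zero-padded, so small values produce fewer than six characters.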

As to why it's there in the first place ... well, examine the slide. This line appears right at the top. Although this is pure conjecture, I actually suspect that the code was cropped down to what's visible, and there was something immediately above it that this was assigned to.

Adrian Wragg
  • A random alphanumeric string. – Lee Meador Aug 21 '13 at 22:31
  • @LeeMeador Yes, but not an evenly distributed one; the chance, for example, of the second character being 'B' is greater than its chance of being 'Q'. – Adrian Wragg Aug 21 '13 at 22:33
  • Yes about the 2nd character. Because 36^6 > 2^31 (2176782336 vs. 2147483648), we don't get to zzzzzz in base 36. But I would say that the number (as a whole) is evenly distributed between 0 and zik0zj (2^31 - 1 in base 36). Each individual "36-it", if we could call them that, doesn't end up with each value at the same frequency over time, but that is typically true for random numbers in, say, base 10 as well: when the range isn't a power of ten, the digits don't have equal probability of being any given value 0-9. – Lee Meador Aug 21 '13 at 22:50

1) I have no idea what the point of number 1 is.

2) It makes sure the characters &, <, > and " are converted into their corresponding HTML entities, so yes, it is basically sanitizing the input to make sure it is HTML safe.

3) void 0 is essentially a REALLY safe way to get undefined. undefined is not a keyword in JavaScript but an ordinary identifier, so it can be shadowed by a local variable (and in engines older than ES5 the global could even be reassigned). It is therefore not always safe to assume that undefined actually holds the undefined value you expect.
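A small sketch of why void 0 is the safer spelling. The function name shadowed is made up; the object j mirrors the one in the question.

```javascript
// `undefined` is an ordinary identifier, so a local binding can shadow it.
function shadowed() {
  var undefined = 'oops'; // legal inside a function scope
  return [typeof undefined, typeof void 0];
}

var result = shadowed(); // ["string", "undefined"]
console.log(result[0], result[1]);

// In the compiled snippet both fields are `void 0`, so String() turns them
// into the text "undefined" before the escaping function sees them.
var j = { label: void 0, a: void 0 };
console.log(String(j.label)); // undefined
```

`void` evaluates its operand and always yields the real undefined value, whatever the surrounding scope has done to the name.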

Evan
  • About #3: Good insight. But what about void? Since void is *not* mutable, why void 0? Wouldn't just void suffice? – rodrigo-silveira Aug 21 '13 at 22:34
  • @rodrigo-silveira `void` is a [unary operator](http://es5.github.io/#x11.4.2) and requires an *Expression*. `0` is used because number primitives are immutable, so it won't waste too much in using it with the result of the *Expression* being discarded. – Jonathan Lonowski Aug 21 '13 at 22:36
  • @rodrigo-silveria jonathan nails it. in fact, you could put anything there instead of 0 (string, other number, even function i think), and it will still evaluate the same. using 0 is probably just the easiest/maybe a little more recognizable. – Evan Aug 21 '13 at 22:37

1) This code is pulled from Closure Library. It simply creates a random string. In a later version it was replaced by code that creates a large random integer and concatenates it to a string:

'closure_uid_'  + ((Math.random() * 1e9) >>> 0)

This simplified version is easier for the Closure Compiler to remove so you won't see it leftover like it was previously. Specifically, the Compiler assumes "toString" with no arguments does not cause visible state changes. It doesn't make the same assumption about toString calls with parameters, however. You can read more about the compiler assumptions here:

https://code.google.com/p/closure-compiler/wiki/CompilerAssumptions
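The `>>> 0` in the replacement line converts the product to an unsigned 32-bit integer, which for non-negative values below 2^32 simply drops the fraction. A minimal sketch; the helper name makeUid and its randomValue parameter are hypothetical and not part of Closure Library.

```javascript
// randomValue stands in for Math.random(), i.e. a number in [0, 1).
function makeUid(randomValue) {
  // Unsigned right shift by zero truncates the product to an integer.
  return 'closure_uid_' + ((randomValue * 1e9) >>> 0);
}

console.log(makeUid(0));             // closure_uid_0
console.log(makeUid(0.5));           // closure_uid_500000000
console.log(makeUid(0.1234567891));  // closure_uid_123456789
console.log(makeUid(Math.random())); // varies per run
```

Because the number is turned into a string by concatenation rather than by toString(36), no call with arguments is left for the compiler to worry about.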

2) At some point, someone determined it was faster to test for the characters that might need to be replaced before making the "replace" calls on the assumption most strings don't need to be escaped.

3) As others have stated, the void operator always returns undefined, and "void 0" is simply a reasonable way to write "undefined". It is pretty useless in normal usage.

John
  • Insightful. My main question remains, though: what is the purpose of generating that string, if it doesn't seem to get assigned to anything, and doesn't seem to be used anywhere in the script??? **What is that string used for?** – rodrigo-silveira Aug 22 '13 at 22:19
  • If it isn't assigned to anything, it isn't used for anything. The compiler wasn't able to determine that "toString(36)" didn't do anything interesting; everything else feeds into that. If the implementation looked like "Something.prototype.toString = function(a) {alert(this + a)}", you could see that the compiler would need to preserve it. In the original code, the string is used in cases where different versions of the library are running on the same page and might conflict, but those cases are exercises. It is obviously dead code to us but not to the Closure Compiler. – John Aug 23 '13 at 03:32
  • that should have been "but those cases are NOT exercised". – John Aug 27 '13 at 00:58
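John's point about side effects can be illustrated with a hypothetical type whose toString does something observable when called with an argument. The Counter type and the log array are invented for this sketch; they only show why a compiler that cannot prove toString(36) is pure has to keep the call.

```javascript
var log = [];

function Counter(n) {
  this.n = n;
}

// A toString that takes a radix and records every call as a side effect.
Counter.prototype.toString = function (radix) {
  log.push('toString called with radix ' + radix);
  return this.n.toString(radix);
};

var counter = new Counter(35);
counter.toString(36); // result discarded, but the log entry remains

console.log(log.length); // 1
console.log(log[0]);     // toString called with radix 36
```

Deleting the discarded call would change what ends up in log, which is the kind of visible state change the compiler is guarding against.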