
At the end of the source code of a Goodreads page I was visiting, there is the following code, after the closing </html> tag:

...
</body>
</html>

<!-- This is a random-length HTML comment: ocufpknrqrhggkynniqfuunofiuufunhjtvapgfyvsxfvvvbzfwkhqfazmhydbqfqvymamwthwllkpxvkjqssgqopoiozifoxillqstontzzzmtwkjbmmwfejssorsfxixtsxgcrzuhiuhjnfczeprcmnieowarxsjkpojgjwlecvuitlenftpreqovysmfmjgtjsxingjkgqnjmtugnzbfsyrynrxkmjjcowffwkbmjlwqqbatwdzlhzzlbhfwiugmnezcahpxpsdaoljnpgfxgglcyiqvgyocrclrgpelgzjbdkcnvudiopkhwkiyghooichcafzjduixdqtkktymvdpmjrheiurooozutdbuoalrhwmmvlwbutrovxfwfkkwbvzppivfipkgoimpymmvixdiyvlapjxiqqgrohlibleuzpxdrmrfclrtdyxrtmldqusmvypkkssxibaxynxomxoxmrvmrweorjmehqrsbxebgijcychltpiapnuoxlhhlhirkrwmfnwvntdscnlikiczqvgpmpsiwkudnioehxnqlbtlwzqvnbbgpyngdnjqydtyxqfphrdcvidpdkcdbtdkfgermhgjhlajhlliktyujtchswfvvdjjxqqjmkfojlsdgozixmhpeaeozguqnnzpsbfzaxvmreqvjbygrbwoeheuzabjrcfxqiugqneeondxtppqfkbvwkcjcqlixrqzhfocaezrzxhkvwotraniyuireggwjegzblwbygqjywdaxcmvzlkpfrzluhgigjyyspvnfcrlbgjicxpahpikcvfhbuiwfgoajcicjomijozrisrtyicucbfqczyvpjlmlxemibangnvyeboattdcpveemtydcowutgegwckzsitkrttkspzxzbcn -->

What is the purpose of this?


When I search for the phrase "This is a random-length HTML comment", I get results relating to breach-mitigation-rails and SendGrid API v3.

  • Probably some debug output that has remained. – BenM Dec 17 '18 at 15:28
  • It *could* be anything…!? Even if there's some *common* purpose for that, it could just be some random programmer's fever dream in this particular instance. – deceze Dec 17 '18 at 15:28
  • Depending on the actual content of the page, this could have been used to make sure that the entire page comes in at > 512 bytes. IE used to have an issue with serving up pages that were smaller than that... Though I think that was error page specific... – Stuart Dec 17 '18 at 15:30
  • This isn't an unreasonable question, but it may only be answerable by whoever at Goodreads did it in the first place. (It's on every one of their pages so it's clearly intentional, but it certainly isn't common practice.) My best guess is it's an SEO attempt, to make the page contents appear to search engines to change frequently; but that's only a guess, and I have no idea if it would even work. – Daniel Beck Dec 17 '18 at 15:31
  • I stand corrected, apparently it *is* common practice! That's a good find, @user57423 -- you should go ahead and post it as a real answer. – Daniel Beck Dec 17 '18 at 15:39
  • See http://breachattack.com; this is #6 under "mitigations". – Daniel Beck Dec 17 '18 at 15:46

1 Answer


It seems to be related to mitigating the BREACH attack:

3.1. Length Hiding. The crux of the attack is to be able to measure the length of the ciphertext. So, a natural attempt at mitigation is to hide this information from the attacker. It seems as though this should be simple and easy; one can simply add a random amount of garbage data to each response. Surely then the true length of the ciphertext will be hidden.
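
For illustration, padding of this kind could be generated roughly as follows. This is only a sketch: the function names, the length bounds, and the choice of lowercase letters are my assumptions, not Goodreads' actual code or the code of any particular library.

```python
import secrets
import string

PREFIX = "<!-- This is a random-length HTML comment: "
SUFFIX = " -->"


def random_length_comment(min_len: int = 1, max_len: int = 1024) -> str:
    """Return an HTML comment filled with a random number of random letters.

    min_len and max_len are hypothetical bounds chosen for this sketch.
    """
    # Pick the padding length uniformly from [min_len, max_len].
    length = min_len + secrets.randbelow(max_len - min_len + 1)
    padding = "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))
    return f"{PREFIX}{padding}{SUFFIX}"


def pad_response(html: str) -> str:
    """Append the padding comment after the document, as seen in the question."""
    return html + "\n" + random_length_comment()
```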

However:

While this measure does make the attack take longer, it does so only slightly. The countermeasure requires the attacker to issue more requests, and measure the sizes of more responses, but not enough to make the attack infeasible. By repeating requests and averaging the sizes of the corresponding responses, the attacker can quickly learn the true length of the cipher text. This essentially boils down to the fact that the standard error of the mean in this case is inversely proportional to √N, where N is the number of repeat requests the attacker makes for each guess. For a discussion of the limits of length-hiding in a slightly different context, see [7]. We also comment that there is an IETF working group developing a proposal to add length-hiding to TLS [6].
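
To see why averaging works, here is a small simulation. It is a simplified sketch under my own assumptions: the padding is uniform between 0 and `max_pad` bytes, the attacker knows (or can cancel out) the mean padding, and compression is ignored. The function names and numbers are made up for the example. The error of the estimate shrinks roughly like 1/√N as the number of requests grows.

```python
import random
import statistics


def observed_length(true_len: int, max_pad: int, rng: random.Random) -> int:
    """One response size: the real length plus uniform random padding."""
    return true_len + rng.randint(0, max_pad)


def estimate_length(true_len: int, max_pad: int, n: int, rng: random.Random) -> float:
    """Average n observed sizes and subtract the expected padding (max_pad / 2)."""
    samples = [observed_length(true_len, max_pad, rng) for _ in range(n)]
    return statistics.fmean(samples) - max_pad / 2


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    est = estimate_length(true_len=5000, max_pad=1024, n=n, rng=rng)
    print(f"N={n:>6}: estimated length = {est:.1f} (true length 5000)")
```

With more requests the estimate settles close to the true length, which is why the paper treats random padding as a slowdown rather than a fix.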

http://breachattack.com/resources/BREACH%20-%20SSL,%20gone%20in%2030%20seconds.pdf

deceze