
So this might be a long, long shot, yet I am completely stumped on what might be causing this issue:

I am delivering a client-side JavaScript that parses certain parameters on the page where it is embedded, uses these parameters to construct a URL, and injects an iframe using that URL into the page. For example:

var queryParams = {
  param: 'foo'
  , other: 'bar'
};

is turned into:

<iframe src="http://example.net/iframes/123?param=foo&other=bar"></iframe>
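A minimal sketch of that step, for illustration only: `buildIframeSrc` and `injectIframe` are hypothetical names, and the built-in `URLSearchParams` stands in for the query-string library, so this is not the actual production code.

```javascript
// Hypothetical helper: serializes a params object and appends it to a base URL.
// URLSearchParams is used here as a stand-in for the query-string library.
function buildIframeSrc(baseUrl, params) {
  var query = new URLSearchParams(params).toString();
  return query ? baseUrl + '?' + query : baseUrl;
}

// Hypothetical helper: creates the iframe and appends it to a container element.
// This part only works in a browser, where `document` exists.
function injectIframe(container, baseUrl, params) {
  var iframe = document.createElement('iframe');
  iframe.src = buildIframeSrc(baseUrl, params);
  container.appendChild(iframe);
  return iframe;
}

var queryParams = { param: 'foo', other: 'bar' };
var src = buildIframeSrc('http://example.net/iframes/123', queryParams);
```

Nothing in such a pipeline treats values differently from keys, which is why the shuffled values are so puzzling.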

This is working quite well; I am delivering around 1.5 million requests per day. Yet I recently noticed that in around 3,000 cases per day the values of the query parameters are shuffled, so something like this gets requested:

<iframe src="http://example.net/iframes/123?param=ofo&other=rba"></iframe>

Judging from the logs this is tied to specific users, and the jumbling of characters will happen anew on each request, so I can see sequences like this when a user is browsing the site with multiple pages using the script:

108.161.183.122 - - [14/Sep/2015:15:18:51 +0000] "GET /iframe/ogequl093iwsfr8n?param=3a1bc2 HTTP/1.0" 401 11601 "http://www.example.net/gallery?page=1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"
108.161.183.122 - - [14/Sep/2015:15:19:07 +0000] "GET /iframe/ogequl093iwsfr8n?param=a21b3c HTTP/1.0" 401 11601 "http://www.example.net/gallery?page=2" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"
108.161.183.122 - - [14/Sep/2015:15:19:29 +0000] "GET /iframe/ogequl093iwsfr8n?param=ba132c HTTP/1.0" 401 11601 "http://www.example.net/gallery?page=3" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"

The 401 is happening on purpose as the server expects param=abc123.
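To check that a rejected value really is a reordering of the expected one (and not a different string altogether), a small comparison like the following can be run against the logged values. The function names are hypothetical; the sample values are taken from the log lines above.

```javascript
// Returns the characters of a string in sorted order, e.g. 'bca' -> 'abc'.
function sortedChars(value) {
  return value.split('').sort().join('');
}

// True when `actual` contains exactly the same characters as `expected`,
// but in a different order.
function isPermutationOf(expected, actual) {
  return expected !== actual && sortedChars(expected) === sortedChars(actual);
}

var expected = 'abc123';
var logged = ['3a1bc2', 'a21b3c', 'ba132c'];
var allShuffled = logged.every(function (value) {
  return isPermutationOf(expected, value);
});
```

For the three logged values, `allShuffled` is `true`: each one is the expected key with its characters reordered.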

I also noticed that the majority of errors are happening in Firefox and Safari; not a single erroneous URL has been requested by Google Chrome.

The library I am using for turning the object into a query string is query-string, but looking at the source code I cannot see any potential for a bug of that kind in there: nothing is done to the values that is not also done to the keys (which are not messed up).

Has anyone ever encountered anything similar? Is this some weird browser extension? Is this a collision of my script with another library extending prototypes? Is this malware? Is this something I am completely unaware of? I'd be thankful for any hint because I am really clueless and this is really driving me crazy.

EDIT: I just discovered that another of our public-facing services is currently being probed by something called "Burp Suite". Having a look at their website, I see they have a tool called "Payload fuzzing" which seems to do pretty much what is described here: https://portswigger.net/burp/help/intruder_gettingstarted.html or here: https://portswigger.net/burp/help/intruder_using.html#uses_enumerating - The whole tool smells semi-fishy to me, so this might be something worth investigating further. Has anyone else ever heard of this toolset?

m90
  • @PaulRoub he is not talking about params order, but the value of each param has been shuffled `param=ofo`. – Walid Ammar Sep 14 '15 at 19:22
  • Hey, where are the queryParams values grabbed from? If they are grabbed from a web page they can easily be altered by anything from translators to bots. – jjbskir Sep 17 '15 at 15:00
  • @jjbskir they are indeed grabbed from the DOM of the host page so I am aware that they can be messed with - I'd like to know more what is messing with them. The strings are contained in class names and data attributes. Most of them are random alphanumeric strings à la `/[a-z0-9]{32}/i` – m90 Sep 17 '15 at 18:13
  • Is there any way you can move them out of the DOM? – jjbskir Sep 17 '15 at 19:19
  • Can you give a few real examples for the messed up key-value params? – Onur Yıldırım Sep 18 '15 at 03:14
  • @OnurYıldırım one of the parameters is called `accesskey`, containing a 32 char alphanumeric value like `acdeeaa9c89ef9b63cdf62810c25d32c` that gets shuffled into `2a0edd3a6f93ae2c21cc5b9c86dc8e9f` when requesting the iframed document. See this fiddle: http://jsfiddle.net/tvjzsvnq/ for proof that it is the same set of characters. – m90 Sep 21 '15 at 08:26
  • Regarding your edit: Burp Suite is a pretty well-known tool that developers can use themselves to find security issues in their services. It's also used (with permission) by hired security experts who try to find holes in your service. These are valid use cases and the tool can be helpful - and be used in a totally legal way. – MJV Sep 22 '15 at 06:25
  • Continuing my previous comment: However, it can of course be used for attacking someone's site without permission, as is most probably happening here. I would contact the service provider of the attacker (using the IP address which you can see in your logs). (I'm in no way affiliated with Burp Suite but may have used it in the past, if my memory serves me right.) – MJV Sep 22 '15 at 06:33
  • @MJV Thanks for your insights, I'd be fine with someone "testing" that tool against our services as it does handle all this as supposed, I'd just like to know what is really happening so I can rule out a bug on our side. – m90 Sep 22 '15 at 06:51
  • @m90 Given the info in your question and comments I'm totally with Onur on this one; someone is attacking your service and deliberately messing up the parameters to gain access to sensitive (i.e. someone else's) data. I'd say there's no reason to suspect a bug in your system. I'd probably just make sure that attempts to get data with invalid access keys are logged and perhaps an alert is sent to an admin if there are several such requests in a short time period. (Who can then block access completely from the IP address in question or take some other action.) – MJV Sep 22 '15 at 08:56
  • man-in-the-middle attack – webdeb Sep 23 '15 at 00:26
  • Maybe they are on free internet; more and more services rewrite traffic to inject their own ads (Xfinity WiFi, for example), and that could be what is messing it up. – dandavis Sep 23 '15 at 09:52
  • Those requests came via a CDN. Make sure your CDN provider isn't actually doing this. – Michael Hampton Sep 23 '15 at 21:55
  • @MichaelHampton Thanks for mentioning this, I can rule out a CDN problem by now though. – m90 Sep 24 '15 at 09:16
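
MJV's suggestion in the comments above (log invalid access keys and alert an admin when several arrive in a short period) could be sketched roughly as follows. This is an illustration only: `createFailureTracker`, the threshold and the window length are all assumptions, and the state is kept in memory rather than in whatever store the real service would use.

```javascript
// Hypothetical sliding-window counter for rejected requests, keyed by IP.
// maxFailures and windowMs are illustrative values chosen by the caller.
function createFailureTracker(maxFailures, windowMs) {
  var failuresByIp = new Map();

  // Records one rejected request and returns true when the IP has reached
  // maxFailures rejections within the last windowMs milliseconds.
  return function recordFailure(ip, timestampMs) {
    var recent = (failuresByIp.get(ip) || []).filter(function (t) {
      return timestampMs - t < windowMs;
    });
    recent.push(timestampMs);
    failuresByIp.set(ip, recent);
    return recent.length >= maxFailures;
  };
}

// Example: flag an IP after 3 rejected requests within one minute.
var recordFailure = createFailureTracker(3, 60 * 1000);
```

The server would call `recordFailure(ip, Date.now())` whenever it answers with a 401 and notify an admin when the call returns `true`.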

5 Answers


There is not much to analyze at this point, and since you're looking for hints, this is more of a long comment than an answer.

Malware on the client's browser or machine, malware on your web server, or an unknown crawler could be causing this, but I consider each of those unlikely. To me, it looks like your application is being attacked.

Let's see;

  • The real example (in the comments) shows that 128-bit hexadecimal access keys (the values of the `accesskey` param) are being shuffled.
  • Only values get shuffled and not keys.
  • You say, requests are coming from specific users.
  • You say, requests are coming from specific browser clients (Firefox and Safari).

What to check/do;

  • Check if your logging system works properly. If you're using a third-party, configurable logger, this could mess things up. (example)
  • Reproduce: Take the same exact set of parameters; use the same version of browser(s) and see if the results are the same. If so, it could be a browser-version issue, which is highly unlikely.
  • Check if there are other Firefox and Safari users (with same versions) that do NOT experience this.
  • Since you say it's only a small percentage of the requests, check if the corresponding requests are made right after one another. (Requests of the same kind less than a second apart?)
  • Try tracing the source of the requests. Are they coming from a source you suspect? Can you relate information from different requests to each other? Do multiple IPs belong to the same subnet? Is the same IP using different accounts? Is the same account using different IPs in a short period of time?
  • There are tools such as apache-scalp, mod_sec, lorg to check/analyze big log files to extract possible attacks.
  • You can also use some of the techniques mentioned here to manually spot or block suspicious requests.
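
As a starting point for the tracing step above, rejected requests can be grouped by source IP directly from the access log. The sketch below is only an illustration: it assumes the combined log format shown in the question, and `summarizeRejectedRequests` is a hypothetical name, not part of any of the tools mentioned.

```javascript
// Matches the start of a combined-log-format line:
// ip, two ignored fields, [timestamp], "METHOD path PROTOCOL", status.
var LOG_LINE = /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) /;

// Groups all 401 responses by client IP and collects the value that was
// sent for the given query parameter in each rejected request.
function summarizeRejectedRequests(lines, paramName) {
  var byIp = new Map();
  lines.forEach(function (line) {
    var match = LOG_LINE.exec(line);
    if (!match || match[5] !== '401') return; // skip unparsable or non-401 lines
    var ip = match[1];
    var query = match[4].split('?')[1] || '';
    var value = new URLSearchParams(query).get(paramName);
    if (!byIp.has(ip)) byIp.set(ip, { count: 0, values: [] });
    var entry = byIp.get(ip);
    entry.count += 1;
    entry.values.push(value);
  });
  return byIp;
}
```

Sorting the resulting entries by `count`, or comparing the collected `values` per IP, makes it easier to see whether a few sources are cycling through many variants of the same key.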
Onur Yıldırım

I am Tomas and I am a Software Engineer at CLIQZ.

We are a German Startup who are integrating search and innovative privacy features into browsers. This is indeed a result of our Anti Tracking feature. A similar question was also asked on reddit and in another question on stackoverflow. It was already answered in both posts, so I will just quote the same answer here:

CLIQZ Anti Tracking is not designed to block tracking in general, but rather only the tracking of individual users — which we consider a violation of our users’ privacy, and therefore inappropriate. Unlike other anti-tracking systems, ours doesn’t block the signals completely; thus, website owners are able to get data for legitimate uses, such as counting visits.

To prevent the identification of users (e.g. by using JavaScript hashes), CLIQZ Anti Tracking does in fact permute strings. Whenever a new tracker shows up in our data, our system initially treats it as a user-identifying tracker and changes the string to preventively protect our users. Our system uses so-called k-anonymity techniques. If it sees the same string for an event with multiple users showing up independently over the course of several days, it puts it on a whitelist of legitimate, non-identifying trackers. Once a tracker is whitelisted, it remains unmodified and website owners see the original string. In other words, CLIQZ Anti Tracking limits the functionality of legitimate trackers only temporarily. As soon as it becomes clear that a tracker doesn't violate our users' privacy, everything works as usual. Privacy is extremely important to us and we believe this technology is necessary to protect our users from snooping.
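
To make the described behaviour easier to picture, here is a rough sketch of the general idea: a string is shuffled until enough distinct users have been seen sending it. This is explicitly not the CLIQZ implementation; all names and the threshold `k` are assumptions, and the "over the course of several days" condition is left out for brevity.

```javascript
// Hypothetical tracker: a token counts as non-identifying once at least
// k distinct users have been observed sending it.
function createTokenTracker(k) {
  var usersByToken = new Map();
  return {
    observe: function (token, userId) {
      if (!usersByToken.has(token)) usersByToken.set(token, new Set());
      usersByToken.get(token).add(userId);
    },
    isSafe: function (token) {
      var users = usersByToken.get(token);
      return Boolean(users) && users.size >= k;
    }
  };
}

// Fisher-Yates shuffle of the characters of a string.
function shuffleString(value) {
  var chars = value.split('');
  for (var i = chars.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = chars[i];
    chars[i] = chars[j];
    chars[j] = tmp;
  }
  return chars.join('');
}

// Passes known-safe tokens through unchanged and shuffles everything else.
function filterToken(tracker, token) {
  return tracker.isSafe(token) ? token : shuffleString(token);
}
```

Under this model, a value that is unique per user (such as a per-customer API key) never reaches the threshold and therefore keeps being shuffled.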

I hope this helps.

tomas
  • Hi @tomas, we are using querystring parameters to tint icons on our site: on some themes they will be red, on some they will be blue. Is there any way that we can work around this issue for legitimate use cases? – Dirk Boer Apr 28 '19 at 11:35

As I already mentioned here (Google Analytics Event Permutation), there is a specific version (at least 1.0.37) of the Firefox add-on "Cliqz" that has anti-tracking functionality built in.

M. Röder
  • Thanks. I can confirm that this extension is what is shuffling our parameters as well. Did you have a look at the extension's code yet? Can you understand when a `token` will qualify as a `badToken` here: https://gist.github.com/m90/e9df0576ac6f06f864f2 ? – m90 Sep 25 '15 at 11:39
  • We were happy when we found the actual line of code doing the shuffle. I don't think that we want to do more investigations... – M. Röder Sep 25 '15 at 11:46
  • I think I got what is happening in our case: the script checks querystring tokens, and when it encounters a duplicate value above 8 chars it assumes this is a tracking identifier: https://gist.github.com/m90/e9df0576ac6f06f864f2#file-badtokens-js-L36 - too bad it's an API key passed via GET in our case... – m90 Sep 25 '15 at 18:38
  • @m90 It only assumes it's an identifier until it sees the same token from other users. After that, your tracker will be added to a white list and you will receive normal data. So yes, it can technically intercept signals to legitimate trackers that do not send user identifiers, but only temporarily. – tomas Oct 16 '15 at 11:17

It seems highly unlikely to me that this behaviour has its roots in either your code or the query-string code. Given that query string values can be freely altered, I suspect this is what is occurring. Bear in mind that this is 0.2% of your requests.

There are a couple of things I would check. Are you aware of whether these requests are referred from other websites, your own website, or made directly? Are you aware of whether any of the source IPs correspond to known bots or web crawlers? Are the requests from a variety of sources or a small subset of repeated visitors?

It is possible that a bot or web crawler is "lightly probing your site" or testing for duplicate pages or misleading parameters.

Ninjakannon
  • Shouldn't bots and crawlers identify themselves with their UA String? All the logged requests use real browser's UA strings. – m90 Sep 20 '15 at 06:01
  • Good thought; so we rule out obvious web crawlers (unless there's something odd going on, but you should be able to check the IPs). It's not definitive for other bots though as you can set the UA string - e.g. a bot searching for vulnerabilities isn't going to name itself. – Ninjakannon Sep 20 '15 at 10:46

Some robot is crawling your site, which is quite normal. If you don't want it to put load on your server, block the requesting IP.

Eugene Tiurin
  • Thing is, it seems to be a regular Browser (judging from the UA String) and all the traffic is coming from different IPs. So it's very unlikely this is just "some robot". – m90 Sep 23 '15 at 05:49
  • UA Strings can easily be faked so it's not a really good metric to go by. – Armando Canals Sep 24 '15 at 02:50