How do you detect that a visitor changed a value in the query string?

Question

For our last week in school (finals next week) our teacher decided to give us a crash course in Perl. We talked about all the differences we would encounter if we used Perl and then we started talking about "spoofing".

We were given an HTML example where a user could input their first and last names. Of course our example already had Mickey as the first name and Mouse as the last name.

<form action="action_page.php">
First name:<br>
<input type="text" name="firstname" value="Mickey">
<br>
Last name:<br>
<input type="text" name="lastname" value="Mouse">
<br><br>
<input type="submit" value="Submit">
</form>

At the end when you hit submit you were redirected to a new screen that said your first name is Mickey and your last name is Mouse.

Our teacher said "spoofing" is when you change the values in the URL of a form submitted with method=get, so instead of having

firstname=Mickey&lastname=Mouse

you would enter something like

firstname=baseball&lastname=bat

That would instantly alter the intended command and you would end up getting first name as baseball and lastname as bat.
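For illustration, the receiving script might look roughly like the sketch below. It is not the script from the lesson: the `parse_query` helper is hypothetical, and the query string is hard-coded here, whereas under CGI it would normally arrive in `$ENV{QUERY_STRING}`. It shows that the script simply echoes whatever names appear in the URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split "a=1&b=2" into a hash, undoing URL encoding.
sub parse_query {
    my ($query) = @_;
    my %param;
    for my $pair ( split /[&;]/, $query ) {
        my ( $name, $value ) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ( $name, $value ) {
            tr/+/ /;                                # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr( hex $1 )/ge;   # "%41" becomes "A"
        }
        $param{$name} = $value;
    }
    return %param;
}

# Assumption: hard-coded for the demo; a CGI script would read $ENV{QUERY_STRING}.
my $query_string = 'firstname=baseball&lastname=bat';
my %name_from_url = parse_query($query_string);

print "Your first name is $name_from_url{firstname} ",
      "and your last name is $name_from_url{lastname}\n";
```

Nothing in this script can tell whether the values came from the form or were typed straight into the address bar.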

This all sounds pretty straightforward, until he said he wanted us to write a program to prevent spoofing without using a post method.

Instead when a user attempts to spoof the system we're supposed to print out some anti-spoofing comment.

Unfortunately, we never really talked about spoofing aside from the examples. I've attempted to Google spoofing to see some example code, or at least understand this concept, but I haven't had much luck, or I haven't looked in the right places.

So I thought I would ask here. Can someone who is decent at Perl direct me towards basic anti-spoofing programs and content, or at least explain and show how spoofing is supposed to work?

Welcome to Stack Overflow. Please read the [About] page soon. I'm puzzled: this looks more like HTML and form management than Perl. – Jonathan Leffler, May 04 '15 at 02:22
Thank you, I will. Our teacher called his lesson "perl and the web"; it's all new and completely confusing to me. – Jason, May 04 '15 at 02:26

score 3 · Answer 1 · edited May 23 '17 at 11:43

What you need to do is to authenticate the data in the query string, and validate it when you receive it. There is a standard tool(set) for this: a cryptographic Message Authentication Code (MAC).

Basically, a MAC is a function that takes in a message (any arbitrary string) and a secret key, and outputs a random-looking token that depends, in a complicated way, on both the message and the key. Importantly, it is effectively impossible to compute a valid MAC token for a modified message without knowing the key.

To validate a query string (or some other data) with a MAC, you'd basically follow these steps:

1. Encode the data into a "canonical" form as a string. For an HTTP URL, you could just use the query string (and/or the entire URL) as it is, although you may wish to normalize it e.g. by %-decoding any characters that don't have to be encoded, and normalizing the case of any %-encoded values (e.g. %3f → %3F).

   Alternatively, you could decode the query string into, say, an associative array, and serialize this array in a format of your choice. This can make it easier to combine parameters from multiple sources (e.g. hidden form fields), to add extra data fields (see below) and to choose which fields you want to validate.
2. Optionally, combine the data with any additional information you wish to associate it with, such as a user ID and/or a timestamp. (You can either transmit the timestamp explicitly, or just round it down to, say, the last hour, and check both the current and the previous timestamp when validating it.) Changing any of these values will change the MAC output, thus preventing attackers from e.g. trying to submit one user's data under another user's account.
3. Store a secret key (preferably, a securely generated random value of, say, 128 bits) on the server. Obviously, this secret key must be stored so that users cannot access it (e.g. by guessing the path to the config file).
4. Feed the canonically encoded data and the secret key into the MAC algorithm. Take the result and (if your MAC library doesn't do this for you) encode it in some convenient manner (e.g. using the URL-safe Base64 variant).
5. Append the encoded MAC token as an extra parameter in the URL.
6. When you receive the data back, remove the MAC token, feed the rest of the data back into the MAC generation code as described above, and check that the resulting MAC matches the one you received.
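The steps above could be sketched in Perl roughly as follows. This is only one possible arrangement, using HMAC-SHA-256 from the Digest::SHA module. The helper names (`escape_part`, `canonical_form`, `mac_token`, `is_untampered`), the `mac` parameter name, the user ID 42 and the hard-coded key are all assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Digest::SHA qw( hmac_sha256_hex );

# Assumption: a real key is random and loaded from server-side configuration.
my $secret_key = 'replace-with-a-random-128-bit-secret';

# Percent-encode anything outside a small safe set, so that "&" or "="
# inside a value cannot be confused with the separators added below.
sub escape_part {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

# Steps 1-2: serialize the fields in sorted order, bound to a user ID.
sub canonical_form {
    my ( $user_id, %field ) = @_;
    my @parts = ( 'user=' . escape_part($user_id) );
    for my $name ( sort keys %field ) {
        push @parts, escape_part($name) . '=' . escape_part( $field{$name} );
    }
    return join '&', @parts;
}

# Steps 4-5: the token that gets appended to the URL as an extra parameter.
sub mac_token {
    my ( $user_id, %field ) = @_;
    return hmac_sha256_hex( canonical_form( $user_id, %field ), $secret_key );
}

# Step 6: recompute the token from the received fields and compare.
sub is_untampered {
    my ( $received_token, $user_id, %field ) = @_;
    return mac_token( $user_id, %field ) eq $received_token ? 1 : 0;
}

my $token = mac_token( 42, firstname => 'Mickey', lastname => 'Mouse' );
print "firstname=Mickey&lastname=Mouse&mac=$token\n";

my $accepted = is_untampered( $token, 42, firstname => 'baseball', lastname => 'bat' );
my $verdict  = $accepted ? "Data accepted\n" : "Nice try, but the query string was changed\n";
print $verdict;
```

The plain `eq` comparison keeps the sketch short; a production check would usually compare the two tokens in constant time.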

MAC algorithms can be constructed from cryptographic hash functions like MD5 or SHA-1/2/3. In fact, a basic MAC can be obtained simply by concatenating the secret and the message, hashing them, and using the result as the token.

For some hash functions, like SHA-3, the simple MAC construction described above is actually believed to be secure; for older hash functions, which were not explicitly designed with this use in mind (MD5, SHA-1 and SHA-256, for example, are open to length-extension attacks when used this way), however, it's safer to use the (slightly) more complicated HMAC construction, which hashes the input twice.
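To make the difference concrete, here is a small sketch that computes both constructions and then spells out HMAC by hand to show the two hashing passes. SHA-256 is used throughout only because Digest::SHA covers SHA-1 and SHA-2 but not SHA-3; the key, the message and the `manual_hmac_sha256_hex` helper are assumptions for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Digest::SHA qw( sha256 sha256_hex hmac_sha256_hex );

my $secret  = 'server-side-secret';                 # assumption: demo key
my $message = 'firstname=Mickey&lastname=Mouse';

# Simple construction: hash the secret followed by the message.
my $simple_mac = sha256_hex( $secret . $message );

# HMAC as provided by the library. Note the argument order: data, then key.
my $library_hmac = hmac_sha256_hex( $message, $secret );

# HMAC written out: an inner hash over (key XOR ipad) . message,
# then an outer hash over (key XOR opad) . inner digest.
sub manual_hmac_sha256_hex {
    my ( $data, $key ) = @_;
    my $block_size = 64;                            # SHA-256 block size in bytes
    $key = sha256($key) if length($key) > $block_size;
    $key .= "\0" x ( $block_size - length $key );   # pad the key to a full block
    my $inner_key = $key ^ ( "\x36" x $block_size );
    my $outer_key = $key ^ ( "\x5c" x $block_size );
    return sha256_hex( $outer_key . sha256( $inner_key . $data ) );
}

my $manual_hmac = manual_hmac_sha256_hex( $message, $secret );

print "simple: $simple_mac\n";
print "hmac:   $library_hmac\n";
print "manual: $manual_hmac\n";
```

The last two lines of output should be identical, since the hand-written version follows the same definition the library implements.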

Alternatively, there are also MAC algorithms, such as CMAC, which are based on block ciphers (like AES) instead of hash functions. In some cases (e.g. on embedded platforms, where a fast hash function may not be available) these may be more efficient than HMAC; for a web application, however, the choice is essentially a matter of taste.

score 2 · Answer 2 · answered May 04 '15 at 02:43

One difference between GET and POST is that the information for the former is passed in the URL itself. That means you can type what you like in the browser's address bar -- it doesn't have to have come from an HTML form. I think that's what is meant by spoofing here.

The most obvious protection is to calculate a CRC of all the protected fields -- in this case MickeyMouse -- and put that value in a hidden field of the HTML form sent out by the server. Then, when the request comes back, calculate the CRC of the same fields and check that it matches the value of the returned hidden field.

Of course that can be circumvented if the user works out how the protection functions and adds his own calculation of the CRC of his spoofed data as well. But this should be sufficient for a proof of concept.
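A rough sketch of that proof of concept is below. The CRC-32 routine is written out by hand so that nothing beyond core Perl is needed; `checksum_for` and `looks_spoofed` are hypothetical helper names, and the hidden field itself is left out to keep the example short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Bitwise CRC-32 (the common reflected form with polynomial 0xEDB88320).
sub crc32 {
    my ($data) = @_;
    my $crc = 0xFFFFFFFF;
    for my $byte ( unpack 'C*', $data ) {
        $crc ^= $byte;
        for ( 1 .. 8 ) {
            $crc = ( $crc & 1 ) ? ( ( $crc >> 1 ) ^ 0xEDB88320 ) : ( $crc >> 1 );
        }
    }
    return $crc ^ 0xFFFFFFFF;
}

# Value the server would place in the hidden form field.
sub checksum_for {
    my ( $firstname, $lastname ) = @_;
    return sprintf '%08x', crc32( $firstname . $lastname );
}

# True when the returned checksum does not match the returned names.
sub looks_spoofed {
    my ( $firstname, $lastname, $returned_checksum ) = @_;
    return checksum_for( $firstname, $lastname ) ne $returned_checksum ? 1 : 0;
}

my $hidden_value = checksum_for( 'Mickey', 'Mouse' );

my $message = looks_spoofed( 'baseball', 'bat', $hidden_value )
    ? "Nice try, but those are not the names we sent you\n"
    : "Your first name is baseball and your last name is bat\n";
print $message;

# The weakness described above: nothing stops the visitor from
# recomputing the checksum for the spoofed names.
my $forged_value = checksum_for( 'baseball', 'bat' );
my $caught       = looks_spoofed( 'baseball', 'bat', $forged_value );
print "Forged checksum caught: ", ( $caught ? "yes" : "no" ), "\n";
```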

answered May 04 '15 at 02:43

Borodin


The problem mentioned in your last para can be avoided by using a salted hash rather than an unsalted one. The attacker would have to know the secret salt to spoof. – ikegami May 04 '15 at 05:26
I'm not sure why you suggest CRC as the hashing algorithm; it's just as easy using a far better one. – ikegami May 04 '15 at 05:29
@ikegami: The OP says *“he wanted us to write a program to prevent spoofing without using a post method”* – Borodin May 04 '15 at 05:39
Missed that! Deleted the comment. – ikegami May 04 '15 at 05:41
@ikegami: I chose CRC because this is intended only as an explanation of the principle and, as they say *Perl and the Web* is *“new and completely confusing”* I imagined that the OP has a better chance of understanding *CRC* than *hash function*. Were this a real application I would probably go for, as you say, salted MD5 – Borodin May 04 '15 at 05:45
Ack! Just as bad. Use SHA, please! – ikegami May 04 '15 at 06:03
@ikegami It's not "just as bad". MD5 is a proper cryptographic hash. While it has known weaknesses, it is significantly better than a simple CRC, and quite possibly good enough for this operational scenario. I agree that for educational purposes, a stronger hash would probably be a better example, but you are seriously overreacting here IMHO. – tripleee May 04 '15 at 07:33
MD5 collisions can be generated in under a second on a basic desktop. MD5 shouldn't be used anymore. So yeah, it's just as bad. – ikegami May 04 '15 at 14:08

score 0 · Answer 3 · edited May 04 '15 at 05:30

If you want to detect whether a user has changed a parameter in the query string of a URL after a form has performed a GET action, then generate a client side hash before the form is submitted. The hash would be based on the values of the form fields, and then compared to a recalculated hash based on the current parameter values on the response page. If the hashes don't match, the query string has been tampered with.

Here's a client side Crypto library to calculate the hashes https://code.google.com/p/crypto-js/

Note this is only for educational use, and wouldn't provide enough security in the real world, as a person could also discover the hashing key by inspecting the page source and use that to generate their own hashes.

Almost. A client-side library won't help. He can just as easily change the value by changing the field. You'd want to hash the values (salted with a secret) on the server and include the hash in a hidden field in the form. The server would check for tampering by comparing the hash of the submitted values with the submitted hash. – ikegami, May 04 '15 at 05:31

score 0 · Answer 4 · answered May 06 '15 at 15:05

A POST method wouldn't prevent spoofing anyway. POST and GET do almost exactly the same thing - they send plain text encoded variables to a web server.

They're insanely easy to "spoof" - the point isn't the spoofing, it's that you shouldn't trust "user input" like that, ever.

I would suggest in the case of the names, it doesn't matter. So what if I fudge your web page to "pretend" I am called "baseball bat" instead?

If it's important, like for example, ensuring I can only see my test results - then you need to handle the data processing server side. One method of doing this is via session tracking - so rather than including a field in a web form, I instead use a "session token".

You would 'send' me a username and password - ideally using a hash to make it impossible to 'see' as you're sending it, or in your browser history. And then I would check it against my server, to check if that hash is 'valid' by performing the same operation on the server, and comparing the two.

So perlishly:

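#!/usr/bin/perl
use strict;
use warnings;

use Digest::SHA qw ( sha1_base64 );

my ( $firstname, $lastname ) = qw ( Mickey Mouse );
# Number of the current 5-minute (300-second) window since the epoch.
my $timewindow = int ( time / 300 );
# Hash the window number together with both names.
my $token = sha1_base64 ( $timewindow.$firstname.$lastname );

print $token, "\n";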

This produces a token that doesn't last long - it changes every 5 minutes - and that stops matching as soon as either name is altered.

The reason for including the time is to avoid replay attacks, whereby if I look in your browser history, I can find "your" token and reuse it. (That's probably the next question after the "spoofing" one though :))

If you send the parameters along with the token, bear in mind that nothing secret goes into this hash, so it's actually quite easy for a malicious actor to perform the same calculation themselves and send completely faked credentials and tokens.
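One way to close that gap, sketched here as a variation on the code above, is to mix a server-side secret into the calculation with an HMAC and to accept the previous time window as well as the current one. The secret, the helper names (`window_at`, `token_for`, `token_is_valid`) and the fixed timestamps in the demo are assumptions for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Digest::SHA qw( hmac_sha256_hex );

my $secret         = 'server-side-secret';   # assumption: really a random key from config
my $window_seconds = 300;                    # same 5-minute window as above

# Number of the time window that contains the given epoch time.
sub window_at {
    my ($epoch) = @_;
    return int( $epoch / $window_seconds );
}

# Keyed token over the window number and both names.
sub token_for {
    my ( $firstname, $lastname, $window ) = @_;
    return hmac_sha256_hex( join( "\0", $window, $firstname, $lastname ), $secret );
}

# Accept a token made in the current window or the one just before it.
sub token_is_valid {
    my ( $token, $firstname, $lastname, $now ) = @_;
    $now = time unless defined $now;
    my $current = window_at($now);
    for my $window ( $current, $current - 1 ) {
        return 1 if $token eq token_for( $firstname, $lastname, $window );
    }
    return 0;
}

# Demo with fixed timestamps so the outcome does not depend on the clock.
my $issued_at = 1_000;
my $token     = token_for( 'Mickey', 'Mouse', window_at($issued_at) );

for my $later ( 0, 300, 600 ) {
    my $state = token_is_valid( $token, 'Mickey', 'Mouse', $issued_at + $later )
        ? 'valid'
        : 'expired';
    print "after $later seconds: $state\n";
}
```

Without the secret, a visitor can no longer produce a matching token for names of their own choosing, and an old token stops working after at most two windows.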

This is something of a simplistic example though - because really, faked parameters shouldn't matter, because you shouldn't trust them in the first place. If 'Mickey Mouse' is valid, and 'baseball bat' isn't, then your server should detect that when processing the form, and discard the latter, which makes the whole 'form spoofing' thing irrelevant.
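As a sketch of that last point, the server can simply check the submitted names against what it already knows. The `%known_name` table and the `is_known_name` helper are made up for the example; a real application would consult its own records instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: stand-in for the server's own records.
my %known_name = (
    Mickey => { Mouse => 1 },
    Minnie => { Mouse => 1 },
);

# True only when the submitted pair matches a record the server holds.
sub is_known_name {
    my ( $firstname, $lastname ) = @_;
    return 0 unless defined $firstname && defined $lastname;
    return 0 unless exists $known_name{$firstname};
    return exists $known_name{$firstname}{$lastname} ? 1 : 0;
}

for my $submitted ( [ 'Mickey', 'Mouse' ], [ 'baseball', 'bat' ] ) {
    my ( $firstname, $lastname ) = @$submitted;
    my $reply = is_known_name( $firstname, $lastname )
        ? "Hello $firstname $lastname\n"
        : "Sorry, we don't know anyone called $firstname $lastname\n";
    print $reply;
}
```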

score 0 · Answer 5 · answered May 06 '15 at 15:19

The question is rather narrowly phrased, so this answer might not quite address what you're asking. But as a matter of policy, if you don't want your users to tamper with your data you should not give them custody of it. Why are you relying on the query string for the user name if the server already knows it? Rely on the client for authentication and for new information, and rely on your records for any information that should stay beyond the user's control.

POST requests can be crafted almost as easily as GET requests, and cryptographic protection, even when it is secure, is only useful to the extent that the client cannot access the encrypted data; so why transmit it back and forth?
