35

What is the best way to generate a 'fingerprint' of user uniqueness in PHP?

For example:

  1. I could use a user's IP address as the 'fingerprint'; however, there could be multiple other users on the same IP.
  2. I could use the user's IP + user agent as the 'fingerprint'; however, a single user could simply swap from Safari to Firefox and again be seen as unique.

Ideally, the fingerprint would label the 'machine' rather than the browser or IP, but I can't think of how this is achievable.

Open to ideas/suggestions of how you uniquely identify your users, and what advantages/disadvantages your method has.

So Over It
  • 2
    old question, but this: `md5(implode('',$_SERVER));` – Shea Sep 27 '11 at 08:26
  • 2
    @Shea Helpful comment but fingerprints don't change (metaphorically speaking) and there's going to be variations in $_SERVER unrelated to the user http://us1.php.net/reserved.variables.server – PJ Brunet Nov 05 '13 at 18:19
  • 3
    @Shea I wouldn't use the trick you were suggesting as-is because the fingerprint would change on each webpage visited. But maybe it can be improved. – Calimero Oct 04 '17 at 14:54
  • 1
    @Calimero Yeah you're right. It's useful, but you need to filter out certain keys. – Shea Oct 05 '17 at 01:58
  • This site covers pretty much every piece of information you can use to distinguish individuals through a browser. https://panopticlick.eff.org/ JavaScript, Java and Flash help a lot. – JAL Nov 03 '10 at 08:39
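The filtering idea from the comments above could be sketched roughly as follows. This is only an illustration, not what any commenter actually wrote: the function name `requestFingerprint` and the choice of headers are assumptions. It hashes a short whitelist of request headers that tend to stay constant between page loads, rather than all of `$_SERVER`, so values such as `REQUEST_URI` or `REQUEST_TIME` no longer change the result. It still only identifies a browser configuration, and many users will share the same value.

```php
<?php
// Hypothetical helper: hash only headers that usually stay the same
// between page loads, instead of every entry in $_SERVER.
function requestFingerprint(array $server, array $keys = [
    'HTTP_USER_AGENT',
    'HTTP_ACCEPT_LANGUAGE',
    'HTTP_ACCEPT_ENCODING',
]): string {
    $parts = [];
    foreach ($keys as $key) {
        // Missing or non-string values become empty strings so the
        // hashed layout is the same for every request.
        $value = $server[$key] ?? '';
        $parts[] = $key . '=' . (is_string($value) ? $value : '');
    }
    return hash('sha256', implode("\n", $parts));
}

// Usage inside a web request:
// $fingerprint = requestFingerprint($_SERVER);
```

The whitelist is the part to tune: every key added makes the value more distinctive but also more likely to change for the same user.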

4 Answers

10

Easiest and best way: use PHP's session management. Every client is given an ID, stored in a cookie (if enabled) or passed as a GET variable on every link and form (alternatively, you could set a cookie of your own). But this only "fingerprints" the browser: if the user changes browsers, deletes their cookies or whatever, you can't identify them anymore.
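A minimal sketch of the "set a cookie on your own" variant, assuming a made-up cookie name `visitor_id` and a hypothetical helper `resolveVisitorId`. With PHP's built-in sessions the equivalent would be calling `session_start()` and reading `session_id()`. The helper is kept free of side effects so it can be tried outside a web request; the actual `setcookie()` call is shown in the comment.

```php
<?php
// Hypothetical helper: returns [id, isNew]. $cookies is normally $_COOKIE.
function resolveVisitorId(array $cookies, string $name = 'visitor_id'): array {
    $id = $cookies[$name] ?? null;
    // Accept only well-formed IDs (32 lowercase hex characters), since
    // anything sent by the client can be tampered with.
    if (is_string($id) && preg_match('/\A[0-9a-f]{32}\z/', $id) === 1) {
        return [$id, false];
    }
    // No usable cookie: generate a fresh random ID.
    return [bin2hex(random_bytes(16)), true];
}

// Usage in a web request, before any output is sent (PHP 7.3+ syntax):
// [$id, $isNew] = resolveVisitorId($_COOKIE);
// if ($isNew) {
//     setcookie('visitor_id', $id, [
//         'expires'  => time() + 365 * 24 * 3600,
//         'path'     => '/',
//         'httponly' => true,
//         'samesite' => 'Lax',
//     ]);
// }
```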

Identifying every client by IP address is usually a bad idea and won't work. Clients behind the same router share one IP address, while clients connected through a proxy pool could have a different IP address on every page load.

If you need a solution that can't easily be manipulated by the client, try a combination of the following, using all that are supported by the client's browser, and compare them on each page load:

  • "normal" HTTP Cookies
  • Local Shared Objects (Flash Cookies)
  • Storing cookies in RGB values of auto-generated, force-cached PNGs using HTML5 Canvas tag to read pixels (cookies) back out
  • Storing cookies in and reading out Web History
  • Storing cookies in HTTP ETags
  • Internet Explorer userData storage
  • HTML5 Session Storage
  • HTML5 Local Storage
  • HTML5 Global Storage
  • HTML5 Database Storage via SQLite

There's a solution called evercookie that implements all of this.
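The "compare them on each page load" step could look something like the sketch below on the server side. It is not evercookie's actual code; the function names and storage labels are assumptions for illustration. It assumes client-side script has already read the ID back from each storage mechanism and sent the values along with the request. The most frequently seen value is treated as the visitor's ID, and any storage that lost or disagrees with it is listed for rewriting.

```php
<?php
// Hypothetical helper: $candidates maps a storage name to the ID read
// from it (null or '' when that storage had nothing).
function reconcileVisitorId(array $candidates): ?string {
    $counts = [];
    foreach ($candidates as $value) {
        if (is_string($value) && $value !== '') {
            $counts[$value] = ($counts[$value] ?? 0) + 1;
        }
    }
    if ($counts === []) {
        return null; // Nothing survived: treat as a new visitor.
    }
    arsort($counts); // Highest count first, keys preserved.
    // Cast back to string because PHP turns numeric array keys into ints.
    return (string) array_key_first($counts);
}

// Names of the storages whose value differs from the chosen ID and
// therefore needs to be written again.
function storesToRestore(array $candidates, string $id): array {
    return array_keys(array_filter($candidates, fn($v) => $v !== $id));
}
```

For example, if the HTTP cookie was deleted but Local Storage and the ETag still agree, the agreed value wins and the cookie is scheduled to be set again.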

Christos Lytras
oezi
  • 2
    You linked to the German translation of the PHP manual... Here is the same page in English: http://www.php.net/manual/en/book.session.php -- Also, using Evercookie, or any other way of storing data on the client with the explicit purpose of making it hard to remove is likely to be illegal in most countries. – Adrian Schmidt Jan 24 '11 at 14:20
5

There's something else to take into account: the public IP address of a user can also change on every page load.
Many organizations rotate public IPs on their routers to balance traffic.

jmserra
3

100% reliability is not achievable, but combining some common methods can give you meaningful results.

  • Users generally don't switch browsers. Over-complicating your algorithm just to reach engineering perfection is not worth the effort.
  • You certainly belong to the top 100 websites if you can expect multiple users from the same IP. Don't take it personally, but you're just not that popular.

Take the simplest possible route that could work and adjust over time if it seems necessary.

pestaa
  • 19
    I totally disagree with your second point. Depending on which kind of site he has, you'll get multiple users with the same IP very fast. Let's say his site is a browser game: all the schoolkids tell each other, friends play together, and in the informatics class, where they have internet, they're all playing this game from the school's network (which means: same IP, but the developer has to find out if it's the same PC to know this isn't a cheating multi-accounter) – oezi Nov 03 '10 at 08:49
  • @oezi Such a browser game will get along with session IDs just fine. – pestaa Nov 03 '10 at 08:51
  • 3
    Indeed, I just wanted to give an example of why you can't rely on the IP address even on small sites. – oezi Nov 03 '10 at 08:55
  • 1
    Yes, I wish I was popular with my sites. That being said, the little web app I have runs on a tiny VPS (80Mb RAM - yep!) - so it doesn't take too much to overload it. I need to ensure fair usage for each user. – So Over It Nov 04 '10 at 03:05
  • @So Over It: Having so little resources, I'd concentrate on the core of my app, not the nuts and bolts that can be added later. – pestaa Nov 04 '10 at 08:08
  • @oezi Yep, you're right: any "PC houses" with internal networking, such as office buildings, schools, hospitals, malls (free Wi-Fi for customers), ... – jave.web Jul 02 '18 at 21:59
0

I have three different computers and various handheld devices, and many of them have different browsers installed. I use all of these interchangeably at home and take them with me to other places, so, basically, I appear on various IP addresses. What I'm trying to point out is that fingerprinting a browser, or a machine for that matter, is never going to be foolproof if your goal is to block a person.

I recommend you take a different approach. Judge based on the inconclusive information you have available that suggests the identity of your banned user (same IP, same user agent if it's an uncommon one, or some of the JavaScript browser fingerprinting methods such as available fonts, available plugins, non-standard window size, etc.) and require of those suspect visitors some higher form of identity verification, such as OAuth with Facebook, Google+, or Twitter. Then you can check whether that social media account is genuine or was created just to circumvent the ban. There are also phone verification APIs in case your user base isn't social-media savvy, depending on how valuable it is to you that users don't circumvent banning.
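One way to picture this triage is as a simple score over the weak signals. The sketch below is only an assumption about how it might be wired up: the function names, array keys, weights, and threshold are all invented for illustration, and the fonts, plugins, and window size values are assumed to have been collected by JavaScript and sent to the server as strings.

```php
<?php
// Hypothetical scoring: compare a visitor's weak signals with those
// recorded for a banned user. Weights are arbitrary placeholders.
function suspicionScore(array $visitor, array $banned, bool $uncommonAgent = false): int {
    // A signal only counts when both sides have it and the values match.
    $same = fn(string $k): bool =>
        isset($visitor[$k], $banned[$k]) && $visitor[$k] === $banned[$k];

    $score = 0;
    if ($same('ip')) {
        $score += 1;
    }
    // A matching user agent is only meaningful when it is an uncommon one.
    if ($same('user_agent') && $uncommonAgent) {
        $score += 2;
    }
    foreach (['fonts', 'plugins', 'window_size'] as $signal) {
        if ($same($signal)) {
            $score += 1;
        }
    }
    return $score;
}

// Visitors at or above the (assumed) threshold are asked for stronger
// verification, such as an OAuth login or a phone check.
function needsVerification(int $score, int $threshold = 2): bool {
    return $score >= $threshold;
}
```

The score never blocks anyone by itself; it only decides who gets asked for the stronger form of identification.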