I have a table temp_views which stores temporary views (for an hour) based on the ID of a post, the IP of the user and their user agent string.

```sql
SELECT COUNT(*) AS viewed
FROM temp_views
WHERE showcase_id = :showcase_id AND ip = :ip AND user_agent = :user_agent
```

Both the IP and the user agent are stored in the table as plain `TEXT`. (Should I be doing that another way?)

If `viewed` is 0, a view is added; otherwise nothing happens.
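
The check-then-insert flow described above can also be collapsed into a single statement, which avoids two simultaneous Ajax requests both seeing a count of 0. Below is a minimal sketch in Python, with SQLite standing in for the real database. The `UNIQUE` constraint and the `record_view` helper are assumptions for illustration, not part of the original schema, and the one-hour expiry is left out. In MySQL the rough equivalent of `INSERT OR IGNORE` is `INSERT IGNORE`.

```python
import sqlite3

# Column names follow the question; the UNIQUE key is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE temp_views (
        showcase_id INTEGER NOT NULL,
        ip          TEXT    NOT NULL,
        user_agent  TEXT    NOT NULL,
        UNIQUE (showcase_id, ip, user_agent)
    )
    """
)


def record_view(conn, showcase_id, ip, user_agent):
    """Insert a view row if it is absent; return True when a new view was counted."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO temp_views (showcase_id, ip, user_agent) "
        "VALUES (:showcase_id, :ip, :user_agent)",
        {"showcase_id": showcase_id, "ip": ip, "user_agent": user_agent},
    )
    conn.commit()
    # rowcount is 0 when the unique key already existed and the insert was ignored
    return cur.rowcount == 1
```

Calling `record_view` twice with the same arguments counts the view only once, and a unique key also gives the lookup an index to use instead of a full table scan.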

What I'm wondering is, can this be "gamed" at all? If the IP matches but the user agent doesn't, then it will add a view. Can the user agent be spoofed easily or something? (A view is added asynchronously with Ajax as the user scrolls down the page).

Additionally, is this slow/inefficient or is it okay?

frosty
  • Opera and/or Safari (I forget which) can send a different user-agent string, and there are various Firefox add-ons that do the same, letting you choose which user agent string you send to the servers you interact with. What's harder to determine is what percentage of your user base is likely to use these features, and that decides whether it's worth the time to care about it. – Martin Feb 27 '16 at 00:00
  • You should generally use `VARCHAR` rather than `TEXT` unless the size is very large. – Barmar Feb 27 '16 at 00:01
  • @Barmar what length do I specify for the `VARCHAR`? – frosty Feb 27 '16 at 00:04
  • Whatever maximum size you want to allow. – Barmar Feb 27 '16 at 00:06
  • If I set `VARCHAR` to 10 and the IP is > 10 characters, then what? I just use `TEXT` because 60k+ characters or whatever it has is probably safe. – frosty Feb 27 '16 at 00:08
  • The maximum length of an IPv4 address in text form is 15 characters; for an IPv6 address it is 39 (45 if you allow the IPv4-mapped notation). – Barmar Feb 27 '16 at 00:13
  • @Barmar, okay thanks. – frosty Feb 27 '16 at 00:21
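
To make the column-sizing discussion above concrete, here is a small sketch using Python's standard `ipaddress` module. It checks the longest plain textual forms of each address family and shows one way to normalise an address before storing it. The `normalize_ip` helper is hypothetical, and storing the packed bytes (for example in a 16-byte binary column) is only one option, not something the thread prescribes.

```python
import ipaddress


def normalize_ip(raw):
    """Return (canonical_text, packed_bytes) for an IPv4 or IPv6 address string."""
    addr = ipaddress.ip_address(raw.strip())
    # compressed gives the shortest canonical text form;
    # packed is 4 bytes for IPv4 and 16 bytes for IPv6
    return addr.compressed, addr.packed


# Longest plain textual forms, useful when sizing a VARCHAR column
longest_v4 = "255.255.255.255"                          # 15 characters
longest_v6 = "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"  # 39 characters
```

Normalising first means that two spellings of the same IPv6 address compare as equal in the lookup.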

3 Answers

There is no sure way to check for unique views.

IP addresses: users can change their IP address with a VPN, a proxy server, etc.

User agents: users can set their own user agent, usually in the browser's developer options. Even on iOS, Safari has a "Request Desktop Site" option.

Cookies: users can delete them, in which case every page visit counts as one view.

So there is no foolproof way to detect unique visitors.

One approach is to store a cookie in the user's browser; if the cookie is found, the IP is not recorded. Users can delete the cookie, but most won't, so this is more accurate than using the IP alone.

Panda
  • I don't mind them cheating that way; I only mind if they can inflate the number of views so absurdly that it affects the hotness score of the post (assuming views are taken into account) – frosty Feb 27 '16 at 00:15
  • @frosty Usually people won't do that, so using the IP is still the best way in PHP. The view count will be a reasonable gauge of the actual number of views. Another way is to use external tools such as Google Analytics? – Panda Feb 27 '16 at 00:17
  • @frosty To make it more accurate, you can also store a cookie first, so that if users use a VPN or change their IP, you'll detect the cookie and not track their IP addresses – Panda Feb 27 '16 at 00:19
  • I'm pretty sure Google Analytics mainly uses a cookie. – Barmar Feb 27 '16 at 00:24
  • Wait, if there's no IP stored they could run a script to add infinite views by continuously deleting the cookie couldn't they? I think IP-only is the best bet here... – frosty Feb 27 '16 at 00:25
  • @frosty I mean for a normal user, you can check the Cookie first, if it's not present then get the user's IP, else don't do anything. This method will prevent duplicate view counts if user is using more than 1 IP, such as using WiFi and 4G. Not many people will try to get around the tracking by using VPN just to increase the visitor counts ;) – Panda Feb 27 '16 at 00:27

You should not rely on the user agent. It is sent by the client and can be set to anything by the user in the browser, in a cURL client, etc. A user who wants to inflate this counter can send an unlimited number of requests with random user agents, even from the same IP address, and fill your database with fake data.

For example it can be done with a Firefox add-on.


I suggest using only the IP address to distinguish visitors. Changing the IP address (which takes something like a VPN) is much harder than changing the user agent. The numbers will be close to the number of actual visitors.
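
To illustrate the difference, here is a small self-contained sketch in Python. It simulates one client sending 1000 requests from a single IP with a different made-up user agent each time, then counts "unique" views under both keying schemes. The `count_views` helper, the `Bot/N` strings, and the example IP are all hypothetical.

```python
def count_views(requests, key):
    """Count distinct visitors among (ip, user_agent) pairs under a given key function."""
    return len({key(ip, ua) for ip, ua in requests})


# One client, one IP, 1000 requests, each with a different fabricated user agent
spoofed = [("198.51.100.4", f"Bot/{i}") for i in range(1000)]

by_ip_and_ua = count_views(spoofed, key=lambda ip, ua: (ip, ua))
by_ip_only = count_views(spoofed, key=lambda ip, ua: ip)

print(by_ip_and_ua)  # 1000: every spoofed user agent is counted as a new view
print(by_ip_only)    # 1: the same traffic collapses to a single view
```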

A.L
  • I thought so. The problem is that if there are multiple people on the same IP address, their views won't be counted. How does Stack Overflow do it? – frosty Feb 27 '16 at 00:05
  • @frosty I don't know how Stack Overflow does it, sorry. – A.L Feb 27 '16 at 00:06
  • @frosty The usual way to do this kind of thing is with cookies. – Barmar Feb 27 '16 at 00:07
  • How does the user agent help you distinguish different people on the same IP? If they're both using the same browser, the user agents will be the same. – Barmar Feb 27 '16 at 00:08
  • @Barmar Cookies can be altered too. The user can erase the cookie at every request and each visit will be counted. – A.L Feb 27 '16 at 00:09
  • Right, there's no foolproof method other than requiring users to authenticate. – Barmar Feb 27 '16 at 00:09
  • What's the best way then? Just IP? Whatever Stack Overflow does seems to work, so I could just copy them. – frosty Feb 27 '16 at 00:10
  • You can use only the IP address. It's much harder to use another IP than another user agent. The numbers will be close to the number of actual visitors. – A.L Feb 27 '16 at 00:14

The only way to distinguish different users that can't be spoofed easily is by requiring them to authenticate.

The usual way to track users is with a cookie. Users can delete cookies, so it's not perfect, but they generally don't.

When the user connects, check whether the tracking cookie is set. If it's not, generate a random string, set this as the cookie value, and insert it into the database. If it's set, search for a row in the database with this value.
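
The steps above can be sketched as follows, again in Python with SQLite standing in for the real database. The `visitor_views` table, the `handle_view` helper, and the choice to count views per post are assumptions made for illustration. A cookie value with no matching row for the post is simply treated as a new view here, which the answer itself does not specify.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visitor_views ("
    " token TEXT NOT NULL,"
    " showcase_id INTEGER NOT NULL,"
    " UNIQUE (token, showcase_id))"
)


def handle_view(conn, showcase_id, cookie_token=None):
    """Return (token, counted); token is the value to send back as the cookie."""
    if cookie_token is None:
        # No tracking cookie yet: generate a random string to use as its value
        token = secrets.token_urlsafe(32)
    else:
        token = cookie_token
        # Cookie is set: search for a row with this value for this post
        row = conn.execute(
            "SELECT 1 FROM visitor_views WHERE token = ? AND showcase_id = ?",
            (token, showcase_id),
        ).fetchone()
        if row is not None:
            return token, False  # this visitor's view was already counted
    conn.execute(
        "INSERT INTO visitor_views (token, showcase_id) VALUES (?, ?)",
        (token, showcase_id),
    )
    conn.commit()
    return token, True
```

A first call without a cookie returns a fresh token and counts a view; repeating the call with that token for the same post counts nothing. As noted above, a user who deletes the cookie is counted again.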

Barmar