Let's say that I have 3 different headlines for an article:

  • "Man Bites Dog"
  • "This Man Unhinged His Jaw as He Approached A Dog, What Happens Next Will Shock You!"
  • "Only 90's Kids Will Remember That Time a Man Bit a Dog"

I want to use PHP to randomly display one of these three headlines based on the current user (so they're not getting new headlines each time they refresh), then record the number of clicks for each version of the headline via SQL where I get something similar to:

USER HEADLINE CLICK?
1    1     No
2    3     Yes
3    2     Yes
4    3     No
5    2     Yes
6    1     No

Specifically, I'd like advice about:

- Retrieving some sort of variable that's unique to the user (IP address, maybe?)
- Randomly assigning a number (1-3, in the example) based on that unique user variable.
- Displaying different text based on the assigned number.

I can figure out the SQL stuff once I figure this part out. I appreciate any advice you can provide.

2 Answers

1

You have three problems here:

  1. How to identify a user consistently
  2. How to count user clicks (actions)
  3. How to get the resulting statistics

I assume that showing different subjects on one page is not a problem.


Problem 1

Basically you can use an IP address, but it is not a constant ID for a user. For example, if a user is on a mobile phone and moving around, they can switch between towers or lose the connection and then restore it with a different IP.

There are many ways to identify a user on the web, but there is no way to identify a user with 100% certainty without authorization (an active action done by the user).

For example, you can set a cookie on the user with a generated ID. Generating an ID is easy; you can look here. Once the cookie is set and the user comes back, you will know who it is and can do whatever you need.
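A minimal sketch of the cookie approach, assuming PHP 7 or newer (for random_bytes()). The cookie name visitor_id and the helper resolveVisitorId() are made up for illustration and are not part of any library:

```php
<?php
// Hypothetical cookie name; any unused name works.
const VISITOR_COOKIE = 'visitor_id';

/**
 * Return the visitor ID found in the cookie array, or a fresh random one.
 */
function resolveVisitorId(array $cookies): string
{
    $id = $cookies[VISITOR_COOKIE] ?? '';
    // Accept only the format generated below: 32 lowercase hex characters.
    if (is_string($id) && preg_match('/^[0-9a-f]{32}$/', $id) === 1) {
        return $id;
    }
    return bin2hex(random_bytes(16)); // 16 random bytes -> 32 hex characters
}

$visitorId = resolveVisitorId($_COOKIE);

// Cookies are sent as headers, so this must run before any output.
// Each visit renews the cookie for one year.
if (!headers_sent()) {
    setcookie(VISITOR_COOKIE, $visitorId, time() + 365 * 24 * 60 * 60, '/');
}
```

Validating the incoming value matters because a cookie is user-controlled input: anything that does not look like an ID generated here is replaced instead of trusted.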

Also, on the topic of user uniqueness, you can read this article - browser uniqueness

Also, if you use a cookie, you can easily store the subject ID for your task there. If you do not use a cookie, I recommend using MongoDB for this kind of task (many objects with little data that must be retrieved from the DB very fast and inserted very fast, with no updates in your case).
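Storing the subject ID in a cookie could look roughly like the sketch below. The cookie name subject_id, the count of 3 and the helper resolveSubjectId() are assumptions for illustration; the 1..3 numbering follows the question:

```php
<?php
// Hypothetical cookie name and subject count, chosen for illustration.
const SUBJECT_COOKIE = 'subject_id';
const SUBJECT_COUNT  = 3;

/**
 * Reuse the subject ID already stored for this visitor, or draw a new one.
 * Subject IDs run from 1 to $count.
 */
function resolveSubjectId(array $cookies, int $count): int
{
    $stored = filter_var(
        $cookies[SUBJECT_COOKIE] ?? null,
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 1, 'max_range' => $count]]
    );
    // filter_var() returns false for a missing, non-numeric or out-of-range value.
    return $stored !== false ? $stored : mt_rand(1, $count);
}

$subjectId = resolveSubjectId($_COOKIE, SUBJECT_COUNT);

// Persist the assignment for 30 days so a refresh shows the same headline.
if (!headers_sent()) {
    setcookie(SUBJECT_COOKIE, (string) $subjectId, time() + 30 * 24 * 60 * 60, '/');
}
```

The random draw happens only on the first visit; after that the stored value wins, which is what keeps the headline from changing on refresh.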


Problem 2

You showed a table that has 3 fields: ID, used title, is title clicked.

In this kind of table you will lose all non-unique clicks (when a user clicks on the subject twice, comes back tomorrow, or refreshes the target page multiple times).

I suggest using the following kind of table:

  • ID - some unique id, auto increment field will be good here
  • Date - some period of measurements (daily, hourly or something like that)
  • SubjectID - id of subject that was shown
  • UniqueClicks - count of users that clicked on the subject
  • Clicks - total count of clicks on the subject

In this case you will have data aggregated by period of time, and you can easily show it in an admin panel.

But we still have the problem of collecting this data. The solution depends on the number of users. If there are more than 1000 clicks per minute, I think you need some logging system. For example, you append every click to the file 'clickLog-' . date('Ymd_H') . '.log' in some fixed format, for example:

clientId;SubjectId
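Appending a click in that format from PHP might look like the sketch below. The function name logClick() and the $logDir parameter are hypothetical. It writes no trailing semicolon, so the grep pattern in the bash script further down, which anchors on the end of the line, still matches:

```php
<?php
/**
 * Append one click to the hourly log as "clientId;SubjectId".
 * $logDir is a hypothetical directory that the web server can write to.
 * Returns the path of the file that was written.
 */
function logClick(string $logDir, string $clientId, string $subjectId): string
{
    $file = $logDir . '/clickLog-' . date('Ymd_H') . '.log';

    // Strip the delimiter and line breaks so one click is always one line with two fields.
    $bad = [';', "\r", "\n"];
    $line = str_replace($bad, '', $clientId) . ';' . str_replace($bad, '', $subjectId) . "\n";

    // FILE_APPEND adds to the end of the file;
    // LOCK_EX keeps parallel requests from interleaving their writes.
    file_put_contents($file, $line, FILE_APPEND | LOCK_EX);

    return $file;
}

// Example call (not executed here): logClick('/var/log/myapp', $visitorId, 'subj2');
```

Because the file name contains the hour, a new file starts automatically every hour and the previous one can be aggregated without anyone still writing to it.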

When the hour ends, you can aggregate this data with a shell script or your own code and put it into the DB:

cat clickLog-20160907_12.log | sort -u | awk -F';' '{print $2}' | sort | uniq -c

After this command you will have 2 columns of data. The first is the count of unique clicks and the second is the subject ID.

You can get total clicks from the same command by just removing the sort -u stage.

If you have several subject IDs, you can also do it with a for loop. For example, a bash script for unique and total clicks could be the following:

 for i in subj1 subj2 subj3; do

    # unique clicks: distinct "clientId;SubjectId" lines for this subject
    uniqClicks=$(grep ";$i\$" clickLog-20160907_12.log |
        sort -u |
        wc -l)
    # total clicks: every line for this subject
    clicks=$(grep -c ";$i\$" clickLog-20160907_12.log)

    # save data here
 done

After these manipulations you will have aggregated data ready for calculations, plus the source data for future processing (if needed). Your DB will also stay small and fast, while all source data is stored in files.
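If you would rather do the aggregation in your own code than in a shell script, a rough PHP equivalent (assuming PHP 7.4 or newer) could look like this. The function name aggregateClicks() and the sample data are invented for illustration:

```php
<?php
/**
 * Aggregate "clientId;SubjectId" lines into per-subject totals.
 * Returns [subjectId => ['uniqueClicks' => int, 'clicks' => int]].
 */
function aggregateClicks(iterable $lines): array
{
    $stats = [];
    $seen  = []; // "clientId;SubjectId" pairs already counted as unique

    foreach ($lines as $line) {
        $parts = explode(';', trim($line));
        if (count($parts) < 2 || $parts[0] === '' || $parts[1] === '') {
            continue; // skip blank or malformed lines
        }
        [$clientId, $subjectId] = $parts;

        $stats[$subjectId] ??= ['uniqueClicks' => 0, 'clicks' => 0];
        $stats[$subjectId]['clicks']++;

        $pair = $clientId . ';' . $subjectId;
        if (!isset($seen[$pair])) {
            $seen[$pair] = true;
            $stats[$subjectId]['uniqueClicks']++;
        }
    }

    return $stats;
}

// Hypothetical sample data standing in for one hourly log file.
$sample = ['u1;subj1', 'u1;subj1', 'u2;subj1', 'u3;subj2'];
print_r(aggregateClicks($sample));
```

For a real log you could pass new SplFileObject('clickLog-20160907_12.log') instead of the array, since it iterates over the file line by line. The $seen array grows with the number of distinct client/subject pairs, so for very large files the sort -u pipeline may be the lighter option.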


Problem 3

If you implement the solution from the Problem 2 section, all queries for getting statistics will be so simple that your database will run them very fast.

For example, you can run this query in PostgreSQL:

SELECT 
    SubjectId, 
    sum(uniqueClicks) AS uniqueClicks, 
    sum(clicks) AS clicks
FROM 
    statistic_table
WHERE 
    -- half-open range: BETWEEN would also include a row stamped 2016-09-08 00:00:00
    Date >= '2016-09-01 00:00:00' AND Date < '2016-09-08 00:00:00'
GROUP BY
    SubjectId
ORDER BY 
    sum(uniqueClicks) DESC

In this case, if you have 3 subject IDs and hourly aggregation, you will get 504 new rows per week (3 subjects * 24 hours * 7 days), which is a really small amount of data for a database.


Alternatives

You can also use Google Analytics for all the calculations. But in this case you need some other steps, most of them configuration steps required to enable the Google Analytics monitoring scripts on your site. If you already have it, you can easily configure goals and just pass additional data with the subject ID using the GA script API.

Community
Spell
0

You can use the IP of the user or his MAC; if the user is registered on the site you can use the user ID. For the second part you can use PHP's mt_rand() function:

mt_rand(min,max) -> if you want a number between 1 and 3, use mt_rand(1,3);

Then use an array to store the three different headlines and use the randomly generated number to access the array.

Better, generate a number between 0 and 2, because arrays start at 0.
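Put together, that could look like the sketch below. Note that mt_rand() alone gives a new number on every request, so the sketch also shows one possible way to derive a stable index from a per-user value. The helper stableIndex() and the md5-based mapping are an illustration, not something built into PHP, and the fallback IP is a documentation-only address used when the script runs outside a web server:

```php
<?php
$headlines = [
    'Man Bites Dog',
    'This Man Unhinged His Jaw as He Approached A Dog, What Happens Next Will Shock You!',
    "Only 90's Kids Will Remember That Time a Man Bit a Dog",
];

// Purely random: a new draw on every request, so a refresh can change the headline.
$randomIndex = mt_rand(0, count($headlines) - 1);

/**
 * Derive a stable index from a per-user value (hypothetical helper).
 * The same $userKey always maps to the same index in 0..$count-1.
 */
function stableIndex(string $userKey, int $count): int
{
    // The first 7 hex digits of md5() fit in a 32-bit int, so the result is never negative.
    return hexdec(substr(md5($userKey), 0, 7)) % $count;
}

// REMOTE_ADDR is not set on the command line, hence the placeholder fallback.
$userKey  = $_SERVER['REMOTE_ADDR'] ?? '203.0.113.7';
$headline = $headlines[stableIndex($userKey, count($headlines))];

// Escape before printing into HTML.
echo htmlspecialchars($headline, ENT_QUOTES, 'UTF-8'), "\n";
```

With a hash, the assignment is repeatable without storing anything, but it is only as stable as the key: if the key is an IP address and the IP changes, the headline may change too.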

  • Would you mind posting a code snippet showing how OP could capture the MAC, and explaining a bit what you mean by "the user is registered on the web"? – Gar Sep 06 '16 at 18:17
  • There is a post about the MAC address: http://stackoverflow.com/questions/1420381/how-can-i-get-the-mac-and-the-ip-address-of-a-connected-client-in-php . About the user being registered on the web: if you have a login screen, the user's data will be in the DB and every user has a unique ID. – Grabthesky Sep 07 '16 at 08:20
  • That's why I asked for clarification, as the client MAC address is not something you can rely on (per the post you provided: "The client MAC address will not be available to you except in one special circumstance: if the client is on the same ethernet segment as the server."), and the IP might be shared (nowadays I hear about countries sharing the same IPs). – Gar Sep 07 '16 at 11:05