1

I want to track the visitors for each of my users, for each day.

 date | user_id | US | CA |...
 __________________________
      |         |    |    |

My query is

UPDATE mytable SET :country = :country + 1 WHERE user_id = :user_id AND date = :date

Is there a better way to store the users visitor stats for each country per day? The query will be performed 1 million times a day, so it should be lighweight.

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
Veeza
  • 63
  • 5
  • 2
    Don't store things in columns like that. Create a separate row for each each country. – Gordon Linoff Sep 26 '15 at 12:52
  • 2
    A properly normalized database would eliminate this problem – Mark Baker Sep 26 '15 at 12:52
  • @MarkBaker In fact the table is more complex, this country stats exist for each user, for each day. So there are 2 more columns "user_id" and "date", just to create a row "US" or "CA" wouldn't work in this case. – Veeza Sep 26 '15 at 13:39
  • Agree. Also you might want to reconsider this "30 most important countries". When it comes to metrics of a user base I suggest making things more...flexible. Are those 30 hard coded or is your statistics gathering adaptive that is if the site experiences a shift in the user base (example: Canada stops being important due to a very limited number of users from that country but Congo suddenly starts showing a great interest in your content) those 30 countries will be rearranged? – rbaleksandar Sep 26 '15 at 13:43
  • This 30 countries will be more than enough to cover the most important ones, even if there are many visitors from Congo, it is not important at all because the traffic is worthless and users will not get paid for it more than a few cents. @rbaleksandar – Veeza Sep 26 '15 at 13:46
  • There are more then 30 big economies around the world, you do know that? Congo was just an example for a shift in usage statistics. Just saying that making this hard coded means that in the long run your site might experience some problems depending on the development of the world. :D – rbaleksandar Sep 26 '15 at 13:48
  • Yes, but what other design would you prefer? This stats will be updated more than 1 million times a day, so the structure should be as lightweight as possible. I could also create columns for all 170 countries. @rbaleksandar – Veeza Sep 26 '15 at 13:51
  • @GordonLinoff please check my edit and my comments. – Veeza Sep 26 '15 at 13:53
  • This is actually a big topic involving a somewhat in-dept knowledge of algorithms and statistics for usage stats tracking. I advice you to read more about that. As I said - you can leave things as they are (only 30 countries to track) but if at some point an "important" country becomes "unimportant" your statistics would become polluted and you will have to manually analyse all that and rewrite your code, tables etc., which might be less worth doing compared to sparing some time on research how to make things adaptable. – rbaleksandar Sep 26 '15 at 13:59
  • @rbaleksandar Yes I know, I have already checked other php tracking projects like piwik. But it is only capable of up to 500k visitors /day, after that the load is just to heavy. I am not sure how a better solution would look like. I have also noticed a script which creates a new row for each visitor, so there is only a "country" column. But that would make 1 million new rows per day (!). So this attempt is the best I have come around with and seen so far. – Veeza Sep 26 '15 at 14:06
  • @MichaelDibbets I don't see how such a comment helps anyone, please describe how you would design such a database or don't comment at all. – Veeza Sep 26 '15 at 14:35
  • 1 million times per day is an average of about 12 times per second. Even assuming a peak load of twice that, 120 operations per second is not very much for a modern database. – Gordon Linoff Sep 26 '15 at 22:23

1 Answers1

1

You want a data structure like this:

create table UserCountries (
    UserCountyId int primary key auto_increment,
    UserId int,
    Country varchar(3)
);

Just insert rows into the table when you encounter them. Then, pull the data as you like when you query.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • No sorry, in fact the table is more complex, this country stats exist for each user, for each day. So there are 2 more columns "user_id" and "date", just to create a row "US" or "CA" wouldn't work in this case. – Veeza Sep 26 '15 at 13:39
  • @Veeza: then add a column for day. This is definitely doable. It's a well-understood problem and easily solved if you'll just listen to reason. – siride Sep 26 '15 at 14:04