For a summer internship, I have been asked to collect specific data about the pages a user visits on the startup's website.

To simplify things, consider the website a dating site, where each user has their own profile page and is tagged under certain categories (hair color, city, etc.).

I would like to know the best way, in the Rails framework, to keep a trace of each visit a user makes to a profile or tag page. Should visits be logged to a file or stored in a database, and where exactly in the code should the tracking functions be called? Maybe a gem already exists for this specific purpose?

The question is about both where the functions should be called in Rails and how the data should be stored, because the ultimate goal is to build a recommendation system.

Michael Durrant
ldavin
  • just a tip: data like this is already stored in the logs of your webserver I think – Ben May 09 '12 at 17:58
  • The webserver logs which profiles get visited but not which tags are relevant for each particular visit. So there is some app logic needed either way. Also the data store for the recommendation system should be chosen wisely. – Matt May 09 '12 at 18:14
  • Maybe you should use Google Analytics? It's pretty good at tracking all this kind of data, and there's an API you can use to pull the data it collects back down into your app/database for whatever processing you end up wanting to do on it. – MrTheWalrus May 09 '12 at 18:34
  • @MrTheWalrus Do you have a link to a page where something comparable is implemented with Google Analytics? Thanks, I have trouble finding info on this subject... – ldavin May 26 '12 at 09:31

1 Answer

There is a wide range of options available to you. I'd recommend one of the following:

  • Instrument detailed logging of the relevant controller actions. Periodically run a rake task that aggregates data from the log files and makes it available to your relevance engine.
  • Use a key/value store such as Redis to increment user/action specific counters during requests. Your relevance engine can query this store for the required metrics. Again, periodic aggregation of metrics is advised.
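As a rough illustration of the aggregation step in the first option, here is a minimal sketch in plain Ruby. The log line format (`VISIT user=... kind=... target=...`) and the name `aggregate_visits` are assumptions made for this example, not something Rails writes by default; your controllers would have to log lines in that shape themselves.

```ruby
# Hypothetical log line format written by the app (an assumption):
#   VISIT user=42 kind=profile target=17
VISIT_LINE = /VISIT user=(\d+) kind=(\w+) target=(\S+)/

# Count visits per [user_id, kind, target] from an enumerable of log lines.
def aggregate_visits(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = VISIT_LINE.match(line)
    next unless match # skip unrelated log lines

    user_id, kind, target = match.captures
    counts[[user_id.to_i, kind, target]] += 1
  end
  counts
end
```

A rake task could call something like `aggregate_visits(File.foreach("log/production.log"))` and hand the resulting hash to the relevance engine; the log path here is only an example.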

Both approaches lend themselves well to before_filter callbacks (called before_action from Rails 4 onward). You can inspect the request params before the controller action executes and collect the statistics transparently.
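To make the counter idea concrete, here is a small sketch of what such a filter might do. `CounterStore` is an in-memory stand-in so the example runs on its own; in a real app a Redis client from the redis gem, which also responds to `incr(key)`, could take its place. The module name `VisitTracking`, the key layout, and the `:profile_id` / `:tag` param names are all assumptions for illustration.

```ruby
# In-memory stand-in for a key/value store (assumption for this sketch).
class CounterStore
  def initialize
    @counts = Hash.new(0)
  end

  # Increment the counter stored under key and return the new value.
  def incr(key)
    @counts[key] += 1
  end

  def get(key)
    @counts[key]
  end
end

# Hypothetical module meant to be mixed into a controller.
module VisitTracking
  # Build a counter key from the request params; nil if nothing to track.
  def visit_key(user_id, params)
    if params[:profile_id]
      "visits:user:#{user_id}:profile:#{params[:profile_id]}"
    elsif params[:tag]
      "visits:user:#{user_id}:tag:#{params[:tag]}"
    end
  end

  # Increment the matching counter, if the request is one we track.
  def track_visit(store, user_id, params)
    key = visit_key(user_id, params)
    store.incr(key) if key
  end
end

# Possible wiring in a Rails controller (not run here; STORE and
# current_user are hypothetical names):
#   class ProfilesController < ApplicationController
#     include VisitTracking
#     before_filter :record_visit, :only => :show
#
#     def record_visit
#       track_visit(STORE, current_user.id, params)
#     end
#   end
```

Keeping the key construction in its own method makes it easy to test without a running app, and the relevance engine only needs to know the key layout to read the counters back.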

I wouldn't recommend using a relational database to store the raw data.

Finbarr