
I am trying to analyse weblog files in R. I am comfortable handling the dates, byte counts, and other numeric fields, but I cannot work out how to handle the string fields.

From the log file (which is in CSV format), I want to identify each individual user (by combining IP address and user agent) and work out that user's total time spent on the web page.

furianpandit
  • It looks like you want someone to do the job for you. What have you tried so far? – dickoa Sep 22 '12 at 12:25
  • possible duplicate of [Logfile analysis in R?](http://stackoverflow.com/questions/5664997/logfile-analysis-in-r) – Paul Hiemstra Sep 22 '12 at 13:46
  • @dickoa: I tried to post a screenshot of the work I have done so far, but new members are not allowed to post images. – furianpandit Sep 24 '12 at 03:46

2 Answers

There are numerous libraries for this kind of analysis, although I could not find one in R. A Google search for "parse apache logfile" turned up a library in Perl, and "python parse apache logfile" turned up the Scratchy library. Both rely on regular expressions to parse the contents of the file.
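To illustrate the regular-expression approach those libraries take, here is a minimal Python sketch. It is not taken from either library. It assumes the log uses Apache's standard "combined" format, and the names `LOG_PATTERN` and `parse_line` as well as the sample line are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Timestamps look like 10/Oct/2000:13:55:36 -0700
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # Apache writes "-" when no bytes were sent
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record


# Hypothetical sample line, for illustration only
sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
record = parse_line(sample)
print(record["ip"], record["agent"], record["bytes"])
```

Named groups keep the string fields (IP, request, agent) addressable by name, which is the part the question struggles with. Lines in a different log format simply return `None` and would need a different pattern.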

From here there are two ways to deal with the Apache logfile:

  • Call Perl or Python from R, either through a direct interface or through a system call (the system call is the simpler of the two).
  • Take the idea from the Perl or Python library and use it to implement R versions of the functions. This will take a lot of time.

You refer to a CSV file, but I think the libraries above work on the original Apache log text file, so I would feed them that file rather than your CSV file.
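Whichever file the records come from, the per-user totals asked for in the question could be sketched roughly as below. This is only an illustration under several assumptions: each record is a dict with hypothetical keys `ip`, `agent`, `time`, and `bytes`; a "user" is approximated by the (IP, agent) pair, which can merge several people behind one proxy; and "total spending" is read as the time between a user's first and last request.

```python
from collections import defaultdict
from datetime import datetime


def summarize_visitors(records):
    """Group records by (ip, agent) and total up requests, bytes, and elapsed seconds."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["ip"], rec["agent"])].append(rec)

    summary = {}
    for key, recs in groups.items():
        times = [r["time"] for r in recs]
        summary[key] = {
            "requests": len(recs),
            "bytes": sum(r["bytes"] for r in recs),
            # Time between first and last request; 0.0 for a single request
            "seconds": (max(times) - min(times)).total_seconds(),
        }
    return summary


# Hypothetical records, for illustration only
records = [
    {"ip": "10.0.0.1", "agent": "Firefox", "time": datetime(2012, 9, 22, 12, 0, 0), "bytes": 500},
    {"ip": "10.0.0.1", "agent": "Firefox", "time": datetime(2012, 9, 22, 12, 5, 30), "bytes": 1500},
    {"ip": "10.0.0.1", "agent": "Chrome", "time": datetime(2012, 9, 22, 12, 1, 0), "bytes": 200},
]
summary = summarize_visitors(records)
for (ip, agent), totals in summary.items():
    print(ip, agent, totals)
```

The first-to-last-request span undercounts the time spent on the final page and ignores gaps between separate visits, so a real analysis would probably split each user's requests into sessions first.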

In addition, this SO post includes an answer by @doug in which he says he has written some functions to visualize Apache logfile data that was parsed with Python. Maybe you could send him a message or an email and see whether he is willing to share the code.

Paul Hiemstra

Logfile analysis in R is an interesting topic that has come up here before; you can find our discussion right here. That discussion might also help you get used to SO etiquette so that you get better feedback (not to take anything away from your answer, Paul).

Matt Bannert