Let's say I have a collection whose documents look like access log entries, something like this:

{
    "_id" : ObjectId("599bd2ee7e50996104f2bc3e"),
    "requesting_user" : "-",
    "method" : "GET",
    "size" : 0,
    "remote_ip" : "49.35.22.166",
    "timestamp" : ISODate("2017-08-19T10:59:04Z"),
    "http_version" : "1.1",
    "response_code" : "304",
    "referrer" : "-",
    "client" : "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.78 Safari/537.36",
    "request" : "/"
}
{
    "_id" : ObjectId("599f08407e50992e2203e9da"),
    "timestamp" : ISODate("2017-08-23T23:53:54Z"),
    "response_code" : "200",
    "referrer" : "https://www.dummysite.com/Pink-Easy-Baby-Blanket-and-Hat",
    "requesting_user" : "173.89.22.74",
    "remote_ip" : "173.89.22.74",
    "client" : "Mozilla/5.0 (iPad; CPU OS 10_3_3 like Mac OS X) AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.0 Mobile/14G60 Safari/602.1",
    "method" : "GET",
    "http_version" : "2.0",
    "size" : 18508,
    "request" : "/baby-blanket-set/"
}
......
......
......

I want to count, using a regex on the client field, how many visitors come from a specific OS. For Windows I can use the regex /windows/gi, for Linux I can use /linux/gi, and so on.

To count visitors from Windows only, I can use MongoDB's aggregation framework like this:

db.access_logs.aggregate([  
   {  
      '$match':{  
         'client':/windows/gi
      }
   },
   {  
      '$group':{  
         '_id':null,
         'windows':{  
            '$sum':1
         }
      }
   }
])

Or, more simply:

db.access_logs.find({'client' : /windows/gi }).count()

But how can I count these occurrences for all OSes at once in MongoDB, so that I get a result something like this?

{
  'windows' : 1234,
  'linux' : 557,
  ....
}
Pavel_K
Sigma
  • Honestly, do yourself a favor and update all your data to extract this information from the string and store it as another property, i.e. `os: 'windows'`, on the document instead. If you update all your data and make sure that all new data is written with the same extraction, it's a lot easier to do further aggregations on than juggling something else with an over-complicated statement. Fix up your data and make your life easy. – Neil Lunn Aug 31 '17 at 14:28
  • @NeilLunn do you happen to know a query for that? Or should I use JavaScript to do that? – Sigma Aug 31 '17 at 14:33
  • [Update MongoDB field using value of another field](https://stackoverflow.com/questions/3974985/update-mongodb-field-using-value-of-another-field) – Neil Lunn Aug 31 '17 at 14:35
  • So in short "use a program/script". Also I'm pretty sure there are a couple of libraries out there with a method to parse that "client" string and turn it into an object of definitive properties (i.e. OS, Browser, Version). So "if it were me" doing this, then I would find such a library and look at the object properties it produces. Then use that to read and update the data I have, and then subsequently use the same library to parse all new log files and write new data, with a similar breakdown of additional properties. But that's just how I would do it. – Neil Lunn Aug 31 '17 at 14:40
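
A minimal sketch of the "use a script" approach suggested in the comments above. It assumes a small hand-written pattern table rather than a real user-agent parsing library, so it will misclassify some clients. The names `OS_PATTERNS`, `detectOs` and `countByOs` are hypothetical and chosen only for illustration; the `client` field comes from the sample documents in the question.

```javascript
// Hypothetical pattern table: labels and regexes are illustrative, not exhaustive.
// Order matters: iPad/iPhone agents contain "like Mac OS X" and Android agents
// contain "Linux", so the more specific patterns are tested first.
// No `g` flag: a global regex makes RegExp.prototype.test() stateful.
const OS_PATTERNS = [
  ['windows', /windows/i],
  ['ios', /iphone|ipad|ipod/i],
  ['android', /android/i],
  ['mac', /mac os x|macintosh/i],
  ['linux', /linux/i],
];

// Return the first matching OS label, or 'other' when nothing matches
// (including when the client field is missing or not a string).
function detectOs(client) {
  if (typeof client !== 'string') return 'other';
  for (const [name, pattern] of OS_PATTERNS) {
    if (pattern.test(client)) return name;
  }
  return 'other';
}

// Tally documents per OS label; accepts any iterable of documents
// that carry a `client` field, such as an array of query results.
function countByOs(docs) {
  const counts = {};
  for (const doc of docs) {
    const os = detectOs(doc.client);
    counts[os] = (counts[os] || 0) + 1;
  }
  return counts;
}
```

In the mongo shell, the same helper could be used to backfill an `os` field, for example with `db.access_logs.find().forEach(function (doc) { db.access_logs.updateOne({ _id: doc._id }, { $set: { os: detectOs(doc.client) } }); })`. After that, `db.access_logs.aggregate([{ $group: { _id: '$os', count: { $sum: 1 } } }])` should return one count per OS, which is the kind of grouping the first comment has in mind.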

0 Answers