I'm currently trying to convince management that we should move some of our data away from MS SQL and into NoSQL (probably MongoDB). Specifically, what I want to move is our WebStats system. Currently we have roughly 150 million rows in a table, and this dataset is always growing (we store a year's worth of stats).
As a test, I've run the following insert 150 million times:
db.test.insert({ SiteId:1, PageUrl:"/home/", Impressions:1, Date: new Date(), IsCrawler:false, LanguageId:2057, ClientIpAddress:"1.2.3.4", DateTime: new Date(), ReferalUrl: "http://www.google.com", UniqueUserGuid:1, BrowserName:"IE", BrowserVersion:11, BrowserAgent:"blah", IsAbcValid:true, hasChecked:true, connectionSpeed:1, Country:"UK", Region:"Midlands", City:"Coventry" })
I then executed this once:
db.test.insert({ SiteId:1, PageUrl:"/home/", Impressions:1, Date: new Date(), IsCrawler:false, LanguageId:2057, ClientIpAddress:"1.2.3.4", DateTime: new Date(), ReferalUrl: "http://www.google.com", UniqueUserGuid:1, BrowserName:"IE", BrowserVersion:11, BrowserAgent:"blah", IsAbcValid:true, hasChecked:true, connectionSpeed:1, Country:"US", Region:"New York", City:"New York" })
Followed by:
db.test.ensureIndex( { "PageUrl": 1, "Date": 1, "ClientIpAddress": 1 } )
After the indexing had finished, I ran the following search:
db.test.find({Country:/S/})
It eventually found the US document that I added, but it took longer than it would in MS SQL. Am I indexing this incorrectly? I'm basically just trying to knock up a demonstration of the possible performance gains, so if anyone could point me to an example that deals with very large data sets, I'll gladly use that instead.
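To make the indexing question concrete, here is a minimal sketch in plain JavaScript of two rules that seem relevant to the search above. It is a simplified model and not MongoDB's actual query planner. The helper names `indexCanNarrow` and `regexIsAnchoredPrefix` and the extra `Country` index are hypothetical and introduced only for illustration. The assumptions are that a compound index can narrow a query only when the query filters on the index's leading key, and that a regular expression can be turned into a narrow index range only when it is a case-sensitive match anchored with `^`.

```javascript
// Simplified model (an assumption, not MongoDB's real planner):
// a compound index can narrow a query only if the query filters
// on the first (leading) key of that index.
function indexCanNarrow(indexKeys, queryFields) {
  return indexKeys.length > 0 && queryFields.includes(indexKeys[0]);
}

// Simplified model: a regex can be turned into a narrow index range
// only when it is a case-sensitive prefix match, i.e. it starts with ^.
function regexIsAnchoredPrefix(regex) {
  return regex.source.startsWith("^") && !regex.flags.includes("i");
}

// The index created in the post.
const existingIndex = ["PageUrl", "Date", "ClientIpAddress"];
// A hypothetical additional index on the field actually being searched.
const countryIndex = ["Country"];

console.log(indexCanNarrow(existingIndex, ["Country"])); // false
console.log(indexCanNarrow(countryIndex, ["Country"]));  // true
console.log(regexIsAnchoredPrefix(/S/));                 // false
console.log(regexIsAnchoredPrefix(/^US/));               // true
```

Under this model, the search on Country with /S/ is not narrowed by the index on PageUrl, Date and ClientIpAddress, so every document has to be examined. One thing to try in the mongo shell would be db.test.ensureIndex({ Country: 1 }) followed by db.test.find({ Country: "US" }).explain(), then comparing the number of documents scanned with and without the index.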
Thanks,
Joe