I'm developing a web application that will need to query a very large population density database. The data is basically latitude, longitude, and population count (you could abstract this as lat, lon, pop).
I'm going to be using MySQL and PHP for this. The data is very granular and so the total number of points is very large — on the order of several billions. (I don't actually know how large it is at this point, to be honest; I don't yet have the complete data, just samples to play with.) The server is just a cheap server farm machine (e.g. Bluehost).
The application will be trying to aggregate population data for circles of arbitrary radius centered at specific latitude and longitude coordinates. So basically I'll be saying, "tell me the total population for a circle of x radius centered at lat, lon." Which will likely involve just writing a very simple summing function.
Given all of the above, and the desire for this to be as fast and efficient as possible, my question is: what's the best database schema? I've looked here and learned some nice basics regarding the storing of this kind of data (floats work fine for the lat/lon data, the BETWEEN statement is faster than < and > operators), but I'm wondering if, given the bulk of the data, there's a better way to do this than having a table of several billion rows and three fields (e.g. lat, lon, pop).
Possible ideas that occurred to me would be separating out all of the major longitude headings into separate tables, but I don't know if that would really speed things up much. (I don't know a lot about MySQL optimization other than indexing.) Another related idea is to store super long strings of hex data for various "chunks" of the data (e.g. specific lat x lon squares.) Another alternative is to essentially use big binary bitmaps and just decode them on the fly (this strikes me as impractical for my relatively cheap server to manage, however).
But I'm not a DBA, and not even really much of a programmer (I'm not a beginner, but I'm not a professional), so I'd love to hear any other suggestions for how to do this, and whether it's a nuts idea to begin with given the current server processing power.