0

All-

I thing i've finally out grown MySQL for one of my solutions. Right now I have 70 million rows that simply store the x,y,z of objects in 3D space. Unfortunately I don't know how else to optimize my database to handle the inserts/queries anymore. I need to query based on distance (get objects within distance).

Does anyone have a suggestions on a good replacement? I don't know if I should be looking at something like hbase or non-relational databases, as I may run into a similar problem. I generally insert about 100 rows per minute, and my query looks like:

// get objects within 500 yards
SELECT DISTINCT `object_positions`.`entry` FROM `object_positions` WHERE  `object_positions`.`type` = 3 AND `object_positions`.`continent` = '$p->continent' AND SQRT(POW((`object_positions`.`x` - $p->x), 2) + POW((`object_positions`.`y` - $p->y), 2) + POW((`object_positions`.`z` - $p->z), 2)) < 500;

Nothing crazy complicated, but I think the math involved is what is causing MySQL to explode and I'm wondering if I should be looking at a cloud based database solution? It could easily have to handle 10-100 queries per second.

Geesu
  • 5,928
  • 11
  • 43
  • 72
  • Given that your `object_positions` table includes a field for `continent`, it sounds as though you are trying to store geographical locations - if so, try looking at the answer to this question: http://stackoverflow.com/questions/1006654/fastest-way-to-find-distance-between-two-lat-long-points –  Feb 10 '13 at 18:34
  • I can't use SPATIAL as I have x, y, z. It's not longitude/latitude only :( – Geesu Feb 10 '13 at 18:38
  • Do you need to store height above/below ground or sea-level? –  Feb 10 '13 at 18:41
  • Yes I have to store it otherwise I don't for sure know where the object is. (Could be in a cave, underwater, or flying, etc...) – Geesu Feb 10 '13 at 18:52
  • Is that going to be relevant to determining the distance from the specified point? If it is, then obviously you can't use SPATIAL, but if not then you could (with an additional field for altitude). –  Feb 10 '13 at 19:26

1 Answers1

3

It's not MySQL that's giving you trouble, it's the need to apply indexing to your problem. You have a problem that no amount of NoSQL or cloud computing is going to solve by magic.

Here's your query simplified just a bit for clarity.

SELECT DISTINCT entry 
           FROM object_positions 
          WHERE type = 3 
            AND continent = '$p->continent'
            AND DIST(x,$p->x, y, $p->y, z,$p-z) < 500

DIST() is shorthand for your Cartesian distance function.

You need to put separate indexes on x, y, and z in your table, then you need to do this:

SELECT DISTINCT entry 
           FROM object_positions 
          WHERE type = 3 
            AND continent = '$p->continent'
            AND x BETWEEN ($p->x - 500) AND ($p->x + 500)
            AND y BETWEEN ($p->y - 500) AND ($p->y + 500)
            AND z BETWEEN ($p->z - 500) AND ($p->z + 500)
            AND DIST(x,$p->x, y, $p->y, z,$p-z) < 500

The three BETWEEN clauses of the WHERE statement will allow indexes to be used to avoid a full table scan of your table for each query. They'll select all your points in a 1000x1000x1000 cube surrounding your candidate point. Then the DIST computation will toss out the ones that are outside the radius you want. You'll get the same batch of points but much more efficiently.

You don't have to actually create a DIST function; the formula you have in your question is fine.

You do have an index on (type, continent), don't you? If not you need it too.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • Should I switch the x y z to ints for faster calculations? Or is leaving them in as a float ok? I actually store the raw x y z as a float, but only query x_int, y_int and z_int as I thought it would be faster than float operations? – Geesu Feb 11 '13 at 19:30
  • It makes no difference to performance whether you use `FLOAT` or `INT` for x,y, and z. They take the same amount of space as each other, and either one can be indexed. Presumably you've figured out that the precision available in the `FLOAT` datatype is sufficient for your application. If you're not sure what I'm talking about read this: http://dev.mysql.com/doc/refman/5.0/en/problems-with-float.html I would refrain from mixing float and int if I were you. – O. Jones Feb 11 '13 at 21:56