1

I recently started working for a fairly small business, which runs a small website. I over heard a co worker mention that either our site or MySQL databases get hit ~87 times a second.

I was also tasked today, with reorganizing some tables in these databases. I have been taught in school that good database design dictates that to represent a many-to-many relationship between two tables I should use a third table as a middle man of sorts. (This third table would contain the id of the two related rows in the two tables.)

Currently we use two separate databases, totalling to a little less than 40 tables, with no table having more than 1k rows. Right now, some PHP scripts use a third table to relate certain rows that has a third column that is used to store a string of comma separated ids if a row in one table relates to more than one row in some other table(s). So if they want to use an id from the third column they would have to get the string and separate it and get the proper id.

When I mentioned that we should switch to using the third table properly like good design dictates they said that it would cause too much overhead for such small tables, because they would have to use several join statements to get the data they wanted.

Finally, my question is would creating stored procedures for these joins mitigate the impact these joins would have on the system?

Thanks a bunch, sorry for the lengthy explanation!

El Yobo
  • 14,823
  • 5
  • 60
  • 78
cskwrd
  • 2,803
  • 8
  • 38
  • 51
  • The comma-separated list of ids sounds impossible to index. De-normalizing databases _can_ give speed improvements, but I think writing slightly more complex joins to break apart the string of ids is well worth the effort. – sarnold Mar 04 '11 at 04:58

3 Answers3

1

By the sound of things you should really try to redesign your database schema.

two separate databases, totalling to a little less than 40 tables, with no table having more than 1k rows

Sounds like it's not properly normalized - or it has been far to aggressively normalized and would benefit from some polymorphism.

comma separated ids

Oh dear - surrogate keys - not intrinsically bad but often a sign of bad design.

a third table to relate certain rows that has a third column that is used to store a string of comma separated ids

So it's a very long way from normalised - this is really bad.

they said that it would cause too much overhead for such small tables

Time to start polishing up your resume I think. Sounds like 'they' know very little about DBMS systems.

But if you must persevere with this - its a lot easier to emulate a badly designed database from a well designed one (hint - use views) than vice versa. Redesign the database offline and compare the performance of tuned queries - it will run at least as fast. Add views to allow the old code to run unmodified and compare the amount of code you need to performa key operations.

symcbean
  • 47,736
  • 6
  • 59
  • 94
0

It will be both quicker and more simple to do it in the database than in PHP; that's what database engines are good at. Make indexes on the keys (InnoDB will do this by default) and the joins will be fast; to be honest, with tables that tiny, the joins will almost always be fast.

Stored procedures don't really come into the picture, for two reasons; mainly, they're not going to make any difference to the impact (not that there's any impact anyway - you will be improving app performance by doing this in the DB rather than at the PHP level).

Secondly, avoid MySQL stored procedures like the plague, they're goddamn awful to write and debug if you've ever worked in the stored procedure languages for any other DB at all.

cskwrd
  • 2,803
  • 8
  • 38
  • 51
El Yobo
  • 14,823
  • 5
  • 60
  • 78
  • Stored procedures are perfectly fine in mysql and provide numerous efficiency benefits over inline sql and paramterised queries. If you discount using them based on El Yobo's reasoning then add yourself to that list of idiot workers :P – Jon Black Mar 04 '11 at 05:10
  • Well, that said, I do sometimes have the misfortune of using them myself... but having worked before with Postgres and SQL Server and their languages, MySQL's versions feel pretty crippled by comparison. The debugging info is awful and the capabilities are limited (last time I checked, it was still necessary to artificially construct and execute an invalid SQL statement in order to have a trigger reject an insert, update or delete). – El Yobo Mar 04 '11 at 05:18
  • And they also seem irrelevant for the problem described. – El Yobo Mar 04 '11 at 05:19
  • @TheGNUGuy - I thought my comment about your co-workers was appropriate; do they read SO or something? ;) – El Yobo Mar 04 '11 at 05:19
  • @El Yobo - I'm not sure if they read SO, but I might want to show this to them and it's not that they don't know what they're doing, I think it's they don't want to make such a big change to a lot of legacy code all at once. And I feel this question is a little pointed and casts a little bit of a shadow on them. – cskwrd Mar 04 '11 at 05:29
  • Fair enough. We're all stuck with terrible legacy code... but they should realise that they need to fix it, not make excuses. Getting rid of the lookup table will be faster, not slower, and far more simple. – El Yobo Mar 04 '11 at 05:31
0

I don't understand how storing a comma separated list of id's in a single column, and having to parse the list of ids in order to get all associated rows, is less complex than a simple table join.

Moving your queries into a stored procedure normally won't provide any sort of benefit. But if you absolutely have to use the comma separated list of values that represent foreign key associations, then a stored procedure may improve performance. Perhaps in your stored procedure you could declare a temporary table (see Create table variable in MySQL for example), and then populate the temporary table, 1 row for every value contained in your comma separated string.

I'm not sure what type of performance gain you would get by doing that though, considering like you mentioned there's not a lot of rows in any of the tables. The whole exercise seems a bit silly to do. Ditching the comma separated list of id's would be the best way to go.

Community
  • 1
  • 1
Shan Plourde
  • 8,528
  • 2
  • 29
  • 42