0

I was reading this question MySQL - why not index every field? and decided to do some tests.

Having two equivalent tables (a and b), each one with 5.000.000 rows and 3 columns (int, int, varchar), I indexed all the columns on table b and just indexed one column (primary key) on table a.

Results were as follow:

Edit: times are the same as the below ones for rows changed/unchanged.

mysql> update a set c = '1000000' where c = '1000000';
Query OK, 0 rows affected (2.13 sec)
Rows matched: 1  Changed: 0  Warnings: 0

mysql> update b set c = '1000000' where c = '1000000';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1  Changed: 0  Warnings: 0

Now, this is totally the opposite of what I was expecting after reading the answers to the question.

Is there any "benchmarkable" drawback to not index every column of a table?


Edit: problem was that every tuple in the table had an unique value and updating a single indexed and unique row was quick enough to get the 0.00 mark. However, when I updated subsets of rows like Flimzy and Searlea specified, updating non indexed columns were quicker than updating indexed one, like it should be.

Results in case you want to see them:

mysql> update a set b = 98;
Query OK, 5000002 rows affected (10.85 sec)
Rows matched: 5000002  Changed: 5000002  Warnings: 0

mysql> update b set b = 98;
Query OK, 4988890 rows affected (1 min 32.69 sec)
Rows matched: 5000001  Changed: 4988890  Warnings: 0
Community
  • 1
  • 1
Ben
  • 16,275
  • 9
  • 45
  • 63

2 Answers2

2

Of course there are drawbacks to indexing every column. With more indexes, writes (INSERT, UPDATE, and DELETE) take longer, because there are more things to write to the database.

Indexes can only speed things up for reads. Meaning, SELECT, DELETE...WHERE and UPDATE...WHERE.

Jonathan Hall
  • 75,165
  • 16
  • 143
  • 189
  • Thing is that table a, which has only one primary key on the first column, updates in 2.13 seconds while table b which has indexes in all three columns takes 0.00 seconds. – Ben Jun 24 '11 at 20:28
  • Sorry, I didn't read closely enough. The answer, then, is that the WHERE was able to benefit from the index. You don't tell us what data is in your table, however if 'c' is unique in every row (or as in y our example, you search for a non-existent value), then yes, it will be faster to do an indexed query in that case. If you try doing a mass update (say: UPDATE a SET c=c+1) on all the columns, the un-indexed version will be faster. – Jonathan Hall Jun 24 '11 at 20:31
  • @searlea See the edit in the question, times are same for changed or unchanged rows (2.xx secs and 0.00 secs). – Ben Jun 24 '11 at 20:32
  • This is why you'll see many examples that disable indexing before doing a big insert/update, then re-enable it again afterwards. Doing one massive index update at the end can be much faster than doing gazillions of them individually. – Jonathan Hall Jun 24 '11 at 20:32
  • @Flimzy I see now your point. Yes, they where unique values which were taking advantage of the index, however in a bulk update without indexes 5.000.000 rows took 10.85 secs while with indexes it took 1 minute 32.69 seconds. Makes sense now, thanks! – Ben Jun 24 '11 at 20:38
  • @Ben MySQL took 2.xx seconds to *find* the row it might want to update - and then decided it didn't need to. Of course it was faster to *find* the row on table b. See my full answer for a better example. – searlea Jun 24 '11 at 20:38
1

I'm assuming both tables contains columns called a, b and c, and they're all numeric (though c is stringified.) Assuming a is your primary key column, this should show a benchmarkable difference as both updates benefit from being able to find the affected rows, while table b should take the hit when it has to update an index:

update a set b = b + 1 where a between 100000 and 200000;
update b set b = b + 1 where a between 100000 and 200000;

Or, in general terms:

update sane set col2 = col2 + 1 where pk between X and Y;

should perform better than:

update fullindexed set col2 = col2 + 1 where pk between X and Y;
searlea
  • 8,173
  • 4
  • 34
  • 37
  • My problem was that every tuple in the table had an unique value and updating a single indexed and unique row was quick enough. However, when I update subsets of rows like you specify, updating non indexed columns are quicker than updating indexed one, like it should be. Thanks! – Ben Jun 24 '11 at 20:47