139

Recently I've learned the wonder of indexes, and performance has improved dramatically. However, with all I've learned, I can't seem to find the answer to this question.

Indexes are great, but why couldn't someone just index all fields to make the table incredibly fast? I'm sure there's a good reason to not do this, but how about three fields in a thirty-field table? 10 in a 30 field? Where should one draw the line, and why?

Vael Victus
  • 3,966
  • 7
  • 34
  • 55
  • 7
    try inserting a value into a table with over 10k entries that is indexed, all entries have to get updated because of inserts/delete and this is a huge time overhead and somewhat of a memory overhead if each value has an index – Jesus Ramos Mar 27 '11 at 01:08
  • 5
    There is one more reason besides space and write performance: using [multiple indexes for a single table access is very inefficient](http://use-the-index-luke.com/sql/where-clause/searching-for-ranges/index-combine-performance). That means, even if you have one index on each column, select performance is not very good if multiple columns are accessed in the WHERE clause. In that case, a multi-column index is best. – Markus Winand Apr 09 '11 at 19:05
  • 1
    if you are having a table with 30 fields you should really look at your table structures. They should be very hard to work with. – webs Jan 05 '20 at 09:04
  • Because indexes slow down writes. – Janac Meena Feb 14 '22 at 20:25

7 Answers7

155

Indexes take up space in memory (RAM); Too many or too large of indexes and the DB is going to have to be swapping them to and from the disk. They also increase insert and delete time (each index must be updated for every piece of data inserted/deleted/updated).

You don't have infinite memory. Making it so all indexes fit in RAM = good.

You don't have infinite time. Indexing only the columns you need indexed minimizes the insert/delete/update performance hit.

Brian Roach
  • 76,169
  • 12
  • 136
  • 161
  • 16
    Nice casual answer to give general understanding, but not much help in actually determining where to draw the line on indexes. How can you know? Just add them to commonly WHERED fields and hope for the best? – Andrew Jul 12 '17 at 15:34
  • 2
    @Andrew a year and a half later, did you find the answer to your question? – Sinjai Jan 11 '19 at 17:10
  • 3
    @Sinjai Adding them to commonly where'd columns is a good rule of thumb probably. But otherwise you could do a lot of reading it turns out if you want to become expert on indices. eg. https://stackoverflow.com/questions/3049283/mysql-indexes-what-are-the-best-practices – Andrew Jan 11 '19 at 18:28
  • 2
    Don't forget disk space. – jpmc26 Mar 02 '19 at 21:28
31

Keep in mind that every index must be updated any time a row is updated, inserted, or deleted. So the more indexes you have, the slower performance you'll have for write operations.

Also, every index takes up further disk space and memory space (when called), so it could potentially slow read operations as well (for large tables). Check this out

AndyMac
  • 820
  • 6
  • 11
  • 6
    The link is for *MS SQL Server*; this question is for **MySQL** – OMG Ponies Mar 26 '11 at 23:38
  • 6
    @OMG most of the points in the link applies to all major RDBMS – RichardTheKiwi Mar 27 '11 at 00:16
  • 5
    @Richard aka cyberkiwi: Indexes are not covered by ANSI - it's a miracle each vendor has used similar terminology. But even then, only SQL Server and MySQL use the terminology "clustered" and "non-clustered" index -- it means more in SQL Server than MySQL at that. There is nothing to guarantee that recommendations for one vendor should be applied to another. – OMG Ponies Mar 27 '11 at 02:38
  • 3
    @omg the first 6 points apply to any dbms. skip the non/clustered ones, then down below are more points regarding general indexing, also on point. If you have specific things you want to point out, call them. Otherwise it just looks like you are negating all answers which from the comments (including your deleted answer), that nobody agrees with your assessment. – RichardTheKiwi Mar 27 '11 at 02:40
11

You have to balance CRUD needs. Writing to tables becomes slow. As for where to draw the line, that depends on how the data is being acessed (sorting filtering, etc.).

Smandoli
  • 6,919
  • 3
  • 49
  • 83
2

Indexing will take up more allocated space both from drive and ram, but also improving the performance a lot. Unfortunately when it reaches memory limit, the system will surrender the drive space and risk the performance. Practically, you shouldn't index any field that you might think doesn't involve in any kind of data traversing algorithm, neither inserting nor searching (WHERE clause). But you should if otherwise. By default you have to index all fields. The fields which you should consider unindexing is if the queries are used only by moderator, unless if they need for speed too

  • In addition, one can considered so called covered or covering index. Such index contains the columns the query returns together with the columns the query uses for WHERE clauses, grouping, or any other part of the query. https://orangematter.solarwinds.com/2019/02/01/covering-indexes-in-mysql-postgresql-and-mongodb/ – OSGI Java Feb 05 '23 at 18:27
1

It is not a good idea to indexes all the columns in a table. While this will make the table very fast to read from, it also becomes much slower to write to. Writing to a table that has every column indexed would involve putting the new record in that table and then putting each column's information in the its own index table.

1

this answer is my personal opinion based I m using my mathematical logic to answer

the second question was about the border where to stop, First let do some mathematical calculation, suppose we have N rows with L fields in a table if we index all the fields we will get a L new index tables where every table will sort in a meaningfull way the data of the index field, in first glance if your table is a W weight it will become W*2 (1 tera will become 2 tera) if you have 100 big table (I already worked in project where the table number was arround 1800 table ) you will waste 100 times this space (100 tera), this is way far from wise.

If we will apply indexes in all tables we will have to think about index updates were one update trigger all indexes update this is a select all unordered equivalent in time

from this I conclude that you have in this scenario that if you will loose this time is preferable to lose it in a select nor an update because if you will select a field that is not indexed you will not trigger another select on all fields that are not indexed

what to index ?

foreign-keys : is a must based on

primary-key : I m not yet sure about it may be if someone read this could help on this case

other fields : the first natural answer is the half of the remaining filds why : if you should index more you r not far from the best answer if you should index less you are not also far because we know that no index is bad and all indexed is also bad.

from this 3 points I can conclude that if we have L fields composed of K keys the limit should be somewhere near ((L-K)/2)+K more or less by L/10

this answer is based on my logic and personal prictices

0

First of all, at least in SAP - ABAP and in background database table, we can create one index table for all required index fields, we will have their addresses only. So other SQL related software-database system can also use one table for all fields to be indexed.

Secondly, what is the writing performance? A company in one day records 50 sales orders for example. And let assume there is a table VBAK sales order header table with 30 fields for example each has 20 CHAR length..

I can write to real table in seconds, but other index table can work in the background, and at the same time a report is tried to be run, for this report while index table is searched, ther can be a logic- for database programming- a index writing process is contiuning and wait it for ending ( 5 sales orders at the same time were being recorded for example and take maybe 5 seconds) ..so , a running report can wait 5 seconds then runs 5 seconds total 10 seconds..

without index, a running report does not wait 5 seconds for writing performance..but runs maybe 40 seconds...

So, what is the meaning of writing performance no one writes thousands of records at the same time. But reading them.

And reading a second table means that : there were all ready sorted fields.I have 3 fields selected and I can find in which sorted sets I need to search these data, then I bring them...what RAM, what memory it is just a copied index table with only one data for each field -address data..What memory?

I think, this is one of the software company secrets hide from customers, not to wake them up , otherwise they will not need another system in the future with an expensive price.