0

This is killing me - everybody say what it is but noone points to a guide or teach the basics.

  1. Is it something that is better done from the start or can you index it as easily if your loading times are getting longer?

  2. Has anyone found any good starting point for someone who's not a pro in databases? (I mean indexing starting point and don't worry, I know the basics of databases) Main rules, good practise etc.

Im not here to ask you to write a huge tutorial but if you're really, really bored - go ahead. :)

Im using Wordpress if that's important to know. Yes, I know that WP uses very basic indexing but if it's something good to start with from the beginning, I can't see a reason why not to.


  1. It's barely related but I also didn't find answer online. I can guess the answer but Im not 100% sure - what's more efficient way to store data with same key: in array or separate rows (separate ids but same keys)? There's usually maximum of 20 items per post & the number of posts could be in thousands in future. Which would be a better solution?

Different rows, ids & values BUT same key

id | key |values|
--------------------
25 | Bob | 3455 |
--------------------
24 | Bob | 1654 |
--------------------
23 | Bob | 8432 |

Same row, id & key BUT value is serialized array

id | key |      values      |
------------------------------
23 | Bob | serialized array |
------------------------------
Solo
  • 6,687
  • 7
  • 35
  • 67
  • Usually indexing is included in the DDL (Data Definition Language); IOW, when the database structure is decided upon. You can, though, alter the table after the fact and create an index later. – B. Clay Shannon-B. Crow Raven Dec 01 '15 at 23:25
  • 1
    Your question is too broad, but this was the guide that helped me get started: [Use the Index, Luke!](http://use-the-index-luke.com/). – jpmc26 Dec 02 '15 at 01:48

2 Answers2

2

If you want a quick rule of thumb, index any columns in a table that you will be using to lookup rows. For example, I may have a table as follows:

id| Name| date     |
--------------------
0 | Bob | 11.12.16 |
--------------------
1 | John| 15.12.16 |
--------------------
2 | Tim | 19.12.16 |

So obviously your ID is your primary index, but lets say you have a page that will SORT the whole table by DATE, well you would add date as an index.

Basically, indexes make it a lot faster for the engine to find specific records or order them by a specific column. They do a lot more, but when I am designing sites for myself or little tools for the office at work, I usually just go by that.

Large corporate tables can have thousands of indexes and even more relations between tables, but usually for us small peasant folk, what I said should be enough.

Matthew Goulart
  • 2,873
  • 4
  • 28
  • 63
  • Check it out, I made a table too in my question. – Solo Dec 01 '15 at 23:46
  • Sorry for double comment... To answer your 3rd question, I personally would create a new row per data field, in some less common cases, I've seen code that will dynamically create a table for each key, so say bob registers, he gets his own table with all of his thousands of entries and so on. Bottom line: You should probably store each data field in a new row, and point it to bob, etc... – Matthew Goulart Dec 02 '15 at 01:14
  • And also, to elaborate on why the serialized data may not be a good idea is if you are looking for only one data field, having it serialized will force you to retrieve and parse the ENTIRE array just to find one field. This will increase overhead A LOT, and lengthen your code. – Matthew Goulart Dec 02 '15 at 01:16
  • Okay. I asked this because I have specific sitation and I know for sure that I only need the whole array because data in it is always used together. My concern is that this is going to be most populated table and Im afraid that too many rows spends too much space + slows down queries. With all other data from single post (around 20) + all these with same key (around 20) would make like 40 rows per post. Having 50 000 posts is **2 million** rows but it would be **1 million** if I stored it as serialized array.. Am I thinking correctly? – Solo Dec 02 '15 at 01:30
  • Sounds like some other re-thinking and optimization should be done on the dB structure, but you can also look into compressing and archiving old parts of the dB, so that you don't ever end up with 2 million rows. – Matthew Goulart Dec 02 '15 at 01:58
  • Also, if you are using the entire array, you can find a smart way to store it. For example you can store it as text like this: val1;val2;val3... Rather than a bulky serialization like this: [[value1:x, etc]], that will save space – Matthew Goulart Dec 02 '15 at 02:01
  • It is true what the others are saying though, this is a very loaded question and putting in some extra time with documentation may really help. Either way, you'll figure it out! – Matthew Goulart Dec 02 '15 at 02:03
  • You're probably right but that's how Wordpress is set up - post meta goes all to one table. I better let it rest for now & use the array (not sure though if to serialize or use `explode()` with string). Im sure that it doesn't become a huge problem in near future & if I really have too many rows, I probably have resorces to deal with it. Thanks for your help! – Solo Dec 02 '15 at 02:04
0

You're asking a really complicated question. But the tl;dr; A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure.

more detailed info is already provided in the thorough answer here: How does database indexing work?

Community
  • 1
  • 1
Paul
  • 101
  • 2