1

Look at this 1st scenario, you got a table that has 2 columns- Parent (P) & Child (C).

P-C
1-3
2-8
3-6
6-4
8-7

When users search for all descendants of "1" then it will show:

P-C
1-3
3-6
6-4

& When users search for all descendants of "2" then it will show:

P-C
2-8
8-7

This is the Mysql query to get the Data

select distinct col1
from (select col1,
             @pv:=(case when find_in_set(col2, @pv) then @pv else concat(@pv, ',', col2) 
                   end) as 'col2'
      from table1 join
          (select @pv:='1') tmp
          on find_in_set(col1, @pv) > 0
     ) t

Ok, you know that Database indexing is to index the column so that the DB can look up DB faster than without indexing.

However, in the 1st scenario mentioned above, "Do you think DB indexing play any important role in Parent Child table?"

Ok, if the users search for all descendants of "2", then the DB first found "2-8", then it has to jump over 2 records before it found the next child "8-7".

This is the simple example, but what if there're thousands of records located far away from each other (or position of data is very fragmented), then "How DB (assuming that the Parent-Child column got indexed) can look up data quickly in the 1st scenario?"

But if we make all the descendants sit next to each other like in this 2nd scenario:

P-C
1-3
3-6
6-4
2-8
8-7

then "Will DB (even we don't index Parent Child column) look up data faster in the 2nd scenario than the 1st one?"

Note: If you reverse the order of the descendants like this:

P-C
6-4
3-6
1-3
2-8
8-7

& if you search for "1" then it will show only "3", it won't show "3-6" & "6-4" since "3-6" & "6-4" are not in the continuous order. It means that the MYSQL, when running the above query, will search the record from the top to down. So it means that Mysql won't search for the next descendants from the beginning, -> Do you think so?

Note: Pls also read this link @ Symbol - a solution for Recursive SELECT query in Mysql?

Community
  • 1
  • 1
Tum
  • 3,614
  • 5
  • 38
  • 63
  • a parent of a child can be a child of other higher parent, so i called "get all children of the highest parent" – Tum May 23 '13 at 08:47
  • the text was modified – Tum May 23 '13 at 08:56
  • As with a true parent / child listing (where there can be multiple children for each parent and there is no limit to the number of levels of children), MySQL has no way of supporting queries on this beyond recursive use of queries, hence indexes are vitally important. If the parent / child relationship is simple with a only a single child per parent and a very limited number of levels then you might be able to get away without an index on the parent / child and rely on an index on the ordering to quickly return records in the right order, but likely parent / child indexes would still be best – Kickstart May 23 '13 at 09:02
  • my question is If i indexes the P & C column will DB look up the data faster if it have to look up P-C like that? – Tum May 23 '13 at 09:10
  • I love to index but not sure index will help query to get Data faster in the 1st scenario? – Tum May 23 '13 at 09:11
  • Don't think it will care. You will have to do one query to get the first record and the id of the child. Then another query to get the record for the child and the id of its child, etc. The order of the non required records for the first query is irrelevant to the 2nd query. – Kickstart May 23 '13 at 09:16
  • So, that is why i asked this question, i want to know how the Mysql DB work? will it search for the rows sitting next to each other or will it search from the beginning? – Tum May 23 '13 at 09:18
  • The 2nd query has nothing to do with the first, and in the 2nd query it has no idea of what record comes after the record found in the first query unless it performs that first query again. Relying on a natural order of the records on the table is very dodgy (as I said in the original thread, the solution you used will fail when the natural order changes, such as an extra child being added). – Kickstart May 23 '13 at 09:25
  • When extra child was added, we will remove all the IDs of the old ones & create the new ID set with the ASC order – Tum May 23 '13 at 09:29
  • But the order data is returned is not specified. MySQL might well give them out in the order you insert them currently, but do not believe that this is specified. While it currently works on an extremely simple set of data (it wouldn't work in your example data in the original post when a parent can have 2 children), the indexes here are probably irrelevant as it is likely searching every record on the table. – Kickstart May 23 '13 at 09:33
  • So, u confirmed that Indexing has no role in the 1st & 2nd scenario? Ie even we index or not to index the DB still has to search again? & Db won't look up the next record immediately? Is that what you mean? – Tum May 23 '13 at 09:37
  • Your query does not search for an indexed field so an index will not help performance. But the query only works on a very simple set of data and then only by luck. It doesn't use the index to search again because it is never searching for rows from the table, merely checking if the next random record retrieved is one that it already knows it needs. – Kickstart May 23 '13 at 09:39
  • Ok, look at the query, if we reverse the order of the children, then the query will not show anything right? So it means, for the above query, it will search from top to bottom? --> & this is very important – Tum May 23 '13 at 09:47
  • No it doesn't SEARCH at all. It grabs every single record and looks at the first one. Then looks at the next record (a record it already has - no index used to get that record) and sees if it is one it is interested in (it hasn't search for this record, it is merely the next one it has pulled back). With no order specified MySQL is bringing it back in whatever order it wants. This MAY be the order they were inserted in but that (as far as I know) is not specified and so could change at any time or be affected by anything else in the query. – Kickstart May 23 '13 at 09:53
  • U said "It grabs every single record and looks at the first one. Then looks at the next record" Let say the data in the reverse order as showed in the question, when it search for "1", it will find "3" right (of course it alrealdy passed 6-4 & 3-6, then it will check 2-8 & 8-7 right? & it won't see any children then it continue to search the next right? So if we make the data sit next to each other then it will be likely that it will return the data quicker than if the data was fragmented? – Tum May 23 '13 at 10:13
  • Replying with a full answer. – Kickstart May 23 '13 at 10:14
  • The data was fragmented if say 1-3 in the record no 1, but 3-6 is in the record no 1000 & 6-4 is in the row 5000. Clearly, the Data have to return the result much more slowly than if the rows were closed to each other. DO u think so? – Tum May 23 '13 at 10:16
  • No, because it will have processed every one of those records. With your query if its wants 2 records out of 5000 it will have to process all 5000 of them irrespective of whether the 2 its wants are the first 2 it processes or the last 2. – Kickstart May 23 '13 at 10:28

1 Answers1

1

With your data as such

P-C
1-3
3-6
6-4
2-8
8-7

MySQL will find 5 records and assuming that it chooses to return them in this order (it could return them in the order of the prices of items 1, 3, 6, 2 and 8 on the menu in the Oracle canteen today):-

The first record is 1, and it will store 3 (ie, the child) in the variable pv. It will then get the next record. This is record 3 and it will see if that is stored in pv and finding it, 6 will be concatenated onto the end of pv. It will then get the next record (6 in this case), check if 6 is stored in pv and as it is concatenate 4 onto the end of pv. It will then get the next record (2 in this case), check if 2 is stored in pv but as it isn't it will ignore it. It will then get the next record (8 in this case), check if 8 is stored in pv but as it isn't it will ignore it.

It will continue to process every single record on the table whether you want them or not. It will not use any indexes to do any of these checks, or to stop processing until it gets to the end of all the records.

MySQL (and relational databases in general) are designed to get sets of data, and work well for comparing one set with another set. The above query is getting a single set of data (potentially hideously large) and going through every returned record in a random order (which you hope is the order you entered them in) checking each one against a variable it is building up.

Kickstart
  • 21,403
  • 2
  • 21
  • 33
  • I Agree with u that it will go through all records, so u confirmed that the order of record has no impact on the speed of the query right? I want to see how other people answers, cos i think the DB should work similar to the file system in PC, why in PC they suggest u to defragment the hard disk. Besides, there's a defragmentation in DB – Tum May 23 '13 at 10:36
  • It does work similarly to the HD. Just that with the query you are doing the equivalent of reading the entire hard disk into a sequential store and processing all of it. A defrag wouldn't help for this at all. However a defrag would help if you accessed a single particular file which was split all over the place. The other similarity is with the HD being read sequentially you might find the end section of the file before you find the start of the file, just like the query. – Kickstart May 23 '13 at 11:03
  • Ok, but there index defragmentation in DB, u can search on intgernet – Tum May 24 '13 at 00:50
  • Quite possibly, but your query is not taking any account of the indexes. – Kickstart May 24 '13 at 08:12