0

I have been reading this answer by @XenphYan

How does database indexing work?

But there are few things I'm unable to understand

Due to the fact that a number of records can only be sorted on one field, we can state that searching on a field that isn’t sorted requires a Linear Search which requires N/2 block accesses (on average), where N is the number of blocks that the table spans. If that field is a non-key field (i.e. doesn’t contain unique entries) then the entire table space must be searched at N block accesses.

Firstly I unable to understand the above part.

Question 1. How come searching on a field that isn't sorted would requires a Linear Search N/2 block accesses?

Whereas with a sorted field, a Binary Search may be used, this has log2 N block accesses. Also since the data is sorted given a non-key field, the rest of the table doesn’t need to be searched for duplicate values, once a higher value is found. Thus the performance increase is substantial.

I fail to understand this part too, it seem to me the above point specific that on a sorted field one can employ Binary Search.

What I feel ... as per my understanding a Binary Search are applied only to index field. (i.e To improve search performance)

POC:

Here how my table looks like

create table sqltemp(
  id INT PRIMARY KEY, 
  name CHAR(50),
  counter INT
);

insert into sqltemp(id,name,counter) values(1,'viren',1),(2,'vikas',2),(3,'vikram',3);  

EXPLAIN ANALYZE  select id from sqltemp where counter=2;

Seq Scan on sqltemp  (cost=0.00..14.25 rows=2 width=4)
   Filter: (increment = 2)

Atleast the EXPLAIN command does not specific that any Binary Search Sign (i.e it does a Seq Scan)

So, My question 2 is

Question 2: Is my understand correct that Binary Search in DB are only applicable for indexed field.

Community
  • 1
  • 1
LivingColors
  • 123
  • 7

1 Answers1

0

Question 1. How come searching on a field that isn't sorted would requires a Linear Search N/2 block accesses?

The important part is on average. If you're searching for one item out of N, 50% of the time, it'll happen to be in the first half of the items that you check. This is just a fancy way of saying that there's no quicker way of finding the target than checking each row.

Question 2: Is my understand correct that Binary Search in DB are only applicable for indexed field.

A sequential scan means that the database engine is looking at each row, which is O(n). If you write a query that takes advantage of an index, the EXPLAIN will instead tell you that it is able to use an index scan, which is O(log2(n)), which is much faster on large tables. To see this in your example database, try:

EXPLAIN ANALYZE  select * from sqltemp where id=2;

The primary key is an indexed field, which means that the database engine is storing it in a binary tree.

Andrew Rueckert
  • 4,858
  • 1
  • 33
  • 44