Is sequential disk read actually sequential?

Question

I'm using PostgreSQL 9.4.

First, I have postgreSQL installed on a system with the only one ssd-drive.

I'm trying to understand what sequential read is and end up with some issue. For instance, if we ask for an SQL-Server to give us some unindexed data, the seq-scan is likely to be happen. But What if two different clients ask for data from two different tables simultaneously? In this case, sql-server creates two different processes for each client and executes the queries concurrently.

But if the queries are being executed concurrently, the head of the drive need to jump from the area the first table is stored to the area the second is.

So, we actually have no sequntial read, jumping between the tables' areas. Where am I worng? Couldn't you explain those things a bit?

@user2864740 Well, yes, but I'm not considering this case... If the server is just run, we have all cahaches empty. — St.Antario, Sep 10 '15 at 08:45
"it might already be in the buffers" --- to be fair, in most cases data is only read from the filesystem cache. — zerkms, Sep 10 '15 at 08:45
@zerkms You mean what planner exposes as `shared hits`? But we can;t assume the whole table is in the chache. That's what I do know. So, heap reads will happen anyway. — St.Antario, Sep 10 '15 at 08:49
even if it's a cold start most applications don't read from the storage device directly, but from cache only. Which means the kernel loads data to the cache first, then your application reads it from there. Think of it as of a read-through cache. — zerkms, Sep 10 '15 at 08:49
Your question is about MS SQL Server, but you tagged it PostgreSQL. Which? PostgreSQL and MS SQL use pretty different methods of doing disk access AFAIK. — Craig Ringer, Sep 10 '15 at 09:21

score 4 · Accepted Answer · answered Sep 10 '15 at 08:43

4

"sequential scan" means a table was read from the beginning to the end, sequentially row by row. It means nothing in terms of how data is read from physical storage.

So the term is about logical reads.

Not sure if the answer needs more explanation.

answered Sep 10 '15 at 08:43

zerkms

249,484
69
436
539

Well, I thought the terms sequential scan and random_block_read are tied heavily with ones applied to a disk. I.e. I presume, but don;t know for sure, SQL-server stores data of each table it has sequentially, block-by-block in the disk. – St.Antario Sep 10 '15 at 08:44
1

@St.Antario "SQL-server stores data of each table it has sequentially, block-by-block in the disk." --- why do you think those blocks are physically organised one after another? – zerkms Sep 10 '15 at 08:45
Because it's likely to make read operation more optimal while doing it sequentially instead of random. But like I already told, I don;t know it for sure. – St.Antario Sep 10 '15 at 08:46
"Because it's likely to make read operation more optimal" --- it would be more optimal indeed. As if every road was straight without traffic lights - you would get to your office faster. It's utopic though. – zerkms Sep 10 '15 at 08:47
@St.Antario: even if SQL Server stores the data in a clustered index (which means the data is _logically_ sorted on disk) that still doesn't mean the data is written _phyiscally_ that way. The data file might not be contiguous on the disk. And if there is a RAID system in the background (which is basically the default for any serious production DB Server) then file is most definitely not "contiguous" because it's spread out over multiple disks. Plus: Postgres dos not have a clustered index, so the data is never stored (logically) in a "sorted" manner. – Sep 10 '15 at 08:51
@St.Antario Neither the SQL Server nor PostgreSQL terms random read/sequential scan refers to what's happening underneath the file system, at the disk block level. – nos Sep 10 '15 at 08:51
@a_horse_with_no_name But reading indexes means some amount of random-read operations to be occured. And by default, the random read operations have 4 times more cost than sequential-ones have. So, what does random-read while scanning indexes mean from PostgreSQL standpoint? – St.Antario Sep 10 '15 at 08:53
@St.Antario "random read" means chunks of the data structure were accessed randomly (as it comes from the name). – zerkms Sep 10 '15 at 08:54
@zerkms Randomly in the sense of chosing any random block from all the index contains? But we actually perform btree-traversing while reading btree indexes. And we should traverse from the root elemnt to the bottom one. Where is the random read may occur here? I'm confused.... – St.Antario Sep 10 '15 at 08:57
1

@St.Antario not random from application logic perspective, but from the access patterns. When you need to retrieve the "middle element" (as in the binary search) - you perform a random access of the Nth element of the structure. – zerkms Sep 10 '15 at 08:59
@zerkms Ah, it seems now I see... It's just called `random access` and has nothing to do with the random disk-I/O as I initially thought. The concept lays "on top of the hardware operation". Can I think about it that way? – St.Antario Sep 10 '15 at 09:33
1

@St.Antario forget about storage at all. – zerkms Sep 10 '15 at 09:34

Is sequential disk read actually sequential?

1 Answers1

Linked