11

I know that my question has an answer here: QFile seek performance. But I am not completely satisfied with the answer. Even after looking at the following implementation of generic_file_llseek() for ext4, I can't seem to understand how can the complexity be measured.

/**
 * generic_file_llseek - generic llseek implementation for regular files
 * @file:       file structure to seek on
 * @offset:     file offset to seek to
 * @origin:     type of seek
 *
 * This is a generic implemenation of ->llseek useable for all normal local
 * filesystems.  It just updates the file offset to the value specified by
 * @offset and @origin under i_mutex.
 */
loff_t generic_file_llseek(struct file *file, loff_t offset, int origin)
{
        loff_t rval;

        mutex_lock(&file->f_dentry->d_inode->i_mutex);
        rval = generic_file_llseek_unlocked(file, offset, origin);
        mutex_unlock(&file->f_dentry->d_inode->i_mutex);

        return rval;
}

/**
 * generic_file_llseek_unlocked - lockless generic llseek implementation
 * @file:       file structure to seek on
 * @offset:     file offset to seek to
 * @origin:     type of seek
 *
 * Updates the file offset to the value specified by @offset and @origin.
 * Locking must be provided by the caller.
 */
loff_t
generic_file_llseek_unlocked(struct file *file, loff_t offset, int origin)
{
        struct inode *inode = file->f_mapping->host;

        switch (origin) {
        case SEEK_END:
                offset += inode->i_size;
                break;
        case SEEK_CUR:
                /*
                 * Here we special-case the lseek(fd, 0, SEEK_CUR)
                 * position-querying operation.  Avoid rewriting the "same"
                 * f_pos value back to the file because a concurrent read(),
                 * write() or lseek() might have altered it
                 */
                if (offset == 0)
                        return file->f_pos;
               break;
        }

        if (offset < 0 || offset > inode->i_sb->s_maxbytes)
                return -EINVAL;

        /* Special lock needed here? */
        if (offset != file->f_pos) {
                file->f_pos = offset;

                file->f_version = 0;
        }

        return offset;
}

Say, for example, I have a 4GB file, and I know the offset for the middle portion in the file, how exactly does a lseek() get me there without traversing the entire file? Does the OS already know where each byte of the file resides?

Community
  • 1
  • 1
Nehal J Wani
  • 16,071
  • 3
  • 64
  • 89

3 Answers3

7

lseek() as implemented in ext4 will just increment the file pointer and do some validation checks. It doesn't depend on the file size, meaning it is O(1).

Also you can see this in the code, there isn't any loop nor suspicious function calls in there.

However, while this is true on ext4, it might be not true for other filesystems, as this behaviour isn't guaranteed by POSIX. But it is likely unless the filesystem is meant for a very special purpose.

hek2mgl
  • 152,036
  • 28
  • 249
  • 266
  • Is this something that just happens to be true or is it something that's guaranteed to be true? – user541686 Feb 09 '14 at 11:36
  • 1
    Check the code you've posted. It's typical `O(1)` code – hek2mgl Feb 09 '14 at 11:37
  • 1
    @Mehrdad POSIX gives no guarantees concerning the complexity of FS operations, AFAIK. – Fred Foo Feb 09 '14 at 11:40
  • 1
    On a second thought, I can't tell if this question is about this particular seek function, or seeking in general... – user541686 Feb 09 '14 at 11:41
  • @Mehrdad Oh, I thought you where the OP. :) I think the question is about ext4, not POSIX. The ext4 code in the question shows a O(1) complexity. isn't it? – hek2mgl Feb 09 '14 at 11:42
  • @hek2mgl: Yeah, it seems like it's not a generic C question, never mind. – user541686 Feb 09 '14 at 11:43
  • it is tagged `ext4` :) – hek2mgl Feb 09 '14 at 11:44
  • 1
    I daresay that O(1) is indeed even **guaranteed** because `lseek` alone does not really do anything (except for modifying one integer). Though of course if the OS didn't readahead in between, a subsequent `read` might have to search through a list of allocated sectors first and might initiate a seek on a mechanical harddrive, which is O(N) in respect to the distance. But then again, `lseek` might as well _decrease_ the rotational delay (which one can't really predict). – Damon Feb 09 '14 at 13:05
  • @Damon yeah I'm with you. I guess even a filesystem for a tape device would not ffwd the tape itself on seek, the tape would get forwarded before the next read/write (or close). Everything other would not perform well. However I have no tape device (anymore) :) – hek2mgl Feb 09 '14 at 13:18
  • @hek2mgl Although I have tagged `ext4,` my question seeks an answer about seek() in general. So, even if the code seems to be `O(1)`, since its just manipulation of pointers, am I correct to say that the next `read()` or `write()` will have to play role in the complexity depending on the filesystem, which will also include latency on a hardware level? – Nehal J Wani Feb 09 '14 at 13:54
  • Yes, it will. You can open a `0` byte file, seek to `10000000` and write a byte. Now the file is `10000001` bytes large an when written the OS has to find proper memory regions for that file on disk. Finding the storage region(s) might get more complicated depending on the file's size, the used block size and the level of fragmentation of that device. Also the final, larger file has to be written to disk. However, I'm only talking about seeking above the size of a file... – hek2mgl Feb 09 '14 at 14:34
  • ... When seeking in the middle of the file, the next read/write will depend on whether the block is already in buffer cache or not, where is it physically located on disk and where is the current position of the head. Btw, I'm talking about hard disks currently. On a SD device or USB stick the sector hasn't to be moved of course. – hek2mgl Feb 09 '14 at 14:40
1

lseek's complexity depends on the representation of file in your system. On most modern systems a file is organized by some clever tree-like data structure resulting into seek being executed in time O(logx(n)), where n is the size of the file and x some system depending number.

Marian
  • 7,402
  • 2
  • 22
  • 34
  • 2
    While this is a clever and certainly true statement, technically I doubt that the actual search for the block takes place in lseek. I would assume the OS would only search the block at the point a read or write is issued (and maybe, readahead-like, asynchronously after the lseek). – Jonas Schäfer Feb 09 '14 at 11:27
  • 4
    `x` is irrelevant: logarithms of different bases only differ in a constant factor, and ordo notation does not consider constant multipliers. –  Feb 09 '14 at 11:29
  • @JonasWielicki Yeah, that's what I thought to when giving my answer. – hek2mgl Feb 09 '14 at 11:32
0

Yes, the OS already knows how to find any particular byte in the file.

No, it's not guaranteed to be O(1). The code you posted is O(1), but other filesystems' code might not be.

Yes, it will be fast enough, unless the OS or filesystem is terrible.

user253751
  • 57,427
  • 7
  • 48
  • 90