2

What is wrong with this code (I guess with regard to compatibility with different kernel versions)?

#define BUFFER_SIZE 2048
#define BUFFER_STEP 128

static char buffer[BUFFER_SIZE] = { 0 }; // this is filled with repeating letters a-z in module init

static void* seqf_ex_start (struct seq_file* m, loff_t* pos)
{
    if (*pos >= BUFFER_SIZE) {
        return NULL;
    }

    return buffer + *pos;
}

static void* seqf_ex_next (struct seq_file* m, void* v, loff_t* pos)
{
    *pos += BUFFER_STEP;

    if (*pos >= BUFFER_SIZE) {
        return NULL;
    }

    return buffer + *pos;
}

static int seqf_ex_show (struct seq_file* m, void* v)
{
    seq_printf(m, "%.*s\n", BUFFER_STEP, (char*)v);

    return 0;
}

When running the kernel module in qemuarm (kernel v5.0.19), I get the correct output:

$ cat /proc/seqf-ex | awk '{ print "line" OFS NR ":" OFS length($0) }'
line 1: 128
line 2: 128
line 3: 128
line 4: 128
line 5: 128
line 6: 128
line 7: 128
line 8: 128
line 9: 128
line 10: 128
line 11: 128
line 12: 128
line 13: 128
line 14: 128
line 15: 128
line 16: 128

But when running on kernel v4.15.0, I get this:

$ cat /proc/seqf-ex | awk '{ print "line" OFS NR ":" OFS length($0) }'
line 1: 128
line 2: 128
line 3: 128
line 4: 128
line 5: 128
line 6: 128
line 7: 128
line 8: 128
line 9: 128
line 10: 128
line 11: 128
line 12: 128
line 13: 128
line 14: 128
line 15: 128
line 16: 128
line 17: 127
line 18: 126
line 19: 125
line 20: 124
line 21: 123
line 22: 122
line 23: 121
line 24: 120
line 25: 119
line 26: 118
line 27: 117
line 28: 116
line 29: 115
line 30: 114
line 31: 113
line 32: 112
line 33: 111
line 34: 110
line 35: 109
line 36: 108
line 37: 107
line 38: 106
line 39: 105
line 40: 104
line 41: 103
line 42: 102
line 43: 101
line 44: 100
line 45: 99
line 46: 98
line 47: 97
line 48: 96
line 49: 95
line 50: 94
line 51: 93
line 52: 92
line 53: 91
line 54: 90
line 55: 89
line 56: 88
line 57: 87
line 58: 86
line 59: 85
line 60: 84
line 61: 83
line 62: 82
line 63: 81
line 64: 80
line 65: 79
line 66: 78
line 67: 77
line 68: 76
line 69: 75
line 70: 74
line 71: 73
line 72: 72
line 73: 71
line 74: 70
line 75: 69
line 76: 68
line 77: 67
line 78: 66
line 79: 65
line 80: 64
line 81: 63
line 82: 62
line 83: 61
line 84: 60
line 85: 59
line 86: 58
line 87: 57
line 88: 56
line 89: 55
line 90: 54
line 91: 53
line 92: 52
line 93: 51
line 94: 50
line 95: 49
line 96: 48
line 97: 47
line 98: 46
line 99: 45
line 100: 44
line 101: 43
line 102: 42
line 103: 41
line 104: 40
line 105: 39
line 106: 38
line 107: 37
line 108: 36
line 109: 35
line 110: 34
line 111: 33
line 112: 32
line 113: 31
line 114: 30
line 115: 29
line 116: 28
line 117: 27
line 118: 26
line 119: 25
line 120: 24
line 121: 23
line 122: 22
line 123: 21
line 124: 20
line 125: 19
line 126: 18
line 127: 17
line 128: 16
line 129: 15
line 130: 14
line 131: 13
line 132: 12
line 133: 11
line 134: 10
line 135: 9
line 136: 8
line 137: 7
line 138: 6
line 139: 5
line 140: 4
line 141: 3
line 142: 2
line 143: 1

It seems that seqf_start function is called repeatedly after the seqf_next returns NULL. Why is this happening?

  • Maybe this one helps https://stackoverflow.com/q/25399112/1741542 – Olaf Dietsche Mar 06 '20 at 19:28
  • I have seen this, but it's not really related to my question. – samuzu.pazael Mar 06 '20 at 20:23
  • 1
    "It seems that `seqf_start` function is called repeatedly after the `seqf_next` returns NULL." - `.start` function is called **at least once** after `.next` returns NULL. In that case you need to return NULL from that `start` call to tell `cat` command that the file is really ends. Add `printk` statements into your `.start` function and see what **actually happens** and what your `.start` function **actually returns**. – Tsyvarev Mar 07 '20 at 00:21
  • 1
    Hint: You code works correctly only when `*pos` value is **divisible by** `BUFFER_STEP`. You expect this is always true (well, I would expect that too), as the only place you update the `*pos` is `seqf_ex_next` function, which increments `*pos` by `BUFFER_STEP`. However, [in Linux kernel 4.15](https://elixir.bootlin.com/linux/v4.15.18/source/fs/seq_file.c) there are places where `*pos` is incremented just by 1. Probably, you hit one of this place. Just add `printk` into your `seqf_ex_start` and `seqf_ex_next` functions for check **actual** value of `*pos`. – Tsyvarev Mar 07 '20 at 01:05
  • @Tsyvarev Hi, thanks! That is sort of behaviour I noticed and I managed to solve it by changing the check in the start function; checking if *pos + BUFFER_STEP is greater than BUFFER_SIZE instead of just *pos seems to fix it. I wondered however if there was something I was missing or doing wrong with regard to the API, because on kernel 5 it works exactly as I would expect it. Thanks – samuzu.pazael Mar 07 '20 at 08:27
  • Changing comparision to `*pos + BUFFER_STEP > BUFFER_SIZE` only hides your real problem: your `seqf_ex_show` function isn't ready to non-divisible values of `*pos`. So, would you use commands other than `cat` or `awk` for read your file, you would again face with incorrect behavior. See my answer for better fix of your code. – Tsyvarev Mar 07 '20 at 15:43
  • 2
    BTW, `cat | awk – Tsyvarev Mar 07 '20 at 15:46

1 Answers1

1

There is some incompatibility between Linux kernel 4.x and 5.x versions about handling *pos argument passed to seq_file callback functions.

Linux kernel 5.x only sets *pos to 0 and leave other modifications to the seq_file callbacks (.start and .next functions in struct seq_operations).

However, in Linux kernel 4.x *pos can be incremented by 1 outside of your callbacks. This is done if the last .show call produces exactly number of bytes needed to fulfill request of read syscall.

Because your code is ready only to *pos values which are divisible by BUFFER_STEP, it works incorrectly when *pos is incremented outside of your functions, so it is divisible by BUFFER_STEP no longer.


I don't know whether behavior of Linux 4.x is correct, but if you want your code to work with such behavior, you need to prepare to all values of *pos, not only to ones produced by your code.

Simple fix of your code would be interpret *pos as a chunk index instead of a byte index:

static void* seqf_ex_start (struct seq_file* m, loff_t* pos)
{
    if (*pos >= (BUFFER_SIZE/BUFFER_STEP)) {
        return NULL;
    }

    return buffer + (*pos * BUFFER_STEP);
}

static void* seqf_ex_next (struct seq_file* m, void* v, loff_t* pos)
{
    *pos += 1;

    if (*pos >= (BUFFER_SIZE/BUFFER_STEP)) {
        return NULL;
    }

    return buffer + (*pos * BUFFER_STEP);
}

static int seqf_ex_show (struct seq_file* m, void* v)
{
    seq_printf(m, "%.*s\n", BUFFER_STEP, (char*)v);

    return 0;
}

Such a way your code will work with all values of *pos. (Those which are too high would be cut off by checks if (*pos >= (BUFFER_SIZE/BUFFER_STEP))). And increment of *pos by 1 performed outside of your code is equivalent to calling your .next callback.

Tsyvarev
  • 60,011
  • 17
  • 110
  • 153