I am writing my first kernel module and noticed some initially strange behavior: one cat results in two read calls. After some research, this seems to be how files are read: the reader keeps calling read() until read() returns zero, which makes sense.

This to me only raises more questions though:

  1. Why is only one response returned from cat? How does cat receive a 0? If my code always returns something from its read() function, how can a program like cat ever read 0? The second (and every later) call would always return data, as there is nothing in the module code to return 0.
  2. How can I program a module to ensure data safety? Assume my module is a producer, and the system I am building needs data integrity, so that each read() call provides the next piece of data. If a reader inherently needs to call read() at least twice to get 0 back as EOF (between individual records), how can I ensure integrity? This might just show my lack of knowledge of the available options, a case of not knowing what I don't know.
  3. It is well known in C that you can never trust a single read() call to return an entire stream of data, which is why we read until EOF. Yet the semantics of how the module's read() function is called suggest that something (the kernel?) should be managing that data transfer and inserting an EOF for the module. To be clear, my thought here is that the module writes the buffer to the kernel, the kernel chunks it however it likes as it passes it back to the user, and the kernel inserts the EOF. This obviously isn't the case. Any notes on this idea?
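For reference, the reading side can be sketched in user space. This is only a rough approximation of what a tool like cat does, not its actual source; `count_until_eof` is a hypothetical helper name, and the 4096-byte chunk size is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read() on fd until it returns 0 (EOF).
 * Stores the number of completed read() calls in *calls and returns the
 * total number of bytes seen, or -1 on a read error.
 */
static long count_until_eof(int fd, int *calls)
{
    char chunk[4096];   /* arbitrary chunk size for this sketch */
    long total = 0;

    assert(calls != NULL);
    *calls = 0;

    for (;;) {
        ssize_t n = read(fd, chunk, sizeof chunk);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        (*calls)++;
        if (n == 0)
            return total;       /* 0 bytes means EOF, so stop here */
        total += n;
    }
}
```

The reader has no other way to learn that the data has ended, so even a source that delivers everything in its first read() gets at least one more call, and that call has to return 0.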

Code:

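static unsigned long numCalls = 0;

static ssize_t custom_read(struct file* file, char __user* user_buffer, size_t count, loff_t* offset){
        char* output;
        size_t outputLen;

        printk(KERN_INFO "calling our very own custom read method.\n");

        /* The counter is bumped on every call, including the call that returns 0. */
        output = kasprintf(GFP_KERNEL, "Hello world! Read No: %lu\n", ++numCalls);

        if(!output){
                return -ENOMEM;
        }

        outputLen = strlen(output);

        /* A non-zero offset means this open file was already read: report EOF. */
        if (*offset > 0){
                kfree(output);
                return 0;
        }

        /* Never copy more than the caller's buffer can hold. */
        if (outputLen > count){
                outputLen = count;
        }

        /* copy_to_user() returns the number of bytes it could NOT copy. */
        if (copy_to_user(user_buffer, output, outputLen)){
                kfree(output);
                return -EFAULT;
        }

        kfree(output);
        *offset = outputLen;
        return outputLen;
}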

When the module is loaded and the file is read, this gives:

$ cat /proc/helloworlddriver 
Hello world! Read No: 1
$ cat /proc/helloworlddriver 
Hello world! Read No: 3

Verification from dmesg that the function is being called twice for every cat:

$ dmesg
[ 6976.958236] calling our very own custom read method.
[ 6977.156778] calling our very own custom read method.
[ 6977.156892] calling our very own custom read method.
[ 6977.369510] calling our very own custom read method.
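
The numbering in the output can be reproduced with a small user-space model of the handler. This is a sketch, not kernel code: `model_read` and `model_cat` are made-up names, `memcpy` stands in for `copy_to_user`, and a plain `long` stands in for `loff_t`. It assumes each cat opens the file afresh, so the offset starts at 0 every time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the module's global call counter. */
static unsigned long num_calls = 0;

/* Model of the read handler: one message per open file, then 0 for EOF. */
static long model_read(char *buf, size_t count, long *offset)
{
    char output[64];
    int len = snprintf(output, sizeof output,
                       "Hello world! Read No: %lu\n", ++num_calls);

    if (*offset > 0)
        return 0;               /* EOF call: the counter was already bumped */
    if ((size_t)len > count)
        len = (int)count;       /* never overrun the caller's buffer */
    memcpy(buf, output, (size_t)len);
    *offset = len;
    return len;
}

/*
 * Model of one cat run: fresh offset, read until 0.
 * Copies what was read into out and returns the number of read calls.
 */
static int model_cat(char *out, size_t cap)
{
    char chunk[128];
    long offset = 0;            /* a new open() starts at offset 0 */
    size_t used = 0;
    int calls = 0;
    long n;

    assert(out != NULL && cap > 0);
    out[0] = '\0';
    do {
        n = model_read(chunk, sizeof chunk, &offset);
        calls++;
        if (n > 0 && used + (size_t)n < cap) {
            memcpy(out + used, chunk, (size_t)n);
            used += (size_t)n;
            out[used] = '\0';
        }
    } while (n > 0);
    return calls;
}
```

Calling `model_cat` twice in a row makes four `model_read` calls in total. The first run sees "Read No: 1" and the second sees "Read No: 3", because the counter is incremented before the offset check, so the calls that return 0 consume numbers 2 and 4.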

Snappawapa
  • `Why is only one response` What do you mean by "response"? – KamilCuk Jan 10 '22 at 21:18
  • @KamilCuk Added detail to show that – Snappawapa Jan 10 '22 at 21:24
  • Och, you are asking why `cat` _when working with your module_ works? Do you understand what `if (*offset > 0){` does? – KamilCuk Jan 10 '22 at 21:29
  • I'm asking deeper questions on how/why it works the way it does, and how to properly operate with it. My module seems to work as intended/ coded, but the behavior seems strange from a newbie like myself. I suppose I do not! – Snappawapa Jan 10 '22 at 21:32
  • On the first call, `*offset` is zero, and it then gets set to the output length: `*offset = outputLen;`. Then, on the second call, `*offset` is greater than zero, so your code does `return 0;`. There's really nothing more to it. – KamilCuk Jan 10 '22 at 21:33
  • Ah, that does make sense, I was unaware of the semantics there. That makes for some interesting implementation, and how one could split up data of a read. If you make an answer, I'll mark it green – Snappawapa Jan 10 '22 at 21:38
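
On splitting the data of a read: the offset can also act as a cursor into a larger buffer, so a reader with a small `count` gets the data in pieces and only sees 0 once everything has been delivered. Below is a user-space sketch of that idea, loosely modelled on the argument order of the kernel's `simple_read_from_buffer()` helper. The name `read_from_buffer` is made up, and `memcpy` and `long` stand in for `copy_to_user` and `loff_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy up to count bytes of from[0..available) into to, starting at *ppos,
 * and advance *ppos by the amount copied. Returns the number of bytes
 * copied, 0 once the cursor has reached the end, or -1 for a bad offset.
 */
static long read_from_buffer(char *to, size_t count, long *ppos,
                             const char *from, size_t available)
{
    long pos;

    assert(to != NULL && ppos != NULL && from != NULL);
    pos = *ppos;

    if (pos < 0)
        return -1;
    if ((size_t)pos >= available || count == 0)
        return 0;                           /* nothing left: EOF */
    if (count > available - (size_t)pos)
        count = available - (size_t)pos;    /* clamp to what remains */

    memcpy(to, from + pos, count);
    *ppos = pos + (long)count;
    return (long)count;
}
```

With the 13-byte message "Hello world!\n" and a 5-byte destination buffer, successive calls return 5, 5, 3 and then 0. The data source never has to know how large the reader's buffer is; the offset carries all the state between calls.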
