
I have a rather peculiar file format to work with: every line begins with the checksum of its content, followed by the content itself and a newline character.

It looks like this:

[CHECKSUM OF LINE_1][LINE_1]\n
[CHECKSUM OF LINE_2][LINE_2]\n
[CHECKSUM OF LINE_3][LINE_3]\n
...

My goal is to let any application work with these files as it would with any other text file, unaware of the additional checksums at the beginning of each line.

Since I work on a Linux machine running Debian Wheezy (kernel 3.18.26), I want to use the LD_PRELOAD mechanism to override the relevant file functions. I have seen something like this in zlibc ( https://zlibc.linux.lu/index.html ), which comes with an explanation of how it works ( https://zlibc.linux.lu/zlibc.html#SEC8 ).

But I don't get it. They only replace the file-opening functions. No read, no write, no fseek, nothing. So how does it work? Or, put differently: which functions would I have to intercept to catch every read or write operation on such a file and handle it accordingly?

Daniel Heinrich

1 Answer


I haven't checked exactly how zlibc works, but the reason seems quite simple.

Possible implementation:

zlibc open:

  1. Uncompress the file you wanted to open into a temporary file
  2. Open this temporary file instead of yours

zlibc close:

  1. Compress the temporary file
  2. Overwrite the original file with the result

In this case you don't need to override read/write/etc., because the application can use the original ones on the temporary file.

In your case you have two possible solutions:

  1. Override open so that it makes a copy of your file with the checksums stripped, and override close so that it recalculates the checksums and overwrites the original file.
  2. Override read and write so that they skip the checksums when reading and calculate them when writing.
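
A minimal sketch of option 1, written against plain C streams so it can be tried without any interposition. The fixed 8-character checksum prefix (`CHECKSUM_LEN`), the byte-sum `toy_checksum`, and the function names `strip_checksums` and `add_checksums` are all assumptions made for illustration; the question does not say how the real checksum is computed or how wide it is.

```c
#include <stdio.h>
#include <string.h>

#define CHECKSUM_LEN 8   /* assumption: fixed-width prefix of 8 hex digits */

/* Hypothetical stand-in for the real checksum: sum of the bytes. */
static unsigned long toy_checksum(const char *s, size_t n)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (unsigned char)s[i];
    return sum & 0xFFFFFFFFUL;
}

/* "open" side: copy `in` to a temporary stream, dropping the first
 * CHECKSUM_LEN characters of every line. Returns NULL on failure. */
FILE *strip_checksums(FILE *in)
{
    FILE *tmp = tmpfile();
    if (!tmp)
        return NULL;

    int c;
    size_t col = 0;                  /* column within the current line */
    while ((c = getc(in)) != EOF) {
        if (c == '\n') {
            putc(c, tmp);
            col = 0;
        } else if (col++ >= CHECKSUM_LEN) {
            putc(c, tmp);            /* past the checksum: keep the byte */
        }
    }
    rewind(tmp);
    return tmp;
}

/* "close" side: write every line of `plain` to `out`, prefixed with its
 * checksum. Assumes lines fit in the buffer; a final line without a
 * newline gets one appended. Returns 0 on success, -1 on error. */
int add_checksums(FILE *plain, FILE *out)
{
    char line[4096];

    rewind(plain);
    while (fgets(line, sizeof line, plain)) {
        size_t n = strcspn(line, "\n");   /* length without the newline */
        if (fprintf(out, "%0*lx%.*s\n", CHECKSUM_LEN,
                    toy_checksum(line, n), (int)n, line) < 0)
            return -1;
    }
    return ferror(plain) ? -1 : 0;
}
```

An interposed open would hand the application the stripped copy, and an interposed close would write the re-checksummed lines back over the original file. Everything in between (read, write, fseek, ...) then operates on an ordinary text file.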

Regarding option 2, from "What is the difference between read() and fread()?":

fread() is part of the C library, and provides buffered reads. It is usually implemented by calling read() in order to fill its buffer
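
To give an idea of what option 2 involves, here is a rough sketch of the filtering step an interposed read would need after calling the original read. Because read returns arbitrary chunks that can end in the middle of a checksum, the filter has to remember its position between calls. The names `strip_state` and `strip_chunk` and the 8-character `CHECKSUM_LEN` are hypothetical; the real prefix width is not given in the question.

```c
#include <stddef.h>

#define CHECKSUM_LEN 8   /* assumption: fixed-width checksum prefix */

/* State an interposed read() would have to keep per open file: the
 * column within the current line of the on-disk file. */
struct strip_state {
    size_t col;
};

/* Copy `n` raw bytes from `in` to `out`, dropping the first CHECKSUM_LEN
 * bytes of every line. `out` must have room for `n` bytes. Returns the
 * number of bytes written, which may be less than `n` and even 0. */
size_t strip_chunk(struct strip_state *st, const char *in, size_t n,
                   char *out)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i] == '\n') {
            out[written++] = '\n';
            st->col = 0;                 /* next byte starts a checksum */
        } else if (st->col++ >= CHECKSUM_LEN) {
            out[written++] = in[i];      /* payload byte: keep it */
        }
    }
    return written;
}
```

Note that a chunk consisting only of checksum bytes yields 0 output bytes, while a return value of 0 from read signals end of file, so the wrapper would have to keep reading until it has at least one byte to return. File offsets seen through lseek and sizes reported by fstat would also no longer match what the application reads.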

In this case I believe that overriding open and close will be less error-prone, because you can safely reuse the original read, write, fread, fseek, etc.

woockashek
  • That does sound like a reasonable approach. Am I correct in assuming that functions like `fopen`, `fclose`, `fgets`, `fputs`, `fread`, `fwrite`, etc. are all library functions which, at some point, use the system calls you mentioned? What about functions like `fseek` and `ftell`? Are there any other system calls I should intercept to make sure everything works fine? – Daniel Heinrich Nov 18 '16 at 06:53