1

I'm kind of new to programming and I was writing a program to recover JPEG files present in "card.raw" by comparing the 4 continuous bytes. If they demarcated a JPEG the program had to copy a block of 512 bytes into a new file saved a xxx.jpg (000.jpg, 001.jpg, etc). If after the block had been copied, the start of a new JPEG was found, the current file would be closed and the next file would be opened to copy the next JPG. Else the next block would be copied in the same file.

Code (UPDATED):
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>




int main(int argc, char *argv[])
{
    int counter = 0;
    if(argc != 2)
    {
        printf("Usage: ./recover image\n");
        return 1;
    }
    typedef uint8_t BYTE;
    FILE *recover = fopen(argv[1], "r");
    if(recover == NULL)
    {
        printf("ERRNO 1 IS %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    printf("ERRNO 1 IS %s\n", strerror(errno));
    fseek(recover, 0L, SEEK_END);
    long long int size = ftell(recover);
    BYTE *CHUNK = (BYTE*)malloc(size * sizeof(BYTE));
    fread(CHUNK, sizeof(BYTE), size, recover);      //Break recover into bytes
    int j = 0;
    char *file = NULL;
    FILE *f = NULL;
    int s = 0;
    write1:
    while(j + 3 < size)
    {
        if(CHUNK[j] == 0xff && CHUNK[j + 1] == 0xd8 && CHUNK[j + 2] == 0xff && j + 512 < size)                           //Check if byte is JPEG format 1st byte
        {
                    switch(CHUNK[j + 3])               //Check if byte is JPEG format 1st byte
                    {
                        case 0xe0:
                        case 0xe1:
                        case 0xe2:
                        case 0xe3:
                        case 0xe4:
                        case 0xe5:
                        case 0xe6:
                        case 0xe7:
                        case 0xe8:
                        case 0xe9:
                        case 0xea:
                        case 0xeb:
                        case 0xec:
                        case 0xed:
                        case 0xee:
                        case 0xef:
                        {
                            if(s == 0)
                            {
                                if(f != NULL)
                                {
                                    f = NULL;
                                }
                                sprintf(file ,"%03d.jpg",counter);   //Create custom file of format xxx.jpg
                                f = fopen(file,"w");
                                printf("ERRNO 2 IS %s\n", strerror(errno));
                                if(f == NULL)
                                {
                                    printf("ERRNO 2 is %s\n", strerror(errno));
                                    exit(EXIT_FAILURE);
                                }
                                //printf("ERRNO 2 IS %s\n", strerror(errno));
                                for(int k = 0; k < 512; k++)
                                {
                                    fwrite(&CHUNK[j + k], 512, sizeof(BYTE), f);  //Copy 512 bytes from start of JPEG file as 512 bytes form one 'block'
                                }
                                j += 512; //Increment to check initial bytes of next 'block'
                                s++;
                            }
                            else
                            {
                                fclose(f);
                                counter++;
                                goto write1;
                            }
                        }
                        default : goto write2;
                    }
        }
        else    //Else continue searching
        {
            write2:
            for(int k = 0; k < 512; k++)
            {
                fwrite(&CHUNK[j + k], 512, sizeof(BYTE), f);  //Copy 512 bytes from start of JPEG file as 512 bytes form one 'block'
            }
            j += 512;
        }
    }
    fclose(recover);
}

This program gives me a segmentation fault. I tried using Valgrind to find why the fault was occurring but I have no idea of what Valgrind's error message says.

UPDATED ERRROR MESSAGE:

==217== Memcheck, a memory error detector
==217== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==217== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==217== Command: ./recover card.raw
==217== 
ERRNO 1 IS Success
==217== Warning: client switching stacks?  SP change: 0x1fff000490 --> 0x1ffd36e490
==217==          to suppress, use: --max-stackframe=29958144 or greater
==217== Invalid write of size 8
==217==    at 0x400A0A: main (recover.c:29)
==217==  Address 0x1ffd36e490 is on thread 1's stack
==217== 
==217== 
==217== Process terminating with default action of signal 11 (SIGSEGV)
==217==  Access not within mapped region at address 0x1FFD36E490
==217==    at 0x400A0A: main (recover.c:29)
==217==  If you believe this happened as a result of a stack
==217==  overflow in your program's main thread (unlikely but
==217==  possible), you can try to increase the size of the
==217==  main thread stack using the --main-stacksize= flag.
==217==  The main thread stack size used in this run was 8388608.
==217== Invalid write of size 8
==217==    at 0x4A2A650: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so)
==217==  Address 0x1ffd36e488 is on thread 1's stack
==217== 
==217== 
==217== Process terminating with default action of signal 11 (SIGSEGV)
==217==  Access not within mapped region at address 0x1FFD36E488
==217==    at 0x4A2A650: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so)
==217==  If you believe this happened as a result of a stack
==217==  overflow in your program's main thread (unlikely but
==217==  possible), you can try to increase the size of the
==217==  main thread stack using the --main-stacksize= flag.
==217==  The main thread stack size used in this run was 8388608.
==217== 
==217== HEAP SUMMARY:
==217==     in use at exit: 4,648 bytes in 2 blocks
==217==   total heap usage: 2 allocs, 0 frees, 4,648 bytes allocated
==217== 
==217== 552 bytes in 1 blocks are still reachable in loss record 1 of 2
==217==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==217==    by 0x5694EB9: __fopen_internal (iofopen.c:65)
==217==    by 0x5694EB9: fopen@@GLIBC_2.2.5 (iofopen.c:89)
==217==    by 0x400940: main (recover.c:17)
==217== 
==217== 4,096 bytes in 1 blocks are still reachable in loss record 2 of 2
==217==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==217==    by 0x56941FB: _IO_file_doallocate (filedoalloc.c:101)
==217==    by 0x56A43E8: _IO_doallocbuf (genops.c:365)
==217==    by 0x56A0D92: _IO_file_seekoff@@GLIBC_2.2.5 (fileops.c:960)
==217==    by 0x569DD48: fseek (fseek.c:36)
==217==    by 0x4009B4: main (recover.c:24)
==217== 
==217== LEAK SUMMARY:
==217==    definitely lost: 0 bytes in 0 blocks
==217==    indirectly lost: 0 bytes in 0 blocks
==217==      possibly lost: 0 bytes in 0 blocks
==217==    still reachable: 4,648 bytes in 2 blocks
==217==         suppressed: 0 bytes in 0 blocks
==217== 
==217== For counts of detected and suppressed errors, rerun with: -v
==217== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault

I'd like to know why the segmentation fault occurs or atleast what Valgrind's error message means.

UPDATE: Program output:

ERRNO 1 IS Success
Segmentation fault

UPDATE: Changed CHUNK from BYTE* to BYTE.

UPDATE: Used malloc to declare CHUNK array to prevent stack overflow.

UPDATE: Changed fread(&CHUNK, sizeof(BYTE), size, recover); to fread(CHUNK, sizeof(BYTE), size, recover);to prevent stack problems.

UPDATE: Eliminated while(!(feof)) to prevent a seemingly infinite loop that was occurring.

MelPradeep
  • 35
  • 5
  • 4
    The segmentation violation occurs in `fseek`, because you pass a null pointer ("address 0x0") as file, which you have received from `fopen("image", "r")`. Your program should ensure that the file could actually be opened before using `recover`. – M Oehm Nov 27 '20 at 17:54
  • 2
    I know, this is not what you have asked about, but this if/else ladder with nested `goto`s is crazy. You should refactor your logic heavily. During this refactoring there is a good chance you will solve your problem yourself. – Eugene Sh. Nov 27 '20 at 17:59
  • Always check the return value of `fopen()`. – 12431234123412341234123 Nov 27 '20 at 19:18
  • @MOehm I tried changing the line from `FILE *recover = NULL;` to `FILE *recover = (FILE*)malloc(sizeof(argv[1]));` and freeing the memory at the program's end but the seg fault still persists. – MelPradeep Nov 28 '20 at 07:28
  • @EugeneSh. I tried combining all the if s into one if statement but the error still persists – MelPradeep Nov 28 '20 at 07:36
  • 1
    That's not correct. You allocate memory to `recover` and then lose it immediately by overwriting the handle to the allocated memory with the result of `fopen`. You don't need to allocate memory here; `fopen` will do that for you and `fclose` will free it. Instead, test whether the result of `fopen` is `NULL`. If it is, you cannot use it in file operations such as `fseek` , `fread`, `fprintf` and so on. Print an error message and abort. Make the error message explicit, by including the filname and the error code. (Look at `perror`, `erno`, `strerror` and `exit`.) – M Oehm Nov 28 '20 at 11:04
  • @MOehm Just checked what `fopen` returns. It returns `0x2455260`. So not `NULL`.. – MelPradeep Nov 28 '20 at 11:10
  • 1
    Interesting. The Valgrind output suggests that you call `fseek` with a null pointer. `fseek` does not take any pointer arguments beside the file handle, so I assumed that `recover` was null. The first message already complains about something in `fopen`, which could mean that an internal pointer of the file handle is null when it shouldn't be, but I doubt it. – M Oehm Nov 28 '20 at 11:25
  • You say that the images are in a file called `card.raw`, but in the valgrind call, you pass `image`. – M Oehm Nov 28 '20 at 11:27
  • @MOehm I get the same message however, when I pass `card.raw`. – MelPradeep Nov 28 '20 at 15:36
  • And these files (`card.raw` and/or `image`) exist and are regular files? "I've tried that -- doesn't work." is really not such a good way of analysing the problem. Did you implement my suggestions above with en explicit error message? Could you try to run the code in a debugge, gdb prhaps, and check the values? – M Oehm Nov 28 '20 at 15:40
  • @MOehm So, I added if statements to check if any `fopen` returns a `NULL` and also `perror` and `strerror` statements after all `fopen` s. I've tried streamlining the code and have updated it along with the output after running it. – MelPradeep Nov 28 '20 at 16:37
  • @MOehm I guess the problem is with `f = fopen(file,"w");` because the program outputs only `ERRNO 1`. Also does `strerror` giving `Success` actually mean that `FILE *recover = fopen(argv[1], "r")` works correctly? – MelPradeep Nov 28 '20 at 16:47
  • @EugeneSh. I've eliminated most of the if cases and have kept one goto. Thanks for helping improve my code! – MelPradeep Nov 28 '20 at 16:50
  • 1
    Hm. In your listing, you have ´ERRNO 1` twice. If the file can be opened, a non-null handle is returned and `errno` isn't set. (I don't know whether it is explicitly set to zero.) Otherwise, you get a null pointer and can look at `errno` for more information. You open the file in "r" mode, not "w", which is correct. The file opening code looks okay to me. (But you must `rewind` the file before reading, because you are at its end. And there are other errors. For example, you want an array of bytes, not of pointers to byte.) – M Oehm Nov 28 '20 at 16:54
  • `main (recover.c:16)` that's what valgrind says about the location of the problem. What is line 16 in your file? The code you are running is probanly not the code you have posted, because the latter has `fopen` further than line 16. – n. m. could be an AI Nov 28 '20 at 18:08
  • Running a debugger shows that the program does not progress beyond `while(!(feof(recover)))`. Valgrind stays stuck after printing `ERRNO 1` until I terminate the program. I think it somehow becomes an infinite loop. Any idea why? – MelPradeep Nov 29 '20 at 03:25
  • 1
    @MelPradeep You wrote "*UPDATE: Changed CHUNK from *BYTE to BYTE*" but the code still shows `BYTE *CHUNK[size];` so, which one? Also, how big is the file? The `CHUNK` array get allocated on the stack, where room is limited. – dxiv Nov 29 '20 at 04:01
  • @dxiv I've changed it now. The file size for `card.raw` is not given. – MelPradeep Nov 29 '20 at 04:58

1 Answers1

0

Some problems noted below, in no particular order.

  •   BYTE CHUNK[size];
    

    [ EDIT ]   This was fixed in OP's subsequent edit.  This allocates size bytes on the stack, where size is the filesize of the argv[1] file. If that filesize is too large, the CHUNK array can easily overflow the default stack allocation. See why not to Declare large array on Stack.

  • while(j < size)
    {
      if(CHUNK[j] == 0xff && CHUNK[j + 1] == 0xd8 && CHUNK[j + 2] == 0xff)
      {
        switch(CHUNK[j + 3])
    

    The loop counter j goes all the way to the end of the CHUNK buffer, but the if statement also reads CHUNK[j + 1], CHUNK[j + 2] then CHUNK[j + 3]. When j = size - 1 for example, all three reads go past the end of the array and cause undefined behavior.

  • write:
    while(j < size)
    {
      ...
                  else
                  {
                    counter++;
                    goto write;
                  }
    

    There is a code path which jumps back to the write: label without modifying j, which causes an infinite loop (not the only such path, the implicit default case of the switch is another).

  •   else  //Else continue searching
      {
        for(int k = 0; k < 512; k++)
        {
          fwrite(&CHUNK[j + k], 512, sizeof(BYTE), f);
    

    There is a code path where this fwrite is reached with f == NULL. Even when f is not NULL, reading 512 bytes starting from &CHUNK[j + k] can potentially go past the end of the array, which is UB again.


[ EDIT ]   A few additional notes.

  •   char *file = NULL;
      ...
            sprintf(file ,"%03d.jpg",counter);
    

    The file pointer is never allocated, so sprintf will write to a NULL pointer, resulting in UB.

  •   long long int size = ftell(recover);
    

    ftell returns a long so there is no need to use a long long to store it.

  •   fseek(recover, 0L, SEEK_END);
      long long int size = ftell(recover);
    

    The return values of fseek and ftell are neither checked, so any errors will pass unnoticed.

dxiv
  • 16,984
  • 2
  • 27
  • 49
  • So, using `BYTE *CHUNK[size]` would be much better at preventing overflow? – MelPradeep Nov 29 '20 at 06:38
  • 1
    @MelPradeep No, that would allocate an array of `size` *pointers* on the stack, which is even bigger than `size` bytes (usually by a factor of 4 or 8). To allocate `size` bytes dynamically from the heap you would use `BYTE *CHUNK = malloc(size * sizeof(BYTE))`, then remember to `free` it once done. – dxiv Nov 29 '20 at 06:42
  • Thanks. I'll use `malloc` to dynamically create an array of `BYTES` – MelPradeep Nov 29 '20 at 07:11
  • 1
    @MelPradeep `fread(&CHUNK, ...` But `CHUNK` is a pointer now after your latest change, the array is no longer on the stack, and this is thrashing the stack badly. Change it to `fread(CHUNK, ...`. – dxiv Nov 29 '20 at 07:47
  • Thanks for that. So, now `fread` stores the data to the address pointed by a `CHUNK` pointer, right? – MelPradeep Nov 29 '20 at 09:08
  • @MelPradeep Right, as it should. – dxiv Nov 29 '20 at 09:10
  • 1
    @MelPradeep That loop is not needed, since you read the whole file in one shot (and, on the offchance that the file gets modified while you read it, you'll overwrite (part of) the same buffer on the second iteration). That said, I am not convinced that "*the program does not go past*" it. Don't know how you determined that, and if you debug an optimized build then that's somewhat tricky to follow. – dxiv Nov 29 '20 at 09:14
  • I tried writing print statements after that loop and also set breakpoints to the next line in the debugger but neither worked. It is possible that the loop executes a very large number of times (finite) but I doubt it. – MelPradeep Nov 29 '20 at 09:22
  • @MelPradeep Doubt that as well. It's hard to guess what *else* is going on without a fully duplicatable case, including the input file. – dxiv Nov 29 '20 at 09:41
  • I deleted that loop. So `fread` automatically stores the file's bytes as `BYTE` sized elements of `CHUNK[]`? – MelPradeep Nov 29 '20 at 09:47
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/225274/discussion-between-melpradeep-and-dxiv). – MelPradeep Nov 29 '20 at 09:47