My comments, summarized:
fread will do what it always does at the end of a "file" (in this case, the disk) and return the number of items it actually read, i.e., most likely 0 (if you are reading one 512-byte block at a time).
EOF is not a 'byte' value you should be looking for; rather, it indicates a state. Use feof to test for it explicitly, or just check the return value of fread.
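A minimal sketch of that check, assuming an ordinary stdio stream. The name count_sectors is made up for illustration. It loops on the return value of fread and only afterwards asks why the loop stopped:

```c
#include <assert.h>
#include <stdio.h>

#define SECTOR_SIZE 512

/* Hypothetical helper: counts the complete 512-byte sectors in a stream.
   Returns the count, or -1 if the loop ended because of a read error. */
long count_sectors(FILE *fp)
{
    unsigned char buffer[SECTOR_SIZE];
    long sectors = 0;

    assert(fp != NULL);

    /* With an item count of 1, fread returns 1 (full sector) or 0. */
    while (fread(buffer, SECTOR_SIZE, 1, fp) == 1)
        sectors++;

    /* Only now find out why fread came up short. */
    if (ferror(fp))
        return -1;      /* read error */
    return sectors;     /* otherwise the end of the data was reached */
}
```

Note that a trailing partial sector (fewer than 512 bytes) is silently dropped here; on a real disk that should not occur.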
Currently you are checking each and every single byte. But the data is not stored in any random order! USB sticks store data in sectors, each one 512 bytes long: "Sectors are 512 bytes long, for compatibility with hard drives" (Wikipedia on USB flash drive).
You cannot assume contiguous sectors belong to the same file, due to fragmentation. If a file is fragmented, there is no way to automatically merge the sectors in the correct order ... (Doing it manually is usually out of the question. I'd consider doing that only if the original file contains easily recognizable data such as plain text, and the contents are extremely important :) )
You can read a sector -- 512 bytes -- and stop if you encounter EOF. If this sector starts with the two signature bytes for a BMP, you can inspect it further to verify it is a BMP header, and if so, you can use the BMP structure data to check if all next sectors contain a valid BMP file. The only way to do so is:
- the first sector contains all relevant BMP metrics: data size indicates the size of the original pixel data, and you should read that much extra data.
- using the BMP file specifications, check if:
- width times height times bytes per pixel equals total size
- data does not contain out-of-range values (not possible for 24 bit images, though)
- data is aligned to a DWORD per scan-line
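These checks can be sketched roughly as follows. This is only an illustration under stated assumptions: it accepts nothing but uncompressed 24-bit images with a 40-byte BITMAPINFOHEADER, the size check is lenient (the declared file size may be larger than header plus pixel data), and bmp_expected_size is a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BMP header fields are stored little-endian. */
static uint32_t rd16(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8);
}

static uint32_t rd32(const unsigned char *p)
{
    return rd16(p) | (rd16(p + 2) << 16);
}

/* Hypothetical helper: returns the declared file size if the sector looks
   like the start of an uncompressed 24-bit BMP, or 0 if it does not. */
uint32_t bmp_expected_size(const unsigned char *sector, size_t len)
{
    uint32_t file_size, data_offset, width, height;
    uint64_t stride, pixel_bytes;

    assert(sector != NULL);
    if (len < 34 || sector[0] != 'B' || sector[1] != 'M')
        return 0;

    file_size   = rd32(sector + 2);
    data_offset = rd32(sector + 10);
    width       = rd32(sector + 18);
    height      = rd32(sector + 22);

    if (rd32(sector + 14) != 40) return 0;  /* assumed: BITMAPINFOHEADER only */
    if (rd16(sector + 26) != 1)  return 0;  /* colour planes must be 1 */
    if (rd16(sector + 28) != 24) return 0;  /* assumed: 24 bits per pixel */
    if (rd32(sector + 30) != 0)  return 0;  /* assumed: no compression */

    if (height & 0x80000000u)
        height = 0u - height;               /* negative height: top-down image */
    if (width == 0 || width > 0x7FFFFFFFu || height == 0)
        return 0;
    if (data_offset < 54)                   /* 14 + 40 bytes of headers */
        return 0;

    /* Each scan-line is padded to a multiple of 4 bytes (a DWORD). */
    stride = (((uint64_t)width * 24 + 31) / 32) * 4;
    pixel_bytes = stride * height;

    if (data_offset + pixel_bytes > file_size)
        return 0;
    return file_size;
}
```

A sector that merely happens to start with "BM" will almost always fail one of these tests, which filters out most false positives before anything is saved to disk.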
If you accept the BMP as 'possibly correct', you can save it to disk and verify by eye whether it looks right. Then either:
- you are 100% sure this file is well-formed; or
- another image may start "inside" this one's data part due to fragmentation.
If it isn't a well-formed BMP image, or you want a thorough check of every sector, continue scanning with the next sector. If you are sure the image is well-formed throughout or you want to speed up scanning, you can skip (datasize+sectorsize-1)/sectorsize sectors.
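That expression is just an integer division that rounds up. A small sketch (sectors_to_skip is a made-up name, and it assumes datasize is nowhere near the maximum of unsigned long, so the addition cannot overflow):

```c
#include <assert.h>

/* Hypothetical helper: number of whole sectors occupied by data_size bytes. */
unsigned long sectors_to_skip(unsigned long data_size, unsigned long sector_size)
{
    assert(sector_size > 0);
    /* Adding sector_size - 1 before dividing rounds the result up. */
    return (data_size + sector_size - 1) / sector_size;
}
```

For example, a 513-byte file occupies 2 sectors, not 1. Keep in mind that the count includes the header sector you have already read.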
The simple C program below scans an entire disk and if it seems to indicate a BMP file start, it prints out the first 32 bytes in human readable form. For my test disk, it gave the following output:
42 4D D8 49 EE 0E E8 B9 7A BE F3 7C DF FD 7E F7 77 9F 7B FF 38 7F F0 3C 24 33 B3 66 AD 77 BD 6B | BM.I....z..|..~.w.{.8..<$3.f.w.k
42 4D 6E E6 E3 D3 48 37 A5 27 D7 6F EF 49 4E 13 E0 A7 DF 78 47 8E 5E 3C 95 B5 0A 16 D2 5C CE 3A | BMn...H7.'.o.IN....xG.^<.....\.:
42 4D 36 00 24 00 00 00 00 00 36 00 00 00 28 00 00 00 00 04 00 00 00 03 00 00 01 00 18 00 00 00 | BM6.$.....6...(.................
42 4D 49 2C 20 62 6F 64 79 20 6D 61 73 73 20 69 6E 64 65 78 3B 20 41 53 41 2C 20 41 6D 65 72 69 | BMI, body mass index; ASA, Ameri
42 4D 50 66 6F 67 6C 65 00 00 00 00 00 00 29 1E 00 01 DC F8 BC 84 91 AE BC 84 91 AE 00 04 00 00 | BMPfogle......).................
The weird thing is, initially it contained no BMP files, so I copied one to test with. Now how come there is more than one candidate? (There were actually 9 more.) First, there are "false positives" -- the "BMI" one is a nice example --, but second: if there is a deleted BMP file somewhere on that disk and its first sector happens to not have been overwritten, it will also be listed!
Short & rough sample code:
    #include <stdio.h>

    int main (int argc, char **argv)
    {
        FILE *usb_ptr;
        unsigned char buffer[512];   /* one sector */
        int i, j;

        if (argc == 1)
        {
            printf ("wot no stick?\n");
            return -1;
        }

        usb_ptr = fopen(argv[1],"rb");
        if(usb_ptr == NULL)
        {
            printf("error opening USB Drive for reading\n");
            return -1;   /* do not go on to read from a NULL stream */
        }

        i = 0;
        while (1)
        {
            /* fread returns the number of complete items read: 0 at the end */
            if (fread (buffer, 512,1, usb_ptr) < 1)
                break;
            i++;
            /* progress report every 128 sectors */
            if (!(i & 127))
                printf ("%d sectors read..\r", i);
            /* candidate: the sector starts with the BMP signature */
            if (buffer[0] == 'B' && buffer[1] == 'M')
            {
                /* first 32 bytes as hex ... */
                for (j=0; j<32; j++)
                    printf ("%02X ", buffer[j]);
                printf ("| ");
                /* ... and again as printable ASCII */
                for (j=0; j<32; j++)
                {
                    if (buffer[j] >= ' ' && buffer[j] <= '~')
                        printf ("%c", buffer[j]);
                    else
                        printf (".");
                }
                printf ("\n");
            }
        }
        fclose (usb_ptr);
        return 0;
    }
(Afterthought) It's pretty slow for a 1 GB disk .. perhaps it's faster to read more sectors at once. (Testing..) yup, way faster to read even as little as 10 sectors inside the loop.
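A rough sketch of what reading several sectors per call could look like. The batch size of 64 is an arbitrary choice of mine, not a measured optimum, and count_bm_candidates is a made-up name; it only counts the candidates instead of printing them:

```c
#include <assert.h>
#include <stdio.h>

#define SECTOR_SIZE 512
#define BATCH 64   /* sectors per fread call; arbitrary, tune as needed */

/* Hypothetical helper: counts sectors that start with "BM",
   reading up to BATCH sectors with each fread call. */
long count_bm_candidates(FILE *fp)
{
    static unsigned char buffer[BATCH * SECTOR_SIZE];
    size_t got, s;
    long hits = 0;

    assert(fp != NULL);

    /* With an item size of SECTOR_SIZE, fread returns the number of
       complete sectors read, which may be less than BATCH near the end. */
    while ((got = fread(buffer, SECTOR_SIZE, BATCH, fp)) > 0)
    {
        for (s = 0; s < got; s++)
        {
            const unsigned char *sector = buffer + s * SECTOR_SIZE;
            if (sector[0] == 'B' && sector[1] == 'M')
                hits++;
        }
    }
    return hits;
}
```

The signature test still runs once per sector, so the results are the same as before; only the number of read calls goes down.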