
I am trying to write a function that determines whether a file in a directory is a GIF, BMP, PNG, or JPG image. So far, I think I have correctly written my code up to the point of listing the files in the directory and opening them in binary mode.

Now I am struggling to figure out how to determine what type of image a file is. Right now I am just focusing on writing my "bool isGif();" function. To determine whether a file is a GIF from its binary contents, I can check the first 6 bytes, which will contain either GIF87a or GIF89a. So, to do this I would read the first six bytes of the file into an array, and then compare those to arrays that contain "GIF87a" or "GIF89a", correct?

Below is my attempt at coding this up. It gives me 2 warnings but no errors, and the program runs fine, but it never outputs a message that the directory contains a gif, and I know it does, because I put one in there...

getDir();

ifstream fin;

_finddata_t a_file;
intptr_t dir_handle;

dir_handle = _findfirst("*.*", &a_file);

//if (dir_handle == -1)
//{
    //return;
//}

while (_findnext(dir_handle, &a_file) == 0);
{
    fin.open(a_file.name, ios::in | ios::binary);

    if (!fin)
    {
        cout << endl << "Could not open the file."
            << " Attempting to open the next file." << endl;
        return false;
    }
    else
    {
        cout << "Files opened successfully."
            << " Processing through the directory." << endl;


            ifstream fl(a_file.name);
            fl.seekg(0, ios::end);
            size_t len = fl.tellg();
            char *ret = new char[len];
            fl.seekg(0, ios::beg);
            fl.read(ret, len);
            fl.close();

            char arr1[6] = { 'G', 'I', 'F', 8, 7, 'a' };
            char arr2[6] = { 'G', 'I', 'F', 8, 9, 'a' };

            if (ret == arr1 || arr2 )
            {
                cout << a_file.name << " has a .gif extension" << endl;
                return true;
            }


    }
}

Okay, I think I am close on this now. This is the updated snippet of code that matters for this problem. I am just trying to use a for loop to read the first 6 bytes into a string so I can compare the bytes to determine if it is a gif, but I can't get the bytes into a string.

int i;
            int comp1, comp2;

            for (i = 0; i != 6; i++)
            {
                string gifStr;
                fin.read((char*)&a_file, i);

                gifStr(&a_file, i);
            }

            string gifStr1 = "GIF87a";
            string gifStr2 = "GIF89a";

            comp1 = strcmp( , gifStr1);

            if (comp1 == 0)
            {
                cout << a_file.name << " has a .gif extension" << endl;
            }

            comp2 = strcmp( , gifStr2);

            if (comp2 == 0)
            {
                cout << a_file.name << " has a .gif extension" << endl;
            }   

Sorry, this website confuses me a little bit on responses and things like that... Haha.

Adam Chally
  • It's not possible in general. You'll need to have magic number signatures scanning. – πάντα ῥεῖ Oct 13 '14 at 23:13
  • How can it not be possible? Can't you just read the individual bytes into an array and compare what those bytes are to another array of the characters it needs to be to see if they are equal? This is an assignment in a CSC 250 class, so it shouldn't have too difficult of a solution. – Adam Chally Oct 13 '14 at 23:18
  • _"... so it shouldn't have too difficult of a solution"_ Well, good luck then :-P ... – πάντα ῥεῖ Oct 13 '14 at 23:20
  • Unrelated: How many times were you thinking you needed to open that file? I would think once would be enough. – WhozCraig Oct 13 '14 at 23:58
  • Your code leaks memory and `ret == arr1` does not compare the text content of the arrays. Use std::string. Don't use strcmp with std::string, use operator == – Neil Kirk Oct 14 '14 at 00:22
  • You must be looking at some of my old code, still. I updated my original post with the code that I changed to, which no longer contains the "ret" variable. – Adam Chally Oct 14 '14 at 00:31

3 Answers


You can look up the magic numbers for every image type you want, then compare them (sort of) like below. The code only covers a few of the magic numbers. I wrote this when C++0x first came out. There's probably a better way, but it should give a rough idea.

int ValidImage(std::uint8_t* ImageBytes)
{
    const static std::vector<std::uint8_t> GIFBytesOne = { 0x47, 0x49, 0x46, 0x38, 0x37, 0x61 };
    const static std::vector<std::uint8_t> GIFBytesTwo = { 0x47, 0x49, 0x46, 0x38, 0x39, 0x61 };
    const static std::vector<std::uint8_t> PNGBytes = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    const static std::vector<std::uint8_t> BMPBytes = { 0x42, 0x4D };
    const static std::vector<std::uint8_t> JPGBytes = { 0xFF, 0xD8, 0xFF };
    const static std::vector<std::uint8_t> JPEGBytes = { 0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20 };
    const static std::vector<std::uint8_t> TIFFMonoChrome = { 0x0C, 0xED };
    const static std::vector<std::uint8_t> TIFFOne = { 0x49, 0x20, 0x49 };
    const static std::vector<std::uint8_t> TIFFTwo = { 0x49, 0x49, 0x2A, 0x00 };
    const static std::vector<std::uint8_t> TIFFThree = { 0x4D, 0x4D, 0x00, 0x2A };
    const static std::vector<std::uint8_t> TIFFFour = { 0x4D, 0x4D, 0x00, 0x2B };
    const static std::vector<std::uint8_t> CompressedTGA = {0x0, 0x0, 0xA, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0};
    const static std::vector<std::uint8_t> DeCompressedTGA = {0x0, 0x0, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0};

    const static std::array<std::vector<std::uint8_t>, 13> All = {
        GIFBytesOne, GIFBytesTwo, PNGBytes, BMPBytes,
        JPGBytes, JPEGBytes, TIFFMonoChrome, TIFFOne,
        TIFFTwo, TIFFThree, TIFFFour, CompressedTGA,
        DeCompressedTGA
    };

    int I = 0;
    for (const auto& it : All)
    {
        if (std::equal(it.begin(), it.end(), ImageBytes))
            return I;
        ++I;
    }
    return -1;
}

Then:

    std::fstream hFile(FilePath, std::ios::in | std::ios::binary);

    if (!hFile.is_open())
    {
        throw std::invalid_argument("File Not Found.");
    }

    std::uint8_t Header[18] = {0};
    hFile.read(reinterpret_cast<char*>(&Header), sizeof(Header));
    hFile.seekg(0, std::ios::beg);

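    int index = ValidImage(Header); // position in All (0 to 12), or -1 if nothing matched

Note that the returned index is a position in All, not an IMAGE_TYPE value (for example, both 0 and 1 mean GIF), so it still has to be mapped to an image type such as:

enum IMAGE_TYPE {GIF = 0, PNG, BMP, JPG, JPEG, TIFF, TGA};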
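
Because the function above returns a position in the signature list rather than an image type, one possible variation is to store the type next to each signature so the lookup returns the type directly. The following is only a sketch under that assumption: the names `Signature`, `DetectImageType`, and `UNKNOWN_TYPE` are hypothetical and not part of the original answer, and only the GIF, PNG, BMP, and JPG signatures from the list above are included.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

enum IMAGE_TYPE { UNKNOWN_TYPE = -1, GIF = 0, PNG, BMP, JPG };

// One row per signature: the leading bytes and the type they identify.
struct Signature
{
    std::vector<std::uint8_t> bytes;
    IMAGE_TYPE type;
};

// Hypothetical helper: returns the type whose signature matches the start
// of `data`, or UNKNOWN_TYPE if none matches or the buffer is too short.
IMAGE_TYPE DetectImageType(const std::uint8_t* data, std::size_t size)
{
    static const std::vector<Signature> table = {
        { { 0x47, 0x49, 0x46, 0x38, 0x37, 0x61 }, GIF },  // "GIF87a"
        { { 0x47, 0x49, 0x46, 0x38, 0x39, 0x61 }, GIF },  // "GIF89a"
        { { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, PNG },
        { { 0x42, 0x4D }, BMP },                          // "BM"
        { { 0xFF, 0xD8, 0xFF }, JPG },
    };

    for (const auto& sig : table)
    {
        // Check the length first so std::equal never reads past the buffer.
        if (size >= sig.bytes.size() &&
            std::equal(sig.bytes.begin(), sig.bytes.end(), data))
            return sig.type;
    }
    return UNKNOWN_TYPE;
}
```

With the 18-byte `Header` buffer read above, a call could look like `DetectImageType(Header, sizeof(Header))`. Passing the buffer size lets the function reject inputs shorter than a signature instead of reading out of bounds.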
Brandon
  • Why do you pass the vector to your lambda by value and not reference? Why not use `std::equal`? Why do you copy the vector in your range-based for loop? – Neil Kirk Oct 14 '14 at 00:23
  • @NeilKirk: The code is old? I don't know. I never bothered to fix it. It's 3 years old (written in 2011). At the time, I was pretty bad and only interested in getting things working and testing out C++0x. Also, I had ported that from C#, which was probably a really bad idea lol. I don't mind getting a downvote for bad code. I just wanted to show the OP an example of how it could be done using magic numbers. – Brandon Oct 14 '14 at 01:24
  • That is better but I can still find improvements! – Neil Kirk Oct 14 '14 at 01:49
  • ImageBytes is not modified so should be const. Your vectors could be std::arrays to use memory more efficiently. Either combine all your sub-arrays and All array into one giant initialization statement, or make the type of All a reference of the vectors (or arrays) so that they are not copied and stored in memory twice. You can use `const auto&` – Neil Kirk Oct 14 '14 at 01:52
  • Can't use `std::array`'s because it requires I put the specific size.. so when I try to do: `std::vector> All = {...};` what do you specify for size? :S I'll fix the `const`. – Brandon Oct 14 '14 at 01:59
  • Ah yes that is a bugger – Neil Kirk Oct 14 '14 at 02:00
  • http://stackoverflow.com/questions/26351587/how-to-create-stdarray-with-initialization-list-without-providing-size – Neil Kirk Oct 14 '14 at 02:05

The culprit is here:

if (ret == arr1 || arr2 )

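You cannot test char arrays for equality like that, because `==` compares the addresses of the arrays and not their contents. The test itself is also written incorrectly: `ret == arr1 || arr2` is true whenever `arr2` is non-null, which is always. Even if arrays could be compared this way, you would have to change it to: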

if (ret == arr1 || ret == arr2 )

However that still won't do it, you have to do one of the following:

  • convert ret, arr1 and arr2 to std::string
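  • use strncmp or memcmp with a length of 6 (plain strcmp needs null-terminated strings, which these arrays are not)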
  • test the arrays 1 char at a time, in a loop

From your comments and edit to the question, the best thing you could do here is read about strings. Maybe even view some documentation.
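
As a rough illustration of the first two options, the sketch below compares the first six bytes of an already filled buffer against the two GIF signatures. The function names are hypothetical, and `ret` and `len` stand in for the buffer and length from the question. It also assumes the signature digits are the characters `'8'`, `'7'`, and `'9'`; the question's `arr1` and `arr2` store the integers 8, 7, and 9 instead, which would never match a real GIF header.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Option 1: copy the first six bytes into a std::string and use operator==.
bool IsGifHeaderString(const char* ret, std::size_t len)
{
    if (len < 6)
        return false;                 // not enough data to hold a signature
    const std::string header(ret, 6); // explicit length, no terminator needed
    return header == "GIF87a" || header == "GIF89a";
}

// Option 2: compare the raw bytes with std::memcmp, which takes an explicit
// length and therefore does not rely on a null terminator.
bool IsGifHeaderMemcmp(const char* ret, std::size_t len)
{
    return len >= 6 &&
           (std::memcmp(ret, "GIF87a", 6) == 0 ||
            std::memcmp(ret, "GIF89a", 6) == 0);
}
```

Both versions take the length explicitly because data read from a file is not null-terminated, so functions that stop at the first `'\0'` are not a safe fit here.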

Paweł Stawarz
  • Okay, I think I'm really close with this. I changed my code around a bit. So, now I'm just wondering how I could read the bytes in to a string, because it won't let me do the straight up assignment in the for loop... `for (i = 0; i != 6; i++) { string gifStr; gifStr = fin.read((char*)&a_file, i); } string gifStr1[7] = "GIF87a"; string gifStr2[7] = "GIF89a"; comp1 = strcmp( , gifStr1); if (comp1 == 0) { //confirmation message }` – Adam Chally Oct 14 '14 at 00:07
  • `std::string ret_s(ret,6), arr1_s(arr1,6), arr2_s(arr2,6);`. Then `if(ret_s == arr1_s || ret_s == arr2_s)`. Or even better - create the strings like `arr1_s = "GIF87a"`, etc. You need to `#include <string>` in order to work with strings. Strings are located inside the `std` namespace. You don't need to create `gifStr2[7]`. `gifStr2` will suffice. – Paweł Stawarz Oct 14 '14 at 00:09
  • So are you saying that I don't need to do a for loop? I just do `gifStr_s(fin,6)` and then compare it to `gifStr1_s = "GIF87a";` and `gifStr2_s = "GIF89a";`? – Adam Chally Oct 14 '14 at 00:34
  • @AdamChally nearly yes. However you can't `read` directly into a `string`. Read just like now (`fl.read(ret, len);`), just after reading, convert the contents into a `std::string`, by doing `std::string gifStr(ret,6);`. Really read about strings, the more you code, the more you need them. Check the links in my post for resources. – Paweł Stawarz Oct 14 '14 at 00:40
  • Okay, I got rid of the for loop and did this `fin.read((char*)&a_file, 6);` which works, but then when I try to read it in to a string like this `string gifStr;, gifStr(&a_file, 6);` it says "IntelliSense: call of an object of a class type without appropriate operator() or conversion functions to pointer-to-function type" – Adam Chally Oct 14 '14 at 01:39
  • If you read my original question, I edited it and posted code underneath it that is a lot closer to what I actually have now... – Adam Chally Oct 14 '14 at 02:05
  • @AdamChally you don't need to use `strcmp` for `string`. Also - you should closely look into the comments. Your last comment is wrong (hint: you don't need to take the address of the table), and try to understand what's going on, instead of blindly repeating. A good C++ book would help you there more than asking me or anyone else. – Paweł Stawarz Oct 14 '14 at 12:31

The trouble with the following code is that it loads the whole file into memory, even though you only want to check a few bytes. This is wasteful, but fixing it is left as an exercise.

ifstream fl(a_file.name, ios::in | ios::binary); // binary mode so the bytes are read unmodified
fl.seekg(0, ios::end);
vector<char> ret(fl.tellg());
fl.seekg(0, ios::beg);
fl.read(&ret[0], ret.size());
fl.close();

static const vector<string> gif_ids = { "GIF87a", "GIF89a" };
bool is_gif = false;
for (const auto& id : gif_ids)
{
    // check size first because the file may contain less data than the id
    if (ret.size() >= id.size() && std::equal(id.begin(), id.end(), ret.begin()))
    {
        // it's a gif!
        is_gif = true;
        break;
    }
}
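
One possible way to approach the exercise mentioned above is to read only the first six bytes instead of the whole file. This is a minimal sketch, not a definitive implementation: the helper name `IsGifFile` is hypothetical, and it assumes the file can be opened by its path from the current directory.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: true if the file at `path` starts with a GIF signature.
bool IsGifFile(const std::string& path)
{
    std::ifstream fin(path, std::ios::in | std::ios::binary);
    if (!fin)
        return false;              // could not open the file

    char header[6] = {};
    fin.read(header, sizeof(header));
    if (fin.gcount() != 6)
        return false;              // file is shorter than the signature

    // Build the string with an explicit length; the buffer has no terminator.
    const std::string id(header, sizeof(header));
    return id == "GIF87a" || id == "GIF89a";
}
```

Inside the directory loop from the question, this could be called as `IsGifFile(a_file.name)`, so each file is opened once and only six bytes are read from it.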
Neil Kirk