0

I have a DLL written in C++ that wraps FindFirstFile/FindNextFile/FindClose to provide a file-search function:

std::vector<std::wstring> ELFindFilesInFolder(std::wstring folder, std::wstring fileMask = TEXT(""), bool fullPath = false);

This function returns a std::vector containing a list of filenames within the given folder matching the given filemask. So far so good; the function works as expected.

I need to write a C wrapper around this library, though, because I can't pass a vector across DLL boundaries. This is leading to no end of headaches.

I initially thought I would just set up a function that would receive a two-dimensional wchar_t array, modify it to contain the filename list, and return it:

bool ELFindFilesInFolder(const wchar_t* folderPath, const wchar_t* fileMask, const bool fullPath, wchar_t* filesBuffer[], size_t* filesBufferSize);

This proved to be a bad idea, however, as at least the second dimension's size has to be known at compile-time. I suppose I could just force the caller to make the second dimension MAX_PATH (so the function would receive a variable-length list of filename buffers, each MAX_PATH long), but this seems messy to me.

I considered a wrapper in the style of the Windows APIs:

bool ELFindNextFileInFolder(const wchar_t* folderPath, const wchar_t* fileMask, const bool fullPath, wchar_t* fileBuffer, size_t* fileBufferSize, HANDLE* searchToken);

This would perform the search, return the first filename found, and save the search handle provided by FindFirstFile. Future calls to ELFindNextFileInFolder would provide this search handle, making it easy to pick up where the last call left off: FindNextFile would just get the saved search handle. However, such handles are required to be closed via FindClose, and C doesn't seem to have the C++ concept of a smart pointer so I can't guarantee the searchToken will ever be closed. I can close some of the HANDLEs myself when FindNextFile indicates there are no more results, but if the caller abandons the search before that point there'll be a floating HANDLE left open. I'd very much like my library to be well-behaved and not leak HANDLEs everywhere, so this is out. I'd also prefer not to provide an ELCloseSearchHandle function, since I'm not sure I can trust callers to use it properly.

Is there a good, preferably single-function way to wrap these Windows APIs, or am I simply going to have to pick one from a list of imperfect solutions?

Community
  • 1
  • 1
cf-
  • 8,598
  • 9
  • 36
  • 58

2 Answers2

2

What about something like this?

In the DLL module:

#include <windows.h>
#include <vector>
#include <unordered_map>

unsigned int global_file_count; //just a counter..
std::unordered_map<unsigned int, std::vector<std::wstring>> global_file_holder; //holds vectors of strings for us.


/** Example file finder C++ code (not exported) **/
std::vector<std::wstring> Find_Files(std::wstring FileName)
{
    std::vector<std::wstring> Result;
    WIN32_FIND_DATAW hFound = {0};
    HANDLE hFile = FindFirstFileW(FileName.c_str(), &hFound);
    if (hFile != INVALID_HANDLE_VALUE)
    {
        do
        {
            Result.emplace_back(hFound.cFileName);
        } while(FindNextFileW(hFile, &hFound));
    }
    FindClose(hFile);
    return Result;
}


/** C Export **/
extern "C" __declspec(dllexport) unsigned int GetFindFiles(const wchar_t* FileName)
{
    global_file_holder.insert(std::make_pair(++global_file_count, Find_Files(FileName)));
    return global_file_count;
}

/** C Export **/
extern "C" __declspec(dllexport) int RemoveFindFiles(unsigned int handle)
{
    auto it = global_file_holder.find(handle);
    if (it != global_file_holder.end())
    {
        global_file_holder.erase(it);
        return 1;
    }
    return 0;
}

/** C Export **/
extern "C" __declspec(dllexport) const wchar_t* File_Get(unsigned int handle, unsigned int index, unsigned int* len)
{
    auto& ref = global_file_holder.find(handle)->second;

    if (ref.size() > index)
    {
        *len = ref[index].size();
        return ref[index].c_str();
    }

    *len = 0;
    return nullptr;
}

/** C Export (really crappy lol.. maybe clear and reset is better) **/
extern "C" __declspec(dllexport) void File_ResetReferenceCount()
{
    global_file_count = 0;
    //global_file_holder.clear();
}

extern "C" __declspec(dllexport) bool __stdcall DllMain(HINSTANCE hinstDLL, DWORD fdwReason, void* lpvReserved)
{
    switch (fdwReason)
    {
        case DLL_PROCESS_ATTACH:
            DisableThreadLibraryCalls(hinstDLL);
            break;

        case DLL_PROCESS_DETACH:
            break;

        case DLL_THREAD_ATTACH:
            break;

        case DLL_THREAD_DETACH:
            break;
    }
    return true;
}

Then in the C code you can use it like:

#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

int main()
{
    HMODULE module = LoadLibrary("CModule.dll");
    if (module)
    {
        unsigned int (__cdecl *GetFindFiles)(const wchar_t* FileName) = (void*)GetProcAddress(module, "GetFindFiles");
        int (__cdecl *RemoveFindFiles)(unsigned int handle) = (void*)GetProcAddress(module, "RemoveFindFiles");
        const wchar_t* (__cdecl *File_Get)(unsigned int handle, unsigned int index, unsigned int* len) = (void*)GetProcAddress(module, "File_Get");
        void (__cdecl *File_ResetReferenceCount)() = (void*)GetProcAddress(module, "File_ResetReferenceCount");


        unsigned int index = 0, len = 0;
        const wchar_t* file_name = NULL;
        unsigned int handle = GetFindFiles(L"C:/Modules/*.dll"); //not an actual handle!

        while((file_name = File_Get(handle, index++, &len)) != NULL)
        {
            if (len)
            {
                wprintf(L"%s\n", file_name);
            }
        }

        RemoveFindFiles(handle); //Optional..
        File_ResetReferenceCount(); //Optional..

        /** The above two functions marked optional only need to be called 
            if you used FindFiles a LOT! Why? Because you'd be having a ton
            of vectors not in use. Not calling it has no "leaks" or "bad side-effects".
            Over time it may. (example is having 500+ (large) vectors of large strings) **/

        FreeLibrary(module);
    }

    return 0;
}

It seems a bit dirty to be honest but I really don't know any "amazing" ways of doing it. This is just the way I do it.. Most of the work is done on the C++ side and you don't really have to worry about leaks.. Even exporting a function to clear the map would be nice too..

It would be better if the C exports were added to a template class and then export each of those.. That would make it re-useable for most C++ containers.. I think..

Brandon
  • 22,723
  • 11
  • 93
  • 186
  • This does not solve the issue of the caller needing to close a handle when the loop is finished. This is exactly one of the scenarios the OP was trying to avoid. – Remy Lebeau Apr 12 '14 at 05:11
  • Close what handle? That isn't an actual handle :S That `handle` in my code is just an int (I couldn't come up with a better name) which is an index in the map of vectors.. It's just used for lookup in the `unordered_map`. – Brandon Apr 12 '14 at 05:13
  • There I added some comments explaining it is not an actual handle and does not need closing or deleting (can optionally call `RemoveFindFiles` with it to manually clean up un-used `vectors`). Also added what parts are "optional" and what parts aren't.. Should be safe for OP. Hopefully. – Brandon Apr 12 '14 at 05:21
  • If the caller doesn't call `RemoveFindFiles()` then the memory is not necessarily "leaked" since the DLL reclaims it at unload, but the vector is still hanging around wasting memory. Until reused. It is the same difference between a memory leak and a handle leak. And this code is not thread safe, either. – Remy Lebeau Apr 12 '14 at 05:38
  • While that is true, and yes it is collected when the dll unloads, would you rather have the caller pass you a pointer, you allocate the memory for each string and at the end still have to remember to free it all? OR would you rather have some sort of safety net.. It's not like there is any amazingly better alternative with RAII. If you really need thread safe, what's stopping you from adding an `std::lock_guard` to search, add, and delete. You're finding things to complain about. It is `C`.. OP only has to remember to call remove if there is either a LOT of vectors or if the no longer need one – Brandon Apr 12 '14 at 05:44
  • If there really is a better alternative I'd like to know myself. For personal use as well. Anyway, I see your point, I really do, but I personally cannot think of anything else that is far better and safer atm :l Oh.. Did not see you answered OP as well.. :o Your last technique is similar to mine if not the exact same.. – Brandon Apr 12 '14 at 05:52
1

Change wchar_t* filesBuffer[] to wchar_t** *filesBuffer, then the caller can pass in a pointer to a wchar_t** variable to receive the array and does not need to know anything about any bounds at compile time. As for the array itself, the DLL can allocate a one-dimensional array of wchar_t* pointers that point to null-terminated strings. That way, your size_t* filesBufferSize parameter is still relevant - it receives the number of strings in the array.

bool ELFindFilesInFolder(const wchar_t* folderPath, const wchar_t* fileMask, const bool fullPath, wchar_t** *filesBuffer, size_t* filesBufferSize);

wchar_t **files;
size_t numFiles;
if (ELFindFilesInFolder(..., &files, &numFiles))
{
    for(size_t i = 0; i < numFiles; ++i)
    {
        // use files[i] as needed ...
    }
    // pass files back to DLL to be freed ...
}

Another option is to do something similar to WM_DROPFILES does. Have ELFindFilesInFolder() return an opaque pointer to an internal list, and then expose a separate function that can retrieve a filename at a given index within that list.

bool ELFindFilesInFolder(const wchar_t* folderPath, const wchar_t* fileMask, const bool fullPath, void** filesBuffer, size_t* filesBufferSize);

bool ELGetFile(const wchar_t* fileName, size_t fileNameSize, void* filesBuffer, size_t fileIndex);

void *files;
size_t numFiles;
wchar_t fileName[MAX_PATH + 1];
if (ELFindFilesInFolder(..., &files, &numFiles))
{
    for(size_t i = 0; i < numFiles; ++i)
    {
        ELGetFile(fileName, MAX_PATH, files, i);
        // use fileName as needed ...
    }
    // pass files back to DLL to be freed ...
}

Any way you do it, the DLL has to manage the memory, so you have to pass some kind of state info to the caller and then have that passed back to the DLL for freeing. There is no many ways around that in C, unless the DLL keeps track of the state info internally (but then you have to worry about thread safety, reentrancy, etc) and frees it after the last file is retrieved. But that would require the caller to reach the last file, whereas the other approaches allow the caller to finish earlier if desired.

Remy Lebeau
  • 555,201
  • 31
  • 458
  • 770