I have a legacy library function that accepts a FILE* pointer. The content I would like to parse is actually in memory, not on disk.

So I came up with the following steps to work around this issue:

  • the data is in memory at this point
  • fopen a temporary file (using tmpnam or tmpfile) on disk for writing
  • fclose the file
  • fopen the same file again for reading - guaranteed to exist
  • change the buffer using setvbuf(buffer, size)
  • do the legacy FILE* stuff
  • close the file
  • remove the temporary file
  • the data can be discarded

On Windows, it looks like this:

int bufferSize;
char buffer[bufferSize];

// set up the buffer here

// temporary file name
char tempName [L_tmpnam_s];
tmpnam_s(tempName, L_tmpnam_s);

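// create the (empty) file, close it, then open it again for reading
FILE* fp = NULL;
fopen_s(&fp, tempName, "wb");
fclose(fp);
fopen_s(&fp, tempName, "rb");   // fp was closed above, so open anew instead of freopen_s

// replace the internal buffer
setvbuf(fp, buffer, _IONBF, bufferSize);
// not portable: these are internal fields of the MSVC FILE struct
fp->_ptr = buffer;
fp->_cnt = bufferSize;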

// do the FILE* reading here

// close and remove tmp file
fclose(fp);
remove(tempName);

It works, but it is quite cumbersome. The main problems, aside from the backwardness of this approach, are:

  • the temporary name needs to be determined
  • the temporary file is actually written to disk
  • the temporary file needs to be removed afterwards

I'd like to keep things portable, so using Windows memory-mapping functions or Boost's facilities is not an option. The problem is mainly that, while it is possible to convert a FILE* to a std::fstream, the reverse seems to be impossible, or at least not supported on C++99.

All suggestions welcome!

Update 1

Using a pipe/fdopen/setvbuf as suggested by Speed8ump, plus a bit of twiddling, seems to work. It no longer creates files on disk, nor does it consume extra memory. One step closer, except that, for some reason, setvbuf is not working as expected. Fixing it up manually is possible, but of course not portable.

// create a pipe for reading, do not allocate memory
int pipefd[2];
_pipe(pipefd, 0, _O_RDONLY | _O_BINARY);

// open the read pipe for binary reading as a file
fp = _fdopen(pipefd[0], "rb");

// try to switch the buffer ptr and size to our buffer, (no buffering)
setvbuf(fp, buffer, _IONBF, bufferSize);

// for some reason, setvbuf does not set the correct ptr/sizes      
fp->_ptr = buffer;
fp->_charbuf = fp->_bufsiz = fp->_cnt = bufferSize;

Update 2

Wow. So it seems that unless I dive into the MS-specific implementation CreateNamedPipe / CreateFileMapping, POSIX portability costs us an entire memcopy (of any size!), be it to file or into a pipe. Hopefully the compiler understands that this is just a temporary and optimizes this. Hopefully.

Still, we eliminated the silly device writing intermediate. Yay!

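int pipefd[2];
// Windows CRT call; the second argument sets the internal buffer size.
// (POSIX pipe() takes only pipefd, and fdopen() replaces _fdopen().)
_pipe(pipefd, bufferSize, _O_BINARY);

FILE* in  = _fdopen(pipefd[0], "rb");
FILE* out = _fdopen(pipefd[1], "wb");

// the actual copy
fwrite(buffer, 1, bufferSize, out);
fclose(out);

// fread(in), etc. (pipes are generally not seekable, so fseek(in) is likely to fail)

fclose(in);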
StarShine
  • For the first item, a GUID can serve as a temporary file name. – PaulMcKenzie Mar 02 '15 at 10:29
  • possible duplicate of [C - create file in memory](http://stackoverflow.com/questions/12249610/c-create-file-in-memory) – Ôrel Mar 02 '15 at 11:17
  • You're looking for a C++98 solution? Or for a C solution? C99? (Which MSVC doesn't support, but there's Cygwin and MinGW, though I'm not sure how complete their library support is.) And `tmpnam` is always a potential security risk (`tmpfile` is better). – mafso Mar 02 '15 at 11:50
  • @mafso On Windows it's tmpfile_s (or tmpnam_s); is that also a security risk? Thanks for the hint! – StarShine Mar 02 '15 at 15:26
  • @PaulMcKenzie, how do you get this working? For an "rb" fopen, the file must exist. – StarShine Mar 02 '15 at 15:27
  • @Orel: this is not a duplicate, since the goal is not to create a file in memory, rather, read from an already created buffer through the FILE* api. – StarShine Mar 02 '15 at 15:51
  • @StarShine "the temporary name needs to be determined": I'm referring to your first bullet point. To virtually guarantee a unique name, call a function that returns a GUID and use that as part of the name. – PaulMcKenzie Mar 02 '15 at 17:02
  • @StarShine: `tmpnam` is unsafe because an attacker could guess the file name and generate a link to it just before the `fopen` call. Depending on the use case, this may or may not be an issue, but you have no reason to use `tmpnam` in the first place, you're not interested in a file name at all. A `tmpfile` + `fwrite` + `fseek` probably does what you want, and (I'm not sure on this, but I think) your OS maybe doesn't write anything to disk physically. `tmpfile_s` is non-portable Microsoft bs (with no enhancements), `tmpfile` is fine. – mafso Mar 02 '15 at 17:20
  • Maybe open argv[0]. It should exist. – sp2danny Mar 02 '15 at 17:51
  • @StarShine Why write it to the disk? You will be slowed down by the disk speed. This will add complexity for no gain. – Ôrel Mar 02 '15 at 22:22
  • @Orel That is the whole point of this discussion. I don't want to write a temp file at all. I just need a FILE* that works on a pre-existing buffer. However, the POSIX C functions expect the file to exist when you specify the read flag "r" in fopen or any of its variants. – StarShine Mar 02 '15 at 22:39
  • So why is using `CreateFile` with `FILE_ATTRIBUTE_TEMPORARY` not good? – Ôrel Mar 02 '15 at 22:48
  • @Orel it's Windows-only. I need something that is portable. – StarShine Mar 02 '15 at 22:49
  • You will have to implement it for each system, but it is quite easy on Linux. For other flavors you can write to the disk as the default behavior. This will be portable and optimized for Linux and Windows. – Ôrel Mar 03 '15 at 00:09

2 Answers


You might try using a pipe and fdopen, that seems to be portable, is in-memory, and you might still be able to do the setvbuf trick you are using.
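
A minimal POSIX sketch of the pipe plus fdopen idea, for illustration only. The helper name `pipe_from_memory` is hypothetical, and the sketch assumes the data fits in the pipe's kernel buffer. The write end is made non-blocking so that an oversized buffer makes the helper return NULL instead of blocking forever with no reader. Note that the data is still copied once, and that the resulting stream is not seekable.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fdopen() in <stdio.h> */
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): pushes `size` bytes
   through an anonymous pipe and returns the read end as a FILE*.
   Returns NULL on failure, including when the data does not fit. */
FILE *pipe_from_memory(const void *data, size_t size)
{
    int fds[2];
    if (pipe(fds) != 0)
        return NULL;

    /* Non-blocking write end: a buffer larger than the pipe capacity
       yields a short write or an error instead of a deadlock. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return NULL;
    }

    ssize_t written = write(fds[1], data, size);
    close(fds[1]);  /* the reader sees EOF once the data is consumed */
    if (written < 0 || (size_t)written != size) {
        close(fds[0]);
        return NULL;
    }

    FILE *fp = fdopen(fds[0], "rb");
    if (fp == NULL)
        close(fds[0]);
    return fp;
}
```

The returned stream can be handed to code that only reads sequentially (fread, fgetc, fgets); anything that calls fseek or rewind on it needs a different approach.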

Speed8ump
  • It's a great idea, but I've had it crash consistently when the file does not exist. – StarShine Mar 02 '15 at 15:46
  • I'm unclear where you are having a crash? Are you saying that the legacy code is crashing when there's not an actual file behind the FILE* object? If that's the case then that code is somehow interacting with the filesystem layers, meaning that no solution not involving the file system will work. At that point you are probably stuck doing some temp file method, fastest would probably involve a ram disk of some sort. – Speed8ump Mar 02 '15 at 17:10
  • Yes, replacing the legacy reading code with simple freads works fine, so I guess there's more complex interaction going on. I'll do some more testing to see what's breaking it, and then accept your answer after I've updated my post. – StarShine Mar 02 '15 at 23:32
  • Hmm, if I force the FILE* fp descriptor values myself, it suddenly started working! – StarShine Mar 03 '15 at 00:03

Your setvbuf hack is a nice idea, but not portable. C11 (n1570):

7.21.5.6 The setvbuf function

Synopsis

#include <stdio.h>
int setvbuf(FILE * restrict stream,
            char * restrict buf,
            int mode, size_t size);

Description

[...] If buf is not a null pointer, the array it points to may be used instead of a buffer allocated by the setvbuf function [...] and the argument size specifies the size of the array; otherwise, size may determine the size of a buffer allocated by the setvbuf function. The contents of the array at any time are indeterminate.

There is neither a guarantee that the provided buffer is used at all, nor about what it contains at any point after the setvbuf call until the file is closed or setvbuf is called again (POSIX doesn't give more guarantees).

The easiest portable solution, I think, is to use tmpfile, fwrite the data into that file, fseek back to the beginning, and pass the FILE pointer to the function. (I'm not sure whether temporary files are guaranteed to be seekable; on my Linux system it appears they are, and I'd expect the same elsewhere.) This still requires a copy in memory, but I guess the data usually won't be written to disk (POSIX, unfortunately, implicitly requires a real file to exist). A file obtained from tmpfile is deleted when it is closed.
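
The tmpfile approach could be sketched roughly as follows, using only standard C. The helper name `file_from_memory` is made up for this illustration; it is a sketch, not a tested drop-in for any particular legacy library.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the original post): copies `size` bytes
   from `data` into an anonymous temporary file and returns the stream
   positioned at the start, ready for reading. Returns NULL on failure.
   The caller calls fclose(), which also removes the temporary file. */
FILE *file_from_memory(const void *data, size_t size)
{
    FILE *fp = tmpfile();  /* opened for update, as if with "wb+" */
    if (fp == NULL)
        return NULL;

    if (size > 0 && fwrite(data, 1, size, fp) != size) {
        fclose(fp);
        return NULL;
    }

    /* Flush and reposition before switching from writing to reading. */
    if (fflush(fp) != 0 || fseek(fp, 0L, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Usage would be along the lines of `FILE *fp = file_from_memory(buffer, bufferSize);`, followed by the legacy call and then `fclose(fp)`.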

mafso
  • On my system, it does tend to write out temp files of 0 bytes in the root dir. If I have to do a memcopy anyway, I could just as well fwrite it into an in/out pipe and forget about (tmp) file handling completely. It's just so silly that the spec leaves things up in the air, while it could be really useful. – StarShine Mar 03 '15 at 12:53
  • Well, the C standard is intended to be minimal. To get a somewhat more "usable" C interface, there is the POSIX standard (standardizing `fmemopen`, for example), where there is one vendor contradicting and sabotaging it wherever possible. – mafso Mar 03 '15 at 13:02
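
For platforms that do provide POSIX.1-2008, the `fmemopen` function mentioned in the last comment opens a stream directly over an existing buffer, with no copy and no temporary file. A minimal sketch follows; the wrapper name `open_memory_for_reading` is hypothetical, and the approach is not available in the Microsoft C runtime.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fmemopen() in <stdio.h> */
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: opens a read-only stream over `data`.
   No copy is made, so `data` must stay valid until fclose().
   Some implementations reject size == 0, in which case NULL is returned. */
FILE *open_memory_for_reading(const void *data, size_t size)
{
    /* fmemopen() takes a non-const pointer; in "r" mode it only reads. */
    return fmemopen((void *)data, size, "r");
}
```

The resulting stream supports fread and fseek, so it can be passed to a function that expects an ordinary FILE* opened for reading.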