7

I need to find a way to store 250 KB of plain text numbers inside my program's executable file.

Usually, I would put the data in a separate file and let the program read it while it is running, but that's not an option here. Instead, the program and the data need to be in one executable file.

I have absolutely no idea how to do it (except writing 250,000 #defines :-)) and I'd appreciate any suggestions.

Brian Tompsett - 汤莱恩
michael
  • 2
    What platform? Could you use Win32 resources, for example? – Rowland Shaw Apr 11 '10 at 18:39
  • 1
    Similar question (shameless plug): http://stackoverflow.com/questions/2481998/how-do-i-include-extremely-long-literals-in-c-source – Billy ONeal Apr 11 '10 at 18:47
  • 1
    For all those who suggested using an array: I thought about that too, but got discouraged by the long compile time. I guess it was the right approach after all. Thank you! – michael Apr 11 '10 at 19:37
  • 1
    Just put your array in its own source file, and if you're using a sane build system, you should only have to compile it once (until you do a full rebuild). – Tyler McHenry Apr 11 '10 at 19:56
  • How can I store the data inside the executable without hard-coding it? The data I'm trying to store is input from the user. – Serket Apr 30 '20 at 21:33

11 Answers

8

How about an array of some sort? Just put the definition in a file and compile it into your program:

int external_data[] =
{
    ...
};

You can have the compiler tell you how many elements are in external_data:

/* Note: despite its name, this holds the element count, i.e. one past the last valid index. */
size_t external_data_max_idx = sizeof(external_data) / sizeof(*external_data);
R Samuel Klatchko
  • +1 for suggesting the definition in a *separate* file. I'm currently using this technique, and when the data changes I only have to change that file and rebuild. – Thomas Matthews Apr 12 '10 at 17:26
5

You could just generate an array definition. For example, suppose you have numbers.txt:

$ head -5 numbers.txt
0.99043748698114
0.0243802034269436
0.887296518349228
0.0644020236531517
0.474582201929554

I've generated it for the example using:

$ perl -E'say rand() for (1..250_000)' >numbers.txt

Then, to convert it to a C array definition, you could use a script:

$ perl -lpE'BEGIN{ say "double data[] = {"; }; 
>     END{ say "};" }; 
>     s/$/,/' > data.h < numbers.txt 

It produces:

$ head -5 data.h
double data[] = {
0.99043748698114,
0.0243802034269436,
0.887296518349228,
0.0644020236531517,

$ tail -5 data.h
0.697015237317363,
0.642250552146166,
0.00577098769553785,
0.249176256744811,
};

It could be used in your program as follows:

#include <stdio.h>    
#include "data.h"

int main(void) {
  // print first and last numbers
  printf("%g %g\n", data[0], data[sizeof(data)/sizeof(*data)-1]);
  return 0;
}

Run it:

$ gcc *.c && ./a.out
0.990437 0.249176
jfs
4

Store it as a const array:

/* Maximum number of digits in a number, adjust as necessary */
#define NUMBER_MAX_LENGTH 16

/* How many numbers you have (in this case 250K), adjust as necessary */
#define NUMBER_OF_NUMBERS (250 * (1 << 10))

const char data[NUMBER_OF_NUMBERS][NUMBER_MAX_LENGTH+1] =
 { "12345", "2342841", "129131", "18317", /* etc */ };

Presumably you know your data set so you can come up with the appropriate value for NUMBER_MAX_LENGTH in your case.

You can also of course write a script that transforms a flat file of numbers into this format. If you want, you could even keep the numbers in a plain-text data file and have the script generate the corresponding C code as above during your build.

I wrote it that way because you said "plain text numbers", indicating that you need them as strings for some reason. If you'd rather have them as integers, it's even simpler:

/* How many numbers you have (in this case 250K), adjust as necessary */
#define NUMBER_OF_NUMBERS (250 * (1 << 10))

const int data[NUMBER_OF_NUMBERS] =
 { 12345, 2342841, 129131, 18317, /* etc */ };

This assumes that none of your numbers is too large to store in an int.

Tyler McHenry
4

You can use the xxd command with the -i option to convert any file into an unsigned char array definition in C. If you are on Windows, you can look into using it under Cygwin.

John Carter
epatel
2

Let's assume the numbers are constants. Let's assume that you can compute this list once, in a "pre-compilation" stage. Let's assume that there is a function that can "return" that list.

Stage one: write an application that calls getFooNumber() and works perfectly. Nice.

Stage two: take that function and put it in another project. Now, let's write a small application that will generate the 250,000 entries as C code.

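#include <stdio.h>

/* Number of table entries (250,000, as in the question) */
#define MAX_BLABLA 250000

/* Defined in the stage-one project; link it into this generator. */
int getFooNumber(long i);

int main(void)
{
  FILE *f = fopen("fooLookupTable.h", "w");
  long i;

  if (f == NULL)
    return 1;

  fprintf(f, "#ifndef FOO_HEADER\n");
  fprintf(f, "#define FOO_HEADER\n\n");

  /* int matches the %d conversion below; use a narrower type if the values allow it */
  fprintf(f, "static const int blabla[] = {\n\t");
  for (i = 0; i < MAX_BLABLA; i++)
  {
    fprintf(f, "%d", getFooNumber(i));
    if (i + 1 != MAX_BLABLA)
      fprintf(f, ",");
    if ((i + 1) % 10 == 0)   /* ten values per line */
      fprintf(f, "\n\t");
  }
  fprintf(f, "};\n\n");
  fprintf(f, "#endif /* FOO_HEADER */\n");
  fclose(f);
  return 0;
}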

This will create the list Billy ONeal talked about.

Stage 3: Use the header file you just created in stage 2 inside the first project, so that the new getFooNumber() returns its value from the lookup table.

Stage 4: Learn to use Qt, and understand that you can embed the file directly and load it using QFile(":application/numberz.txt").

Notes: * The C code is probably broken. I did not test it. * If you are using Windows or Mac, you can probably do something similar with the resource system (the Mac has a similar thing, no?)

elcuco
1

I agree with the previous answers: the best way is simply to store the data in the code and compile it into the program. For the sake of argument, you could also study the executable file format, add some data or code to the file (this is how a lot of viruses work), and then read the data back from the executable at run time. http://refspecs.freestandards.org/elf/elf.pdf describes the ELF executable format. Once again, this is for the sake of argument and is not recommended.

Romain Hippeau
0

Just compile a string of however many characters you need into your executable, and include a unique marker in it so that it can be located in the binary. Then have another part of the program open its own executable as a file, read the bytes, find that string, and alter it directly. You might need a second program for this: the original launches it and shuts down, the helper writes the data into the original executable and re-executes it, and on restart the original reads the newly written values from the string compiled into its binary and uses them for whatever tasks it has.

0

It sounds like you're trying to avoid putting it in a source file, but that's exactly what I'd do:

int numbers[250000] = {1, 2, ...};

It's technically possible to keep them as a plain file and write a linker directive file that creates a new data section of the proper size and combines them, but there's really no reason to. Put that definition in a separate file and #include it into the file that needs it.

Michael Mrozek
  • 1
    Note you could just make that "numbers[]" and the compiler will count for you. – Billy ONeal Apr 11 '10 at 18:42
  • Yeah, but if I know how many things are supposed to be in the array I like to include it, both so people looking at it know the size immediately, and so if I mess up and forget/duplicate one I'll get a compile-time error – Michael Mrozek Apr 11 '10 at 19:44
0

You could adapt this solution to numbers:

static const wchar_t *systemList[] = {
    L"actskin4.ocx",
    L"advpack.dll",
    L"asuninst.exe",
    L"aswBoot.exe",
    L"AvastSS.scr",
    L"avsda.dll",
    L"bassmod.dll",
    L"browseui.dll",
    L"CanonIJ Uninstaller Information",
    L"capicom.dll",
    L"cdfview.dll",
    L"cdm.dll",
    L"d3dx9_24.dll",
    L"d3dx9_25.dll",
    L"d3dx9_27.dll",
    L"d3dx9_28.dll",
    L"d3dx9_29.dll",
    L"d3dx9_30.dll",
    L"danim.dll",
    L"dfrgntfs.exe",
    L"dhcpcsvc.dll",
    L"dllhost.exe",
    L"dnsapi.dll",
    L"drivers\\aavmker4.sys",
    L"drivers\\apt.sys",
    L"drivers\\aswFsBlk.sys",
    L"drivers\\aswmon.sys",
    L"drivers\\aswmon2.sys",
    L"drivers\\aswRdr.sys",
    L"drivers\\aswSP.sys",
    L"drivers\\aswTdi.sys",
    L"drivers\\avg7core.sys",
    L"drivers\\avg7rsw.sys",
    L"drivers\\avg7rsxp.sys",
    L"drivers\\avgclean.sys",
    L"drivers\\avgmfx86.sys",
    L"drivers\\avgntdd.sys",
    L"drivers\\avgntmgr.sys",
    L"drivers\\avgtdi.sys",
    L"drivers\\avipbb.sys",
    L"drivers\\cmdmon.sys",
    L"drivers\\gmer.sys",
    L"drivers\\inspect.sys",
    L"drivers\\klick.sys",
    L"drivers\\klif.sys",
    L"drivers\\klin.sys",
    L"drivers\\pxcom.sys",
    L"drivers\\pxemu.sys",
    L"drivers\\pxfsf.sys",
    L"drivers\\pxrd.sys",
    L"drivers\\pxscrmbl.sys",
    L"drivers\\pxtdi.sys",
    L"drivers\\rrspy.sys",
    L"drivers\\rrspy64.sys",
    L"drivers\\ssmdrv.sys",
    L"drivers\\UMDF",
    L"drivers\\USBSTOR.SYS",
    L"DRVSTORE",
    L"dxtmsft.dll",
    L"dxtrans.dll",
    L"en-us",
    L"extmgr.dll",
    L"fntcache.dat",
    L"hal.dll",
    L"icardie.dll",
    L"ie4uinit.exe",
    L"ieakeng.dll",
    L"ieaksie.dll",
    L"ieakui.dll",
    L"ieapfltr.dat",
    L"ieapfltr.dll",
    L"iedkcs32.dll",
    L"ieframe.dll",
    L"iepeers.dll",
    L"iernonce.dll",
    L"iertutil.dll",
    L"ieudinit.exe",
    L"ieui.dll",
    L"imon1.dat",
    L"inseng.dll",
    L"iphlpapi.dll",
    L"java.exe",
    L"javaw.exe",
    L"javaws.exe",
    L"jgdw400.dll",
    L"jgpl400.dll",
    L"jscript.dll",
    L"jsproxy.dll",
    L"kbdaze.dll",
    L"kbdblr.dll",
    L"kbdbu.dll",
    L"kbdkaz.dll",
    L"kbdru.dll",
    L"kbdru1.dll",
    L"kbdtat.dll",
    L"kbdur.dll",
    L"kbduzb.dll",
    L"kbdycc.dll",
    L"kernel32.dll",
    L"legitcheckcontrol.dll",
    L"libeay32_0.9.6l.dll",
    L"Macromed",
    L"mapi32.dll",
    L"mrt.exe",
    L"msfeeds.dll",
    L"msfeedsbs.dll",
    L"msfeedssync.exe",
    L"msftedit.dll",
    L"mshtml.dll",
    L"mshtmled.dll",
    L"msrating.dll",
    L"mstime.dll",
    L"netapi32.dll",
    L"occache.dll",
    L"perfc009.dat",
    L"perfh009.dat",
    L"pncrt.dll",
    L"pndx5016.dll",
    L"pndx5032.dll",
    L"pngfilt.dll",
    L"px.dll",
    L"pxcpya64.exe",
    L"pxdrv.dll",
    L"pxhpinst.exe",
    L"pxinsa64.exe",
    L"pxinst.dll",
    L"pxmas.dll",
    L"pxsfs.dll",
    L"pxwave.dll",
    L"rasadhlp.dll",
    L"rasmans.dll",
    L"riched20.dll",
    L"rmoc3260.dll",
    L"rrsec.dll",
    L"rrsec2k.exe",
    L"shdocvw.dll",
    L"shell32.dll",
    L"shlwapi.dll",
    L"shsvcs.dll",
    L"sp2res.dll",
    L"spmsg.dll",
    L"ssiefr.EXE",
    L"STKIT432.DLL",
    L"streamhlp.dll",
    L"SWSC.exe",
    L"tzchange.exe",
    L"url.dll",
    L"urlmon.dll",
    L"vsdata.dll",
    L"vsdatant.sys",
    L"vsinit.dll",
    L"vsmonapi.dll",
    L"vspubapi.dll",
    L"vsregexp.dll",
    L"vsutil.dll",
    L"vswmi.dll",
    L"vsxml.dll",
    L"vxblock.dll",
    L"webcheck.dll",
    L"WgaLogon.dll",
    L"wgatray.exe",
    L"wiaservc.dll",
    L"windowspowershell",
    L"winfxdocobj.exe",
    L"wmp.dll",
    L"wmvcore.dll",
    L"WREGS.EXE",
    L"WRLogonNtf.dll",
    L"wrlzma.dll",
    L"wuapi.dll",
    L"wuauclt.exe",
    L"wuaueng.dll",
    L"wucltui.dll",
    L"wups.dll",
    L"wups2.dll",
    L"wuweb.dll",
    L"x3daudio1_0.dll",
    L"xactengine2_0.dll",
    L"xactengine2_1.dll",
    L"xactengine2_2.dll",
    L"xinput1_1.dll",
    L"xinput9_1_0.dll",
    L"xmllite.dll",
    L"xpsp3res.dll",
    L"zlcomm.dll",
    L"zlcommdb.dll",
    L"ZPORT4AS.dll"
};
Community
Billy ONeal
0

What platform are you running on? If you are on Windows and the numbers won't change over time, then just add your text file to the program's resources using the resource compiler, and read it from your code.

Eugene Mayevski 'Callback
0

This is not a solution in itself (that has already been given), but a warning: don't put the data in a header file. Write a header that declares a function returning a pointer to the array, then implement that function in a .c file. Otherwise, you will end up in a compilation mess...

Markus Pilman