We need to store some data as a C++ header file so that it can be included in the build and shipped with any application that uses it.

To do that we use

xxd -i data.png > data.h

This works well, but data.h is now about 6x as large as data.png. That means that if data.png is 4MB, data.h would be 24MB.
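The 6x factor comes from the text encoding: each input byte is written as a hex literal plus a separator, for example `0x89, `, which is six characters. A minimal sketch of that arithmetic follows; `to_hex_initializer` is a hypothetical helper written for illustration, not part of xxd, and it ignores the indentation and line breaks that xxd also emits.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: render bytes as the body of an array initializer,
// e.g. "0x89, 0x50, ". Each byte costs 6 characters of source text.
std::string to_hex_initializer(const std::vector<unsigned char>& bytes) {
    std::string out;
    char buf[8];  // "0xNN, " is 6 characters plus the terminating NUL
    for (unsigned char b : bytes) {
        std::snprintf(buf, sizeof buf, "0x%02x, ", b);
        out += buf;
    }
    return out;
}
```

Calling `to_hex_initializer` on 1000 bytes yields 6000 characters. This only describes the size of the generated source text; once compiled, the array itself stores each byte once.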

May I ask if there is a way to compress the data.h file to a smaller size?

Thanks!

--- update ---

Thank you all for the suggestions! Let me clarify the need here and provide more context.

  1. The ideal way for us to consume the file is to open it as an input stream, like

std::ifstream is;
is.open("data.png");
somefunc(is); // an API function that takes std::istream as input

P.S. The file is not a PNG file but a scripted model. I use PNG as the example because I see this as a more general problem with "xxd -i".

  2. We didn't find a way to make it available as a file to be read, because the file system the code actually searches would be on Android/iOS. (Only files on the mobile system are available, and the source code would be bundled into the .so file.)

  3. With the header file we can do something like

std::stringstream is;
is.write(reinterpret_cast<const char*>(data_byte_array), data_byte_array_len);
somefunc(is);

The source code ends up built as a lib.so. In our tests, a 70KB data.h ended up adding 45KB to the lib.so.
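The stringstream version copies the whole array into the stream's own buffer. One possible alternative, sketched here under assumptions, is a read-only stream buffer that points directly at the embedded array. `MemoryBuf` and `read_all` are hypothetical names introduced for illustration, and the small array stands in for what xxd -i would generate.

```cpp
#include <cstddef>
#include <ios>
#include <istream>
#include <iterator>
#include <streambuf>
#include <string>

// Hypothetical read-only stream buffer over an existing byte array (no copy).
class MemoryBuf : public std::streambuf {
public:
    MemoryBuf(const unsigned char* data, std::size_t len) {
        // setg takes non-const char*; this class only sets the get area and
        // never overrides the output functions, so it does not write through it.
        char* p = const_cast<char*>(reinterpret_cast<const char*>(data));
        setg(p, p, p + len);
    }

protected:
    // Support tellg/seekg in case the consuming API seeks in the stream.
    pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                     std::ios_base::openmode which = std::ios_base::in) override {
        if (!(which & std::ios_base::in)) return pos_type(off_type(-1));
        off_type base = 0;
        if (dir == std::ios_base::cur) base = gptr() - eback();
        else if (dir == std::ios_base::end) base = egptr() - eback();
        const off_type target = base + off;
        if (target < 0 || target > egptr() - eback()) return pos_type(off_type(-1));
        setg(eback(), eback() + target, egptr());
        return pos_type(target);
    }

    pos_type seekpos(pos_type pos,
                     std::ios_base::openmode which = std::ios_base::in) override {
        return seekoff(off_type(pos), std::ios_base::beg, which);
    }
};

// Stand-in for the array and length that xxd -i generates (made-up contents).
static const unsigned char data_byte_array[] = {'h', 'e', 'l', 'l', 'o'};
static const unsigned int data_byte_array_len = sizeof data_byte_array;

// Hypothetical stand-in for an API that takes std::istream as input.
std::string read_all(std::istream& is) {
    return std::string(std::istreambuf_iterator<char>(is),
                       std::istreambuf_iterator<char>());
}
```

Usage would look like `MemoryBuf buf(data_byte_array, data_byte_array_len); std::istream is(&buf); somefunc(is);`. The buffer must outlive the stream, which holds naturally for a static array.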

  • Are you concerned about the size of the executable, or specifically about the size of your source code files? Generated source code can often be large, but is it causing a measurable problem? – Drew Dormann Feb 23 '22 at 21:18
  • Why in the world would you put that into a header file? I could imagine it being exported from a binary or even included as a file to map it into memory. In any case, yes, you can probably put a compressed representation in there. Just gzip it or something like that. – Ulrich Eckhardt Feb 23 '22 at 21:18
  • Have you tried `gzip data.h`? Or perhaps just ship `data.png` and let your users run `xxd` themselves. – n. m. could be an AI Feb 23 '22 at 21:21
  • In the embedded world, a common practice is to place the data into a source file, not a header file. Thus if the data doesn't change, the file will only be compiled once. The header may contain an `extern` reference to the data array. – Thomas Matthews Feb 23 '22 at 21:22
  • With base64 you'll have just a 33% increase in size, I sometimes use that when I want to store some binary in my sources. Encoding and decoding routines are quite trivial. – MatG Feb 23 '22 at 21:34
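To illustrate the base64 suggestion in the last comment, here is a minimal decoder sketch. It assumes the standard alphabet with `=` padding and silently skips characters outside the alphabet (such as newlines); `base64_decode` is a hypothetical name, not taken from any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode standard base64 text into raw bytes.
std::vector<unsigned char> base64_decode(const std::string& text) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::vector<unsigned char> out;
    std::uint32_t acc = 0;  // bit accumulator; only the low bits are used
    int bits = 0;           // number of valid bits not yet emitted
    for (char c : text) {
        if (c == '=') break;                    // padding marks the end
        const std::size_t v = alphabet.find(c);
        if (v == std::string::npos) continue;   // skip whitespace and newlines
        acc = (acc << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<unsigned char>((acc >> bits) & 0xFF));
        }
    }
    return out;
}
```

For example, `base64_decode("aGVsbG8=")` returns the five bytes of `hello`. This shrinks the source text relative to hex literals, but the string stored in the binary is still about a third larger than the raw data and has to be decoded at run time.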

2 Answers

May I ask if there is a way to compress the data.h file to a smaller size?

You can use any lossless compression algorithm. The gzip program is a common default choice on POSIX systems.

eerorika
  • Thanks for the answer! This won't work because gzip would just be compressing the file again; you can think of data.png as already compressed (it is actually a binary model file). Also see my updated question. – Yinglao Liu Feb 23 '22 at 23:23

You can compile any binary file into an object file using objcopy -I binary. See "C/C++ with GCC: Statically add resource files to executable/library" for more details.

doron
  • That's new to me. Is that available on Mac? I tried brew install but couldn't find it lol – Yinglao Liu Feb 23 '22 at 23:43
  • Should be. objcopy is part of the GNU binutils package that normally comes with gcc. – doron Feb 24 '22 at 08:28
  • This approach sounds interesting! On Mac, we can install a tool called gobjcopy, which is said to be the Mac version of objcopy, and the resulting data.o is of similar size! But is there an easy way to copy data.o into the build (I am using CMake), and how can we import the data in our code? – Yinglao Liu Feb 25 '22 at 23:40