
I have a C program, last compiled in 1990, that reads and writes some binary files. The old executable still works and reads and writes them perfectly. I need to recompile the source, add some features, and then use the new build to read in some of the old data and write it back out with additional information.

When I recompile the code with no changes and run it, it fails to read the old files and gives segmentation faults when I try to process the data read into memory. I believe the problem may be that the binary files written earlier used 4-byte integers (with 8-bit bytes), 8-byte longs, and 4-byte floats, while the architecture on my machine now uses 64-bit words instead of 32-bit ones. As a result, when I extract an integer from the data read in, it is aligned incorrectly and produces an array index that is out of range for the program space.

I am on Mac OS X 10.12.6, using its C compiler, which appears to be:

Apple LLVM version 8.0.0 (clang-800.0.33.1)
Target: x86_64-apple-darwin16.7.0

Is there a compiler switch that would set the compiled lengths of integers and floats to the above values? If not, how do I approach getting the code to correctly read the data?

chqrlie
John Wooten
  • Change your code to use fixed width types. `int32_t` and such. https://stackoverflow.com/questions/14515874/difference-between-int32-int-int32-t-int8-and-int8-t – Retired Ninja Aug 19 '17 at 14:47
  • There probably weren't any 8-byte longs back in 1990. Not on architectures that are still in widespread use today at any rate. OTOH 8 byte longs is the norm today. What is your OS and architecture? You may want to further research and validate the binary format used by the program. – n. m. could be an AI Aug 19 '17 at 14:48
  • this question intrigues me... any idea what OS and compiler was used to compile it back then? And what compiler switches were used? – galdin Aug 19 '17 at 14:53
  • If you’re getting segfaults, another assumption to look out for is what integral type is the same width as a pointer. In addition to fixed-width types, portable code ought to use `uintptr_t`, `size_t`, `ptrdiff_t` and `off_t` where appropriate. As an alternative to making thousands of those changes, are you able to recompile as a 32-bit executable? – Davislor Aug 19 '17 at 16:24
  • How / where do you have that executable?! – Antti Haapala -- Слава Україні Aug 19 '17 at 20:27
  • I believe it was compiled on an early NeXT using a 68030 CPU. I'm not sure, so I am checking this out. The executable was saved in a CVS repository with all other development code from around that same time. The repository has been moved from machine to machine using tar files and now resides on a Mac mini I use as a server. The data files were also in CVS. The points made here about pointer lengths, etc. are all worth looking into. I've noticed many complaints about the code, i.e. warnings, that weren't present when it was originally compiled. – John Wooten Aug 20 '17 at 11:42

1 Answer


Welcome to the world of portability headaches!

If your program was compiled in 1990, there is a good chance it uses 4-byte longs, and it is even possible that it uses 2-byte ints, depending on the architecture it was compiled for.

The sizes of the basic C types are heavily system dependent, and that is only one of a number of more subtle portability issues. `long` is now 64-bit on both 64-bit Linux and 64-bit OS X, but still 32-bit on Windows (for both 32-bit and 64-bit versions!).
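To see what the current compiler actually uses, a small check along these lines can help. This is only a sketch: `print_type_sizes` is a hypothetical helper name, not something from the original program.

```c
#include <limits.h>
#include <stdio.h>

/* Print the width of the basic C types as seen by the current compiler.
   print_type_sizes is a hypothetical helper used for illustration only. */
void print_type_sizes(FILE *out)
{
    fprintf(out, "bits per byte: %d\n", CHAR_BIT);
    fprintf(out, "short:  %zu bytes\n", sizeof(short));
    fprintf(out, "int:    %zu bytes\n", sizeof(int));
    fprintf(out, "long:   %zu bytes\n", sizeof(long));
    fprintf(out, "float:  %zu bytes\n", sizeof(float));
    fprintf(out, "double: %zu bytes\n", sizeof(double));
    fprintf(out, "void *: %zu bytes\n", sizeof(void *));
}
```

Calling `print_type_sizes(stdout)` from `main` and comparing the result with the field widths in the old files should show which types no longer match.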

When reading binary files, you must also deal with endianness: it changed from big-endian on 1990 Mac OS to little-endian on today's OS X, while some other systems are still big-endian.
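One common way to sidestep the host's byte order is to assemble each value from individual bytes. The sketch below assumes the old files store 32-bit values most significant byte first and that floats are IEEE 754 single precision on both machines; both are assumptions you would need to verify against the real data. The function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Build an unsigned 32-bit value from 4 bytes stored most significant first.
   Works the same on big-endian and little-endian hosts. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Same 4 bytes interpreted as a two's complement signed value,
   without relying on implementation-defined conversions. */
int32_t read_be32s(const unsigned char *p)
{
    uint32_t u = read_be32(p);
    if (u & 0x80000000u)
        return (int32_t)(u & 0x7FFFFFFFu) + INT32_MIN;
    return (int32_t)u;
}

/* Assumption: the file and the host both use 4-byte IEEE 754 floats. */
float read_be_float(const unsigned char *p)
{
    uint32_t u = read_be32(p);
    float f;
    memcpy(&f, &u, sizeof f);  /* copy the bit pattern, not the value */
    return f;
}
```

Because the shifts operate on values rather than on memory layout, the same code gives the same result on any host.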

To make matters worse, the C language evolved over this long period and some non trivial semantic changes occurred between pre-ANSI C and Standard C. Some old syntaxes are no longer supported either...

There is no magic flag to address these issues. You will need to dive into the C code, understand what it does, and modernize it so that it is more portable and independent of the target architecture. You can use the fixed-width types from `<stdint.h>` (`int32_t`, ...) to ease this process.
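As an illustration of that approach, here is a minimal sketch of reading one record with fixed-width types. The record layout (two big-endian 32-bit integers) and the names `struct record`, `decode_be32` and `read_record` are hypothetical; the real layout has to come from the original source. The record is read into a byte buffer and decoded field by field rather than read directly into a struct, so padding and byte order on the new machine do not matter.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 8-byte on-disk record: two big-endian 32-bit integers. */
struct record {
    int32_t id;
    int32_t count;
};

/* Decode 4 bytes, most significant first, as a signed 32-bit value. */
static int32_t decode_be32(const unsigned char *p)
{
    uint32_t u = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    if (u & 0x80000000u)
        return (int32_t)(u & 0x7FFFFFFFu) + INT32_MIN;
    return (int32_t)u;
}

/* Read one record; returns 1 on success, 0 on end of file or short read. */
int read_record(FILE *fp, struct record *out)
{
    unsigned char buf[8];

    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return 0;
    out->id = decode_be32(buf);
    out->count = decode_be32(buf + 4);
    return 1;
}
```

A loop such as `while (read_record(fp, &r)) { ... }` on a file opened with `fopen(path, "rb")` would then process the records one at a time.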

People answering C questions on Stack Overflow are usually careful to post portable code that works correctly on all target architectures, even some purposely vicious ones such as the DS9K (a fictitious computer that does everything in correct but unexpected ways).

chqrlie