#include <stdio.h>
#include <stdlib.h>


int main(void)
{
    int *int_pointer = (int *) malloc(sizeof(int));

    // open output file
    FILE *outptr = fopen("test_output", "w");
    if (outptr == NULL)
    {
        fprintf(stderr, "Could not create %s.\n", "test_output");
        return 1;
    }

    *int_pointer = 0xabcdef;

    fwrite(int_pointer, sizeof(int), 1, outptr);

    //clean up
    fclose(outptr);
    free(int_pointer);

    return 0;
}

This is my code. When I inspect the test_output file with xxd, it gives the following output:

$ xxd -c 12 -g 3 test_output 
0000000: efcdab 00                    ....

I'm expecting it to print abcdef instead of efcdab.

DYZ
Maulik
  • The problem is caused by byte order. You may want to read more about it here: https://www.gnu.org/software/libc/manual/html_node/Byte-Order.html In brief: the bytes of an integer are not necessarily stored in memory in the same order in which they are written in the program. – DYZ Jan 30 '17 at 05:28
  • Read about endianness, e.g. https://en.wikipedia.org/wiki/Endianness – Support Ukraine Jan 30 '17 at 05:30

2 Answers


Which book are you reading? There are a number of issues in this code, casting the return value of malloc for example... Most importantly, consider the cons of using an integer type which might vary in size and representation from system to system.

  • An int is only guaranteed to store values in the range -32767 to 32767. Your implementation might allow more values, but to be portable and friendly with people using ancient compilers such as Turbo C (there are a lot of them), you shouldn't use int to store values larger than 32767 (0x7fff) such as 0xabcdef. When such out-of-range conversions are performed, the result is implementation-defined; it could involve saturation, wrapping, trap representations or raising a signal corresponding to computational error, for example, the latter two of which could cause undefined behaviour later on.
  • You need to translate to an agreed-upon field format. When sending data over the wire, or writing data to a file to be transferred to other systems, it's important that the protocol for communication be agreed upon. This includes using the same size and representation for integer fields. Output should be preceded by a translation function (serialisation) and input should be followed by one (deserialisation).
  • Your fields are binary, and so your file should be opened in binary mode. For example, use fopen(..., "wb") rather than "w". Otherwise, in some situations, '\n' characters might be translated to \r\n pairs; Windows systems are notorious for this. Can you imagine what kind of havoc and confusion this could wreak? I can, because I've answered a question about this problem.

Perhaps uint32_t might be a better choice, but I'd choose unsigned long as uint32_t isn't guaranteed to exist. On that note, for little-endian systems which don't have htonl (which returns uint32_t according to POSIX), that function could be implemented like so (on a big-endian system it would simply return its argument):

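#include <stdint.h>

/* Unconditional byte swap: equivalent to htonl only on a little-endian host. */
uint32_t htonl(uint32_t x) {
    return (x & 0x000000ff) << 24
         | (x & 0x0000ff00) << 8
         | (x & 0x00ff0000) >> 8
         | (x & 0xff000000) >> 24;
}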

As an example inspired by the above htonl function, consider these macros:

typedef unsigned long ulong;
/* Each macro expands to four comma-separated bytes, most significant first. */
#define serialised_long(x)   serialised_ulong((ulong) (x))
#define serialised_ulong(x)    ((x) & 0xFF000000UL) / 0x1000000UL \
                             , ((x) & 0xFF0000UL)   / 0x10000UL   \
                             , ((x) & 0xFF00UL)     / 0x100UL     \
                             , ((x) & 0xFFUL)

typedef unsigned char uchar;
/* The sign bit lives in x[0], the most significant byte. Negative values are
   rebuilt arithmetically so the result doesn't depend on how the host
   represents long. */
#define deserialised_long(x) ((x)[0] <= 0x7f \
                                    ? (long) deserialised_ulong(x) \
                                    : -(long) (0xFFFFFFFFUL - deserialised_ulong(x)) - 1)
#define deserialised_ulong(x) ( (x)[0] * 0x1000000UL \
                              + (x)[1] * 0x10000UL   \
                              + (x)[2] * 0x100UL     \
                              + (x)[3]               )

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *f = fopen("test_output", "wb+");
    if (f == NULL)
    {
        fprintf(stderr, "Could not create %s.\n", "test_output");
        return 1;
    }
    
    ulong value = 0xABCDEF;
    unsigned char datagram[] = { serialised_ulong(value) };
    fwrite(datagram, sizeof datagram, 1, f);
    printf("%08lX serialised to %02X%02X%02X%02X\n", value, datagram[0], datagram[1], datagram[2], datagram[3]);
    
    rewind(f);
    
    fread(datagram, sizeof datagram, 1, f);
    value = deserialised_ulong(datagram);
    printf("%02X%02X%02X%02X deserialised to %08lX\n", datagram[0], datagram[1], datagram[2], datagram[3], value);
    
    fclose(f);
    return 0;
}
Braden Best
autistic

Use htonl()

It converts from whatever the host byte order is (the endianness of your machine) to network byte order, which is big-endian. So whatever machine you're running on, you will get the same byte order. These calls are used so that, regardless of the host you're running on, the bytes are sent over the network in the right order, but it works for you too.

See the man pages of htonl and byteorder. There are various conversion functions available, also for different integer sizes, 16-bit, 32-bit, 64-bit ...

#include <stdio.h>
#include <stdlib.h>
#include <arpa/inet.h>

int main(void) {
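    // uint32_t matches htonl()'s return type and has a fixed width
    uint32_t *int_pointer = malloc(sizeof *int_pointer);
    if (int_pointer == NULL) {
        fprintf(stderr, "Out of memory.\n");
        return 1;
    }

    // open output file in binary mode
    FILE *outptr = fopen("test_output", "wb");
    if (outptr == NULL) {
        fprintf(stderr, "Could not create %s.\n", "test_output");
        free(int_pointer);
        return 1;
    }

    *int_pointer = htonl(0xabcdef);  // <====== This ensures correct byte order

    fwrite(int_pointer, sizeof *int_pointer, 1, outptr);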

    //clean up
    fclose(outptr);
    free(int_pointer);

    return 0;
}
clearlight
  • *The htonl() and htons() functions shall return the argument value converted from host to network byte order.* Do you understand what this means for systems that use *network byte order* internally? Also, I strongly recommend [POSIX manpages](http://pubs.opengroup.org/onlinepubs/9699919799/functions/htonl.html) rather than Linux manpages; Linux has no say in what C is or isn't. – autistic Jan 30 '17 at 06:33
  • @Seb - What am I missing? http://stackoverflow.com/questions/32205546/confused-with-network-byte-order-and-host-byte-order – clearlight Jan 30 '17 at 06:40
  • The `htonl` contract is that the argument is in *host byte order*, and that the return value is in *network byte order*. This is regardless of internal representation. Consider the functions `strto*` for example, which are contracted such that the argument *is a string*, and the return value is an integer represented by that string (or `0`). Once again, this is regardless of internal representation. There are no contractual obligations within the POSIX manpages for `hton*` or `ntoh*` that state translation from *internal representation* to host or network byte order... – autistic Jan 30 '17 at 07:07
  • Your mistake is confusing *host byte order* (which is absolute) with *internal representation* (which is relative, varies from system to system)... The manpages don't mention internal representation at all! – autistic Jan 30 '17 at 07:08
  • Hi @Seb, I don't understand the implications of all the contracts, or whether there are cases where it doesn't work, but it has been my understanding that htonl() does the right thing. https://www.tutorialspoint.com/unix_sockets/network_byte_orders.htm – clearlight Jan 31 '17 at 07:03
  • and according to [the C standard](http://port70.net/~nsz/c/c11/n1570.html), what is "the right thing"? – autistic Feb 01 '17 at 06:17
  • @Seb - Everything I've read says that whatever the host byte order is, htonl() produces the right network byte order, either by swapping or by being a NOP. I don't know in which cases C would not conform to the system architecture. It might take me a while to read every spec and make this my thesis, so if you have a specific reference or solution I'd love to hear it, because AFAICT htonl() is a pretty common way to resolve the endianness problem. If you're suggesting the problem is unsolvable I'm all ears. – clearlight Feb 01 '17 at 06:22
  • Why not take the opportunity to teach about *all* host-agnostic operations, not just serialisation of 16- and 32-bit values? For example, you seem to be describing `htonl` as `uint32_t htonl(uint32_t x) { return (x & 0xFF) * 0x1000000 + (x & 0xFF00) * 0x100 + (x & 0xFF0000) / 0x100 + (x & 0xFF000000) / 0x1000000; }`. All of these operations, which together make up the serialisation, are well defined by the C standard, and in such a way that this function produces the same ("*network byte order*", A.K.A. little endian) result. The function isn't well defined without this. – autistic Feb 01 '17 at 07:24
  • @Seb - I think htonl() will do what the question-asker wants in the vast majority of cases, and the C byte/word/long swapping exercise I'm also familiar with. I suppose a test could be written where after a test value is assigned one could determine the endianness, but your comments show the risks of the current solution and things to be aware of so I'm satisfied that the answer now covers the important elements and it can be a project for the questioner, perusers to investigate if they want the ultimate perfect solution. – clearlight Feb 01 '17 at 07:31
  • @autistic "Linux has no say in what C is or isn't." Technically, neither does POSIX. If you compile with `-pedantic`, the compiler *should* complain about POSIX functions. And I'm not sure if that was a typo, but network byte order is BE, not LE. – Braden Best Jan 31 '22 at 06:25
  • @Braden Best that logic only works up until you realise `htonl` is defined by POSIX, and then your Linux implementation intends to implement that. Otherwise, we're going to have to have a philosophical discussion about portability. I suppose your compiler is intended to be at least compliant enough to work on your system, your system is going to be POSIX-compliant to a degree, and these macros/functions have been in POSIX C for ages. OP could easily be running FreeBSD or MacOSX, where the Linux manpages are irrelevant, so it's best to link to the POSIX manpages. – autistic Jan 31 '22 at 11:43