Converting little endian to big endian using Bitshift Operators

Question

I am working on endianness. My little-endian program works and gives the correct output, but I cannot get big endian to work. Below is what I have so far. I know I have to use bit shifts, and I don't think I am doing a good job of it. I asked my TAs and professor, but they were not much help. I have been following this link (convert big endian to little endian in C [without using provided func]) to understand more, but I still cannot make it work. Thank you for the help.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    FILE* input;
    FILE* output;

    input = fopen(argv[1],"r");
    output = fopen(argv[2],"w");
    int value,value2;
    int i;
    int zipcode, population;
    while(fscanf(input,"%d %d\n",&zipcode, &population)!= EOF)
    {
        for(i = 0; i<4; i++)
        {
        population = ((population >> 4)|(population << 4));
        }
        fwrite(&population, sizeof(int), 1, output);
    }

    fclose(input);      
    fclose(output);

    return 0;
}

Do you know what your word size ("bytes per integer") is? PS: "asking classmates is not allowed", but asking the Internet is? :) — Kijewski, Mar 27 '13 at 01:02
@Kay the TA said to make an account on 'stackoverflow' so i guess yea ? — Mani, Mar 27 '13 at 01:04
Hm, doing bitshifts like those on signed integers is a bad idea... — Matteo Italia, Mar 27 '13 at 01:05
@MatteoItalia , i tried using unsigned char/int but still could not get the correct output. — Mani, Mar 27 '13 at 01:07
Your program is nonsensical, as it reads two numbers in ASCII, then reverses the nibbles (not bytes, as would be the case for changing endianness) of one of them and writes it out in binary. You should specify exactly what problem you are trying to solve, what your input looks like, and what output is desired. — Jim Balter, Mar 27 '13 at 01:08
"doing bitshifts like those on signed integers is a bad idea..." -- it's only a problem for negative values, which population presumably isn't. But that's the least of the OP's problems. — Jim Balter, Mar 27 '13 at 01:10
"could not get the correct output" -- That's because you're just stabbing blindly at it, rather than specifying and thinking through what transformation is needed. — Jim Balter, Mar 27 '13 at 01:12
@JimBalter it'd likely be a problem if any of the bits had the topmost bit set. — Kijewski, Mar 27 '13 at 01:12
@JimBalter I will try to be more clear. We are given an input file (46804 3450103 37215 1337 47906 46849). We have to convert (3450103, 1337, 46849) to a binary dump and write it to an output file. When converting to little endian. — Mani, Mar 27 '13 at 01:12
Ok, given that your program isn't quite so nonsensical. So you need to convert from ASCII numbers to little-endian binary. The first thing for you to do is understand what little-endian binary *is* ... there's no hint of anything little-endian in your code. — Jim Balter, Mar 27 '13 at 01:15
@JimBalter but i do get the rite output using the fscanf and fwrite. — Mani, Mar 27 '13 at 01:17
quick question, since this will help me solve all my problems, how can i see the binary values??? how can i print out the binary value using the above code? — Mani, Mar 27 '13 at 01:19
"but i do get the rite output using the fscanf and fwrite" -- Only when running on a little-endian machine. Read what I wrote again: "there's no hint of anything little-endian in your code". Don't you think that might be a problem? Your grader will. — Jim Balter, Mar 27 '13 at 01:20
@JimBalter , so what would you suggest me doing ?? How can i make sure it runs on both big and little endian machines? — Mani, Mar 27 '13 at 01:22
"this will help me solve all my problems" -- No, it won't. Your problem is that you don't understand what "little-endian" means and what implications it has for what you need to do. — Jim Balter, Mar 27 '13 at 01:22
@JimBalter "assumed that I'm stupid or inept", I'm sorry if you understood my reply like that. I did not think either … — Kijewski, Mar 27 '13 at 01:22
" How can i make sure it runs on both big and little endian machines" -- by outputing the lowest order byte first. `fwrite` is the wrong tool for this. — Jim Balter, Mar 27 '13 at 01:23
@JimBalter, our teacher asked us to use fwrite. So i ddn know if i could use anything better. — Mani, Mar 27 '13 at 01:25
@JimBalter dude, you might want to calm down a bit … I'm out. — Kijewski, Mar 27 '13 at 01:33
"our teacher asked us to use fwrite" --You should include all your requirements in your problem statement. — Jim Balter, Mar 27 '13 at 01:41
@JimBalter: Endianness is order of digits. In your mind, the digits are bytes that can have values between 0 to 255. However, this isn't always the case. Some machines have 16-bit bytes, and the endianness will revolve around those bytes, instead. If the student is working under the guise that the digits aren't 8-bit bytes, so what? Just asking... — autistic, Mar 27 '13 at 02:04
@modifiablelvalue "In your mind, the digits are bytes that can have values between 0 to 255" -- completely wrong. First, there is no correspondence between digits and bytes. Second the only place I mentioned the size of a byte is in the first sentence of my answer, where I said we can *assume* it here (as others have done). The reason we can is because when talking about little-endian *file formats* one invariably uses 8-bit bytes, *regardless of the machine representation*. — Jim Balter, Mar 27 '13 at 02:17
@JimBalter Did endianness exist when 8-bit bytes didn't exist? — autistic, Mar 27 '13 at 02:24
@modifiablelvalue Actually, no. Back then, memory was organized in words, not bytes. It wasn't until the advent of the PDP-11, with its little-endian order as opposed to the IBM 360's big-endian order that the term came into use. In any case, your question is a non sequitur that has nothing to do with what I wrote. BTW, are you the one who downvoted my answer? — Jim Balter, Mar 27 '13 at 02:32
"Endianness is order of digits. " -- I missed this. It is completely and utterly wrong. Endianness has nothing to do with digits unless you're dealing with binary coded decimal, such as the IBM 1620 or 1401. — Jim Balter, Mar 27 '13 at 02:36
@JimBalter Very well. Did the order of transmission of bytes exist when bytes weren't 8 bits? — autistic, Mar 27 '13 at 02:41
"working under the guise" -- "I do not think that word means what you think it means". — Jim Balter, Mar 27 '13 at 02:46
@modifiablelvalue I worked on the ARPANET back in 1969, when the first computer-to-computer transmission occurred. We had 8-bit bytes. — Jim Balter, Mar 27 '13 at 02:53
@Kay This old man, he played one. He played knick-knack on my thumb; With a knick-knack paddywhack, give the dog a bone, this old man came rolling home. — autistic, Mar 27 '13 at 03:09

score 7 · Accepted Answer · answered Mar 27 '13 at 01:07

I'm answering not to give you the answer but to help you solve it yourself.

First ask yourself this: how many bits are in a byte? (hint: 8) Next, how many bytes are in an int? (hint: probably 4) Picture this 32-bit integer in memory:

  +--------+
0x|12345678|
  +--------+

Now picture it on a little-endian machine, byte-wise. It would look like this:

  +--+--+--+--+
0x|78|56|34|12|
  +--+--+--+--+

What shift operations are required to get the bytes into the correct spot?

Remember, when you use a bitwise operator like >>, you are operating on bits. So 1 << 24 would be the integer value 1 converted into the processor's opposite endianness.
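
As a minimal sketch of where those shifts end up, assuming 8-bit bytes and a 32-bit value: the helper below (swap32 is a hypothetical name, not something from the question) moves each byte to the mirrored position with a shift and keeps only that byte with a mask. It uses an unsigned type so that right shifts never drag a sign bit along.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value using only shifts and masks.
   swap32 is a hypothetical helper name used for illustration. */
uint32_t swap32(uint32_t x)
{
    return ((x >> 24) & 0x000000FFu)   /* highest byte moves to the lowest position */
         | ((x >>  8) & 0x0000FF00u)   /* second-highest byte moves down one slot   */
         | ((x <<  8) & 0x00FF0000u)   /* second-lowest byte moves up one slot      */
         | ((x << 24) & 0xFF000000u);  /* lowest byte moves to the highest position */
}
```

With this sketch, swap32(0x12345678) gives 0x78563412, and swap32(1) gives the same value as 1 << 24.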

score 2 · Answer 2 · Jim Balter · edited Mar 27 '13 at 01:51

"little-endian" and "big-endian" refer to the order of bytes (we can assume 8 bits here) in a binary representation. When referring to machines, it's about the order of the bytes in memory: on big-endian machines, the address of an int will point to its highest-order byte, while on a little-endian machine the address of an int will refer to its lowest-order byte.

When referring to binary files (or pipes or transmission protocols etc.), however, it refers to the order of the bytes in the file: a "little-endian representation" will have the lowest-order byte first and the highest-order byte last.

How does one obtain the lowest-order byte of an int? That's the low 8 bits, so it's (n & 0xFF) (or ((n >> 0) & 0xFF), the usefulness of which you will see below).

The next lowest-order byte is ((n >> 8) & 0xFF). The next lowest-order byte is ((n >> 16) & 0xFF) ... or (((n >> 8) >> 8) & 0xFF). And so on.

So you can peel off bytes from n in a loop and output them one byte at a time ... you can use fwrite for that but it's simpler just to use putchar or putc.
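
The loop described above might look like the following sketch, assuming a 32-bit value and 8-bit bytes in the file. put_le32 is a hypothetical name; the value is taken as unsigned so the shifts are well defined.

```c
#include <stdio.h>
#include <stdint.h>

/* Write the 32-bit value n to fp one byte at a time, lowest-order byte first
   (little-endian file order). Returns 0 on success, -1 on a write error.
   put_le32 is a hypothetical helper name used for illustration. */
int put_le32(uint32_t n, FILE *fp)
{
    for (int i = 0; i < 4; i++) {
        /* (n >> (8 * i)) & 0xFF selects byte i, counting from the low end */
        if (putc((int)((n >> (8 * i)) & 0xFFu), fp) == EOF)
            return -1;
    }
    return 0;
}
```

Because the bytes are selected arithmetically rather than read out of memory, the file contents are the same whether the program runs on a little-endian or a big-endian host.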

You say that your teacher requires you to use fwrite. There are two ways to do that: 1) copy each byte into an unsigned char b and call fwrite(&b, 1, 1, filePtr) in a loop as described above (passing &n itself would write whichever byte of the int comes first in memory, which depends on the machine). 2) Use the loop to reorder your int value by storing the bytes in the desired order in a char array rather than outputting them, then use fwrite to write it out. The latter is probably what your teacher has in mind.
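
Option 2 could be sketched as follows for big-endian output, assuming a 32-bit value and 8-bit bytes. pack_be32 and fwrite_be32 are hypothetical names, and the exact file format the teacher wants is an assumption here.

```c
#include <stdio.h>
#include <stdint.h>

/* Store the bytes of n in buf, highest-order byte first (big-endian order).
   pack_be32 is a hypothetical helper name used for illustration. */
void pack_be32(uint32_t n, unsigned char buf[4])
{
    buf[0] = (unsigned char)((n >> 24) & 0xFFu);
    buf[1] = (unsigned char)((n >> 16) & 0xFFu);
    buf[2] = (unsigned char)((n >>  8) & 0xFFu);
    buf[3] = (unsigned char)( n        & 0xFFu);
}

/* Write n to fp as four big-endian bytes with a single fwrite call.
   Returns 1 on success, 0 on a short write. */
int fwrite_be32(uint32_t n, FILE *fp)
{
    unsigned char buf[4];
    pack_be32(n, buf);
    return fwrite(buf, 1, sizeof buf, fp) == sizeof buf;
}
```

For the sample population 3450103 (0x0034A4F7), buf holds 00 34 A4 F7, and that is the order in which fwrite sends the bytes to the file. Opening the output file in "wb" mode rather than "w" avoids newline translation on platforms that distinguish text and binary streams.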

Note that, if you just use fwrite to output your int it will work ... if you're running on a little-endian machine, where the bytes of the int are already stored in the right order. But the bytes will be backwards if running on a big-endian machine.

edited Mar 27 '13 at 01:51

answered Mar 27 '13 at 01:35

Jim Balter


@Mani Note that I made an edit with an important correction. If you reorder the bytes in memory and then use `fwrite` to output them all at once, you must store them in an array of `char`s ... the first byte of which will go out first. – Jim Balter Mar 27 '13 at 01:53
How do you detect mixed-endian systems? – autistic Mar 27 '13 at 02:05
@modifiablelvalue The endianness of the *system* isn't relevant here ... right-shifting bytes of an `int` gets the right bytes in the right order regardless of how the `int` is stored in memory. – Jim Balter Mar 27 '13 at 02:12
@JimBalter Actually, this part of your answer makes "the endianness of the system" relevant here: "... When referring to machines, it's about the order of the bytes in memory: on big-endian machines, the address of an int will point to its highest-order byte, while on a little-endian machine the address of an int will refer to its lowest-order byte. ..." Now, back to my question... – autistic Mar 27 '13 at 02:44
@modifiablelvalue "I'll down-vote now" -- You've given no reason for it. You asked a question about mixed-endian systems that I responded to -- a question that isn't relevant because the task is not to detect the endianness of a system, just to produce the correct output regardless of the system. – Jim Balter Mar 27 '13 at 03:01
@modifiablelvalue Yes, the OP's code reversed the nibbles (4 bit segments) of bytes -- because the OP *didn't know what s/he was doing*. This is not relevant to anything, and is not a reason to downvote my answer. – Jim Balter Mar 27 '13 at 03:03
@JimBalter ... and you assumed that a `char` (a byte) is exactly 8 bits in length. How misleading, hmmm! – autistic Mar 27 '13 at 03:23
I did not poo-poo the question, I answered it, for which I was thanked and the OP upvoted and accepted my answer. My comment about the OP's code (not question) being nonsensical is correct because it did not relate to the question asked. When the OP clarified the requirements I commented that made it less nonsensical ... but swapping nibbles of bytes is still not converting to little-endianness. In any case, even I did "poo-poo" something (which I did not), that's not a valid reason for a downvote. – Jim Balter Mar 27 '13 at 03:24
I did not assume that a char is exactly 8 bits in length. I said that *we can assume* 8 bit bytes ... for the reason I have repeatedly given: that little-endian *file formats* invariably are in terms of 8 bit bytes. This has nothing to do with the architecture of the host machine, as I have noted repeatedly. – Jim Balter Mar 27 '13 at 03:26

score 1 · Answer 3 · edited May 23 '17 at 12:11

The problem with most answers to this question is portability. I've provided a portable answer here, but this received relatively little positive feedback. Note that C defines undefined behavior as: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.

The answer I'll give here won't assume that int is 16 bits in width; it'll give you an idea of how to represent "larger int" values. It's the same concept, but uses a dynamic loop rather than two fputcs.

Declare an array of sizeof (int) unsigned chars: unsigned char big_endian[sizeof (int)];

Separate the sign and the absolute value.

int sign = value < 0;
value = sign ? -value : value;

Loop from sizeof (int) down to 0, writing the least significant byte into the last free slot each time:

size_t foo = sizeof (int);
do {
    /* UCHAR_MAX and CHAR_BIT come from <limits.h> */
    big_endian[--foo] = value % (UCHAR_MAX + 1);
    value /= (UCHAR_MAX + 1);
} while (foo > 0);

Now insert the sign: big_endian[0] |= sign << (CHAR_BIT - 1);

Simple, yeh? Little endian is equally simple. Just reverse the order of the loop to go from 0 to sizeof (int), instead of from sizeof (int) to 0:

size_t foo = 0;
do {
    big_endian[foo++] = value % (UCHAR_MAX + 1);
    value /= (UCHAR_MAX + 1);
} while (foo < sizeof (int));

The portable methods make more sense, because they're well defined.

"The problem with most answers to this question is portability." -- You actually have it backwards. The results of your code depends on the value of UCHAR_MAX and CHAR_BIT, but *file formats* are defined independently of the host machine, and almost invariably in terms of 8-bit bytes. The OP's teacher did not define exactly what file format is wanted, but it almost certainly isn't the big-endian host-specific sign-and-magnitude format your code produces. — Jim Balter, Mar 27 '13 at 02:44
Justify your comment: "Your program is nonsensical, as it reads two numbers in ASCII, then reverses the nibbles (not bytes, as would be the case for changing endianness) of one of them and writes it out in binary." — autistic, Mar 27 '13 at 02:59
