0

I have an array of 256 unsigned integers called frequencies[256] (one integer for each ASCII value). My goal is to read through an input and, for each character, increment the integer in the array that corresponds to it (for example, the character 'A' will cause frequencies[65] to increase by one). When the input is over, I must output each integer as 4 characters in little-endian form.

So far I have made a loop that goes through the input and increases each corresponding integer in the array, but I am very confused about how to output each integer in little-endian form. I understand that each of the four bytes of each integer should be output as a character (for instance, the unsigned integer 1 in little endian is "00000001 00000000 00000000 00000000", which I would want to output as the 4 ASCII characters that correspond to those bytes).

But how do I get at the binary representation of an unsigned integer in my code, and how would I go about chopping it up and rearranging it?

Thanks for the help.

user0123
  • 259
  • 1
  • 6
  • 17

4 Answers

3

For hardware portability, please use the following solution:

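#include <stdio.h>

unsigned int freqs[256] = {0};  /* unsigned, so >> is well defined for every value */
/* ... fill freqs from the input here ... */
for (int i = 0; i < 256; ++i)
    /* Print the four bytes as hex, least significant byte first. */
    printf("%02x %02x %02x %02x\n", (freqs[i] >> 0 ) & 0xFF
                                  , (freqs[i] >> 8 ) & 0xFF
                                  , (freqs[i] >> 16) & 0xFF
                                  , (freqs[i] >> 24) & 0xFF);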
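Since the question asks for the bytes as raw characters from C++ rather than as hex digits, the same shift-and-mask idea might be sketched as follows. The helper names writeLittleEndian32 and serialiseFrequencies are made up for illustration, and the sketch assumes 8-bit bytes and that std::uint32_t is available for the counters.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: write the four bytes of `value` to `out`,
// least significant byte first, regardless of the host's byte order.
void writeLittleEndian32(std::ostream& out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8) {
        // Mask down to one byte, then emit it as a raw character (not digits).
        out.put(static_cast<char>((value >> shift) & 0xFF));
    }
}

// Hypothetical helper: serialise a whole table of 256 counters.
std::string serialiseFrequencies(const std::uint32_t (&frequencies)[256])
{
    std::ostringstream out;
    for (std::uint32_t count : frequencies)
        writeLittleEndian32(out, count);
    std::string bytes = out.str();
    assert(bytes.size() == 256 * 4);  // four bytes per counter
    return bytes;
}
```

Passing std::cout as the stream writes the bytes straight to standard output. On Windows, standard output is opened in text mode by default, so a byte with value 10 may be expanded to two bytes unless the stream is switched to binary mode.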
tohava
  • 5,344
  • 1
  • 25
  • 47
  • I don't know what the hardware is; the program is meant to operate the same on all computers. Thanks for the response. I was wondering if you could explain your second code block a little more. Also, I am using C++, not C. Is there a way to accomplish the same task with cout? – user0123 Jul 28 '13 at 21:25
  • The second code uses bitwise operations to extract the individual bytes from each value, in little-endian order (least significant byte to most significant byte). Here is the print line in C++ (note that `setw` only applies to the next item, so it is repeated before each byte): `cout << hex << setfill('0') << setw(2) << ((freqs[i]>>0) & 0xFF) << setw(2) << ((freqs[i]>>8) & 0xFF) << setw(2) << ((freqs[i]>>16) & 0xFF) << setw(2) << ((freqs[i]>>24) & 0xFF)` – tohava Jul 28 '13 at 21:28
  • What is the purpose of 'and'ing the byte with 0xFF? Does that ensure that only 8 bits are output, because cout-ing freqs[i]>>n would still put out a full 32-bit int? – user0123 Jul 28 '13 at 21:31
  • You are correct, 0xFF is to ensure we only output 1 byte. `freqs[i] >> 8` for example might print 24 bits (since we only shifted 8 out of the 32 bits of `freqs[i]`) – tohava Jul 28 '13 at 21:32
  • 1
    I implemented your code in a for loop and it seems to be printing out integers (I am assuming it is printing the numeric value of each 8-bit byte). If I wanted it to print the character version instead, would a simple (char) cast in front of each of the ((freqs[i]>>n) & 0xFF) blocks work? – user0123 Jul 28 '13 at 21:44
  • No, you'd need to tell `printf` to print a character and not 2 hex digits. Change `%02x` to `%c`. – Retired Ninja Jul 28 '13 at 21:51
  • I'm using his C++ code, not his C code; look in the comments to his reply. – user0123 Jul 28 '13 at 22:03
  • @user0123 This answer is not portable, not platform independent. Detecting if the system is little or big endian should be automatic. – Antonio Jul 28 '13 at 22:11
  • If this answer depends on the endianness of the system, how do I make sure that it works the same on both systems? – user0123 Jul 28 '13 at 22:19
  • @Antonio - why is this answer not portable? It always prints the bytes in little-endian order, even if the underlying hardware is not little endian. The only assumption it makes is that int is 4 bytes. Is it possible that you only read the first half (unportable) instead of the second half? – tohava Jul 28 '13 at 22:21
  • @tohava You are totally right, I should have read this http://stackoverflow.com/questions/1041554/bitwise-operators-and-endianness Can you do a small edit to your answer, I definitely want to revert my vote – Antonio Jul 28 '13 at 22:25
  • 2
    @Antonio The answer isn't perfectly portable: it assumes 8 bit `char`, and `sizeof(int) == 4`. But it's certainly better than most proposed solutions. It will even work on systems with strange byte orders, which are neither big endian nor little endian. – James Kanze Jul 28 '13 at 22:32
0

You can use memcpy which copies a block of memory.

unsigned char tab[sizeof(unsigned int)];
memcpy(tab, &frequencies[i], sizeof tab);

Now tab[0], tab[1], etc. hold the bytes of frequencies[i], in the machine's native byte order.
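Because memcpy copies the bytes exactly as they sit in memory, the result is little endian only on a little-endian host. A minimal sketch of one way to compensate, assuming the host is either plain little endian or plain big endian (the helper names hostIsLittleEndian and toLittleEndianBytes are invented for this example):

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

// Hypothetical helper: true when the host stores the least significant byte first.
bool hostIsLittleEndian()
{
    unsigned int probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);  // look at the byte at the lowest address
    return first == 1;
}

// Hypothetical helper: fill `out` with the bytes of `value`, low-order byte first.
void toLittleEndianBytes(unsigned int value,
                         unsigned char (&out)[sizeof(unsigned int)])
{
    std::memcpy(out, &value, sizeof value);     // native byte order
    if (!hostIsLittleEndian())
        std::reverse(out, out + sizeof value);  // assumes a plain big-endian host
    assert(out[0] == (value & 0xFFu));          // low-order byte must come first
}
```

The same buffer can be reused for every counter, since memcpy simply overwrites it. The destination must be at least as large as the number of bytes copied; copying more than it can hold is undefined behaviour.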

tomi.lee.jones
  • 1,563
  • 1
  • 15
  • 22
  • Hmm, that's a very interesting idea! Thanks for the input! Can I use one declaration of tab for all unsigned ints in the frequency array, or will I need to declare a new tab each time? (Basically, will memcpy overwrite tab correctly?) Also, how does memcpy handle copying something bigger than its target? – user0123 Jul 28 '13 at 21:27
  • @user0123 - This solution won't work if the hardware is big endian. – tohava Jul 28 '13 at 21:30
  • Ahh, that's no good. My program must work the same on both systems. – user0123 Jul 28 '13 at 21:41
  • There was nothing about big-endian in the question :) It's true, it's a little-endian-only solution. Glad I could help in any form though. – tomi.lee.jones Jul 28 '13 at 22:19
0

A program to swap from big to little endian: Little Endian - Big Endian Problem.

To understand if your system is little or big endian: https://stackoverflow.com/a/1024954/2436175.

Transform your chars/integers in a set of printable bits: https://stackoverflow.com/a/7349767/2436175

Community
  • 1
  • 1
Antonio
  • 19,451
  • 13
  • 99
  • 197
0

It's not really clear what you mean by "little endian" here. Integers don't have endianness per se; endianness only comes into play when you cut them up into smaller pieces. So which smaller pieces do you mean: bytes or characters? If characters, just convert in the normal way, and reverse the generated string. If bytes (or any other smaller piece), each individual byte can be represented as a function of the int: i & 0xFF calculates the low-order byte, (i >> 8) & 0xFF the next lowest, and so forth. (If the bytes aren't 8 bits, then change the shift value and the mask correspondingly.)

And with regards to your second paragraph: a single byte of an int doesn't necessarily correspond to a character, regardless of the encoding. For the four bytes you show, for example, none of them corresponds to a printable character in any of the usual encodings.

With regards to the last paragraph: to get the binary representation of an unsigned integer, use the same algorithm that you would use for any representation:

#include <algorithm>
#include <cassert>
#include <string>

std::string
asText( unsigned int value, int base, int minDigits = 1 )
{
    static std::string digits( "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" );
    assert( base >= 2 && static_cast<std::size_t>( base ) <= digits.size() );
    std::string results;
    //  Generate digits from least significant to most significant.
    while ( value != 0 || minDigits > 0 ) {
        results += digits[ value % base ];
        value /= base;
        -- minDigits;
    }
    //  results is now little endian.  For the normal big-endian
    //  representation, reverse it.
    std::reverse( results.begin(), results.end() );
    return results;
}

Called with base equal to 2, this will give you your binary representation.
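To tie this back to the string in the question, here is a small sketch that renders each byte separately, low-order byte first. It swaps in std::bitset for the digit loop above, the name littleEndianBits is made up for the example, and it assumes a 32-bit unsigned int with 8-bit bytes.

```cpp
#include <bitset>
#include <cassert>
#include <string>

// Hypothetical helper: show the four bytes of `value` as groups of eight
// binary digits, least significant byte first, separated by spaces.
std::string littleEndianBits(unsigned int value)
{
    std::string text;
    for (int shift = 0; shift < 32; shift += 8) {
        if (!text.empty())
            text += ' ';
        // Each byte is printed most significant bit first, as usual.
        text += std::bitset<8>((value >> shift) & 0xFFu).to_string();
    }
    assert(text.size() == 4 * 8 + 3);  // four groups of 8 digits plus 3 spaces
    return text;
}
```

With these assumptions, littleEndianBits(1) yields "00000001 00000000 00000000 00000000", matching the example in the question.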

James Kanze
  • 150,581
  • 18
  • 184
  • 329