In class I've been tasked with writing a C program that decompresses a text file and prints out the characters it contains. Each character in the file is represented by 2 bits (4 possible characters).

I've recently been informed that a byte is not necessarily 8 bits on all systems, and a char is not necessarily 1 byte. This then makes me wonder how on earth I'm supposed to know how many bits got loaded from a file when I loaded 1 byte. Also how am I supposed to keep the loaded data in memory when there are no data types that can guarantee a set amount of bits.

How do I work with bit data in C?

Hubro

3 Answers

A byte is not necessarily 8 bits. That much is certainly true. A char, on the other hand, is defined to be a byte - C does not differentiate between the two things.

However, the systems you will write for will almost certainly have 8-bit bytes. Bytes of other sizes are essentially non-existent outside of really, really old systems and certain embedded systems.

If you have to write code that works on multiple platforms, and one or more of them has a differently sized char, then you write code specifically to handle that platform, e.g. using `CHAR_BIT` (from `<limits.h>`) to determine how many bits each byte contains.
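As a minimal sketch of that idea, the snippet below checks `CHAR_BIT` at compile time and derives sizes from it. The names `SYMBOLS_PER_BYTE` and `bits_in` are made up for illustration and are not part of any library.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */

/* Stop the build on platforms where the 8-bit assumption does not hold. */
#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif

/* Hypothetical constant: how many 2-bit symbols fit in one byte. */
enum { SYMBOLS_PER_BYTE = CHAR_BIT / 2 };

/* Hypothetical helper: number of bits in an object of n_bytes bytes. */
size_t bits_in(size_t n_bytes)
{
    return n_bytes * CHAR_BIT;
}
```

With the check in place, the rest of the program can rely on one byte read from the file holding exactly four 2-bit symbols.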

Given that this is for a class, assume 8-bit bytes, unless told otherwise. The point is not going to be extreme platform independence, the point is to teach you something about bit fiddling (or possibly bit fields, but that depends on what you've covered in class).

Michael Madsen
  • A `char` is 1 byte, always. (Your other note, "a byte is not necessarily 8 bits", is true.) The standard notes that `unsigned char` "shall be represented using a pure binary notation" and in a note describes that, including the statement "A byte contains `CHAR_BIT` bits, and the values of type `unsigned char` range from 0 to 2^`CHAR_BIT` − 1." (Earlier, it requires `char`, `signed char`, and `unsigned char` to occupy the same amount of storage.) See Section 6.2.5 "Types". (Note: I'm using the draft here: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf ) – Thanatos Sep 18 '11 at 02:07
  • Also, from Guy Sirton's answer, see http://stackoverflow.com/questions/437470/type-to-use-to-represent-a-byte-in-ansi-c89-90-c, which gives another (perhaps better) argument as to why a `char` is a byte. – Thanatos Sep 18 '11 at 02:12
  • @Thanatos: Err, yeah, of course. I really need to stop writing answers at 3AM. Editing... – Michael Madsen Sep 18 '11 at 10:35
  • Lol, something we've all probably been guilty of at one point or another. :-) +1. – Thanatos Sep 18 '11 at 18:27

You can use bit fields in C. A bit field lets you explicitly specify the number of bits in each member of a struct, if you are truly concerned about width. This page gives a discussion: http://msdn.microsoft.com/en-us/library/yszfawxh(v=vs.80).aspx

As an example, see ieee754.h, which uses bit fields to describe the layout of IEEE 754 floats.
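For illustration only, here is a sketch of a struct with four 2-bit members. The names `packed4` and `make_packed` are hypothetical. Note that the order in which bit-field members are laid out within their storage unit is implementation-defined, so a struct like this is not a portable way to interpret bytes read from a file.

```c
/* Four 2-bit members; each can hold a value from 0 to 3.
 * The position of each member inside the storage unit is
 * implementation-defined. */
struct packed4 {
    unsigned int s0 : 2;
    unsigned int s1 : 2;
    unsigned int s2 : 2;
    unsigned int s3 : 2;
};

/* Hypothetical helper: build a packed4, masking each input to 2 bits. */
struct packed4 make_packed(unsigned a, unsigned b, unsigned c, unsigned d)
{
    struct packed4 p;
    p.s0 = a & 3u;
    p.s1 = b & 3u;
    p.s2 = c & 3u;
    p.s3 = d & 3u;
    return p;
}
```

The compiler handles the masking and shifting when a member is read or written, which is convenient for in-memory data but gives no control over the on-disk bit order.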

Foo Bah

This then makes me wonder how on earth I'm supposed to know how many bits got loaded from a file when I loaded 1 byte.

You'll be hard pressed to find a platform where a byte is not 8 bits (though, as noted above, `CHAR_BIT` can be used to verify that). Also, clarify the portability requirements with your instructor, or state your assumptions.

Usually bits are extracted using shifts and bitwise operations, e.g. `(x & 3)` is the rightmost 2 bits of x, and `((x >> 2) & 3)` is the next two bits. Pick the right data type for the platforms you are targeting or, as others say, use something like `uint8_t` (from `<stdint.h>`) if it is available for your compiler.
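A minimal sketch of the shift-and-mask approach applied to the 2-bits-per-character format is shown below. It rests on two assumptions that the assignment would have to confirm: the alphabet in `ALPHABET` is made up, and the first character is taken to be stored in the most significant pair of bits. The name `decode_byte` is also hypothetical.

```c
#include <stdint.h>  /* uint8_t */

/* Assumed alphabet: the real mapping from 2-bit codes to characters
 * comes from the assignment, not from this sketch. */
static const char ALPHABET[4] = { 'A', 'C', 'G', 'T' };

/* Decode one byte into four characters.
 * Assumes the first character sits in the two most significant bits. */
void decode_byte(uint8_t byte, char out[4])
{
    for (int i = 0; i < 4; i++) {
        int shift = 6 - 2 * i;                 /* 6, 4, 2, 0 */
        unsigned code = (byte >> shift) & 3u;  /* isolate one 2-bit code */
        out[i] = ALPHABET[code];
    }
}
```

A caller would typically open the file in binary mode, read it one byte at a time (for example with `fgetc`), and pass each byte to a function like this. If the format stores the first character in the least significant bits instead, the shift sequence becomes 0, 2, 4, 6.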

Also see: Type to use to represent a byte in ANSI (C89/90) C?

I would recommend not using bit fields. Also see here:

When is it worthwhile to use bit fields?

Guy Sirton
  • To the down-voter: why the down vote? This is one of the better answers on this question! (And certainly the most practical so far.) – Thanatos Sep 18 '11 at 02:10