Can I be sure that 32 bytes of binary data read from a file is equal to 256 bits?

Question

I want to read a file 32 bytes at a time using a C/C++ program, but I want to be sure that the data will be 256 bits. In essence, I am worried about leading bits in the "bytes" that I read from the file being off. Is that even a matter of concern? Example: the number 2 is represented in binary as 10. That would be sufficient for me as a human. How is that different, as far as a computer is concerned, from 00000010 representing a 1-byte char value? Would the leading zeros affect the bit count? Does that in turn affect operations like XOR? I have trouble understanding the effects. Does it involve data loss? I really do not know.

Any help clearing up my misunderstanding will be appreciated!

What are the contents of the file? Is it actually 32 bytes you're reading? Or is it '0' or '1' characters (possibly 256 of them, possibly not) that you're reading and converting into numbers? — Kevin, Dec 03 '19 at 22:35
I would suggest you ignore this for the moment. Write some simple code to read and write data to a file and see if you run into trouble. If so, ask for more specific help. — David Schwartz, Dec 03 '19 at 22:37
1 byte is the smallest addressable data type in C/C++. `char` is 1 byte in size. But 1 byte is not necessarily 8 bits, though that is the most common size on modern platforms. Check your compiler's `CHAR_BIT` macro in `<limits.h>` (`<climits>` in C++) for the actual number of bits per byte. Or use a data type that is guaranteed to be 8 bits where it exists, like `uint8_t` — Remy Lebeau, Dec 03 '19 at 22:41

Eric Postpischil · Accepted Answer · 2019-12-03T22:54:14.790

Every routine in the C standard library that reads from a file or stream reads in units of bytes. Each byte read is a fixed number of bits; what is read from a file does not vary due to leading zeros or lack thereof in a byte. Some routines return a single character (which is a byte). Some routines put data read into a buffer and return a count of bytes read. Some routines, such as scanf, return a count of the number of items successfully converted. (You generally would not use these routines to read a fixed number of bytes.)
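To illustrate why leading zeros are not a concern, here is a minimal sketch of XOR over byte buffers. The names `xor_blocks` and `xor_demo` are hypothetical and used only for illustration. Every `unsigned char` occupies the full width of a byte, so the value 2 is stored as 00000010 (assuming 8-bit bytes) and XOR operates on all of those bits.

```c
#include <assert.h>
#include <stddef.h>

/* XOR two equally sized blocks byte by byte into out.
   Each unsigned char always holds a full byte, so a small value such
   as 2 still occupies every bit position; the high bits are simply 0. */
void xor_blocks(const unsigned char *a, const unsigned char *b,
                unsigned char *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        out[i] = (unsigned char)(a[i] ^ b[i]);
    }
}

/* Hypothetical demo: XOR the one-byte values 2 and 3. */
unsigned char xor_demo(void)
{
    unsigned char a[1] = { 2 };   /* 00000010 with 8-bit bytes */
    unsigned char b[1] = { 3 };   /* 00000011 with 8-bit bytes */
    unsigned char out[1];
    xor_blocks(a, b, out, 1);
    return out[0];                /* 00000001 with 8-bit bytes */
}
```

The same `xor_blocks` call with `n` set to 32 would process a whole 32-byte block; no bits are lost or skipped because a byte happens to hold a small value.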

The number of bits in a byte is set by the C implementation. It is reported in CHAR_BIT, defined in <limits.h>. It is eight in all common C implementations.
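A program that depends on 8-bit bytes can check this at compile time. The following is a sketch, not a required pattern; `bits_in_bytes` is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stop compilation on any implementation where a byte is not 8 bits. */
#if CHAR_BIT != 8
#error "this code assumes 8-bit bytes"
#endif

/* Number of bits occupied by n bytes on this implementation. */
size_t bits_in_bytes(size_t n)
{
    assert(n <= SIZE_MAX / CHAR_BIT);   /* guard against overflow */
    return n * (size_t)CHAR_BIT;
}
```

With the check in place, `bits_in_bytes(32)` is 256 on every implementation where the file compiles.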

Although the number of bits per byte does not vary, the number of bytes read from a stream can vary. If a stream is opened as a text stream, characters may be “added, altered, or deleted on input and output to conform to differing conventions for representing text in the host environment” (C 2018 7.21.2 2). To avoid this, you should open a stream as a binary stream.
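As a rough sketch of reading a fixed-size block from a binary stream, the code below uses `fread` and checks the returned byte count. `read_block`, `roundtrip_demo`, and `BLOCK_SIZE` are hypothetical names. The demo uses `tmpfile()`, which opens a temporary file in binary update mode; for a named file you would open it with `fopen(path, "rb")` instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 32

/* Read up to BLOCK_SIZE bytes; returns the number of bytes actually read.
   A short count means end of file or an error (check feof/ferror). */
size_t read_block(FILE *fp, unsigned char block[BLOCK_SIZE])
{
    assert(fp != NULL);
    return fread(block, 1, BLOCK_SIZE, fp);
}

/* Hypothetical round trip: write 32 bytes, read them back, and return the
   number of bytes read if the contents match, otherwise 0. */
size_t roundtrip_demo(void)
{
    unsigned char out[BLOCK_SIZE], in[BLOCK_SIZE];
    for (int i = 0; i < BLOCK_SIZE; ++i) {
        out[i] = (unsigned char)i;  /* includes 0x0A, 0x0D and 0x1A */
    }

    FILE *fp = tmpfile();           /* binary stream, so no translation */
    if (fp == NULL) {
        return 0;
    }

    size_t got = 0;
    if (fwrite(out, 1, BLOCK_SIZE, fp) == BLOCK_SIZE) {
        rewind(fp);
        got = read_block(fp, in);
        if (got != BLOCK_SIZE || memcmp(out, in, BLOCK_SIZE) != 0) {
            got = 0;
        }
    }
    fclose(fp);
    return got;
}
```

The test data deliberately contains byte values that some platforms treat specially in text mode (line feed, carriage return, and 0x1A), which is where a text stream could change the byte count.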

score 0 · Answer 2 · answered Dec 03 '19 at 22:42

The CHAR_BIT macro (defined in <climits>) will tell you how many bits make up a byte in the execution environment. However, I am not aware of any recent general-purpose hardware that uses bytes of other than 8 bits. Some specialized processors (such as those for digital signal processing) may use other sizes. Also, long-obsolete equipment used a wide variety of sizes (the typical alternative to 8 bits being 9).

Spencer · Answer 3 · 2019-12-03T22:51:38.240


No

C++ allows a char to be any size a platform requires, as long as it is at least 8 bits. However, the macro CHAR_BIT always gives the number of bits in a char.

So, to find out the number of bits in 32 bytes, you would use the formula 32*CHAR_BIT.

C++17 introduced the new type std::byte, which is not a character type and is always CHAR_BIT bits, as explained in the SO question "std::byte on odd platforms".

In order to find the number of bytes needed to hold 256 bits, you have a problem, because CHAR_BIT isn't always a divisor of 256. So, you have to decide what you want and use a more complicated formula. For example, (255+CHAR_BIT)/CHAR_BIT rounds up and will give you the number of bytes needed to hold 256 contiguous bits.
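The usual round-up (ceiling) division for this calculation can be sketched as below. `bytes_for_bits` is a hypothetical helper that takes the byte width as a parameter, so that widths other than the host's CHAR_BIT can be tried.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest number of bytes, each bits_per_byte wide, that can hold
   the given number of bits: integer division rounded up. */
size_t bytes_for_bits(size_t bits, size_t bits_per_byte)
{
    assert(bits_per_byte > 0);
    return (bits + bits_per_byte - 1) / bits_per_byte;
}
```

For example, `bytes_for_bits(256, 8)` is 32, while a 9-bit byte gives `bytes_for_bits(256, 9)` equal to 29, with 5 bits of the last byte unused. In real code the second argument would normally be `CHAR_BIT`.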

edited Dec 03 '19 at 22:51

answered Dec 03 '19 at 22:43

@Lebeau It's more complicated than that; see my edit. – Spencer Dec 03 '19 at 22:52
I would use `(256+(CHAR_BIT-1))/CHAR_BIT` instead. But point taken. – Remy Lebeau Dec 03 '19 at 23:01
