
If I have a binary file with the following data (written on a platform where 1 byte == 8 bits): 0x01, 0x10, 0x20, 0x40, 0xff

Would the following program be portable to a platform where 1 byte != 8 bits?

#include <iterator>
#include <fstream>

int main()
{
    std::fstream file("binaryfile");
    std::istreambuf_iterator<char> iter{file}, end;
    for (; iter != end; ++iter) {
        char c{*iter};
    }
}

In other words, if the reading platform has 16 bits to a byte, will each char it reads hold two of the file's 8-bit bytes?

Community
wally

1 Answer


Most programs are probably not portable

Reading through the comments here, I think the answer is that this program is not fully portable as per the C++ standard. No program is. Edit: at least, no program that talks to a network, reads or writes to disk, or expects other platforms to answer or update data.

C++ is not Java: it sides with the platform's hardware rather than defining a fixed byte width. fgetc and fputc, for example, preserve values across a round trip, but only on the same platform. Network messages work because everyone assumes 8 bits to a byte.
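To make the round trip claim concrete, here is a minimal sketch that writes every unsigned char value with fputc and reads it back with fgetc on the same platform. The helper name round_trip_preserves_values is made up for illustration, and the sketch assumes a temporary file can be created.

```cpp
#include <climits>
#include <cstdio>

// Hypothetical helper: writes every unsigned char value to a temporary file
// with fputc, reads it back with fgetc, and reports whether each value
// survived the round trip on this platform.
bool round_trip_preserves_values()
{
    std::FILE* f = std::tmpfile();   // opened in binary update mode
    if (!f) return false;            // could not create a temporary file

    bool ok = true;
    for (int v = 0; v <= UCHAR_MAX; ++v)
        if (std::fputc(v, f) == EOF) ok = false;

    std::rewind(f);
    for (int v = 0; v <= UCHAR_MAX && ok; ++v)
        ok = (std::fgetc(f) == v);   // fgetc returns the unsigned char value as an int

    std::fclose(f);
    return ok;
}
```

This says nothing about what another platform with a different CHAR_BIT would see in the same file, which is the whole problem.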

If this is a concern, it is best to assert that the platform has 8 bits to a byte (CHAR_BIT is defined in <climits>): static_assert(CHAR_BIT==8, "Platform must have 8 bits to a byte.");
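In context, the assert might sit at the top of a translation unit, something like the sketch below. The bits_in helper is hypothetical and only there to show CHAR_BIT being used.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Refuse to compile on platforms whose bytes are not octets.
static_assert(CHAR_BIT == 8, "Platform must have 8 bits to a byte.");

// Hypothetical helper: number of bits in an object of type T on this platform.
template <class T>
constexpr std::size_t bits_in() { return sizeof(T) * CHAR_BIT; }
```

Because the check runs at compile time, a port to an unusual platform fails at build time instead of silently misreading files.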

Even without the assert there will be other alarm bells. A platform that does not have 8 bits to a byte but still talks to other platforms via networking or files will fail sooner rather than later, and porting code to it will require extra work to read and write data in the de facto 8-bit standard. This is much like the endianness issue, except that here one side has very clearly won.
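As a rough illustration of that extra work, the sketch below stores a 32-bit value as four explicit octets, most significant first, using only shifts and masks so the result depends on neither endianness nor CHAR_BIT. The names put_u32 and get_u32 and the choice of big-endian order are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append v as four octets, most significant octet first.
// Each element of `out` carries exactly 8 significant bits, whatever CHAR_BIT is.
void put_u32(std::vector<unsigned char>& out, std::uint_least32_t v)
{
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFFu));
}

// Hypothetical helper: rebuild the value from four octets starting at pos.
std::uint_least32_t get_u32(const std::vector<unsigned char>& in, std::size_t pos)
{
    std::uint_least32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | (in[pos + i] & 0xFFu);
    return v;
}
```

How each octet is then mapped onto the platform's native bytes when it hits the disk or the wire is exactly the part the standard leaves to the platform.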

But they could be made portable

Edit: the statement above might not always hold. With appropriate effort the program could be made portable. The following adaptation of Mooing Duck's code shows how the program and its iterator could account for a different number of bits per byte. It shows how a system with more bits per byte might read a file written by a system with fewer. This could be expanded to work both ways:

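#include <cstddef>
#include <iterator>
#include <climits>
#include <iostream>
#include <utility>

template<class base_iterator, std::size_t source_bits>
class bititerator {
public:
    //member typedefs instead of the std::iterator base class, which is deprecated since C++17
    typedef std::input_iterator_tag iterator_category;
    typedef unsigned char value_type;
    typedef std::ptrdiff_t difference_type;
    typedef unsigned char* pointer;
    typedef unsigned char& reference;
private:
    mutable base_iterator base;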
    mutable unsigned char bufferhi;
    mutable unsigned char bufferlo;
    mutable unsigned char bitc;
public:
    bititerator(const base_iterator& b) : base(b), bufferhi(0), bufferlo(0), bitc(0) {}
    bititerator& operator=(const bititerator&b) {base = b.base; bufferlo=b.bufferlo; bufferhi=b.bufferhi; bitc=b.bitc; return *this;}
    friend void swap(bititerator&lhs, bititerator&rhs) {std::swap(lhs.base, rhs.base); std::swap(lhs.bufferlo, rhs.bufferlo); std::swap(lhs.bufferhi, rhs.bufferhi); std::swap(lhs.bitc, rhs.bitc);}
    bititerator operator++(int) {bititerator t(*this); ++*this; return t;}
    unsigned char* operator->() const {operator*(); return &bufferlo;}
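    //compare whole buffered values only: leftover bits that cannot fill a source byte are ignored,
    //so a drained iterator equals the end iterator and is never dereferenced past the end
    friend bool operator==(const bititerator&lhs, const bititerator&rhs) {return lhs.base==rhs.base && lhs.bitc/source_bits==rhs.bitc/source_bits;}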
    friend bool operator!=(const bititerator&lhs, const bititerator&rhs) {return !(lhs==rhs);}
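    unsigned char operator*() const {
        static_assert(source_bits<CHAR_BIT, "bititerator only works when this platform has more bits per byte than the source");
        //make sure at least source_bits bits are in the buffers
        if (bitc < source_bits) {
            //assumes the source values are packed least significant bit first
            const unsigned next = static_cast<unsigned char>(*base);
            ++base;
            //the bitc buffered bits all sit in bufferlo: append the new byte
            //above them and keep the bits that do not fit in bufferhi
            bufferlo |= static_cast<unsigned char>((next<<bitc)&UCHAR_MAX);
            bufferhi = bitc ? static_cast<unsigned char>(next>>(CHAR_BIT-bitc)) : 0;
            bitc += CHAR_BIT;
        }
        //only the low source_bits bits belong to the current value
        return static_cast<unsigned char>(bufferlo & ((1u<<source_bits)-1));
    }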
    bititerator& operator++() {
        operator*();
        //shift the buffers down source_bits bits
        bufferlo >>= source_bits;
        bufferlo |= ((bufferhi<<(CHAR_BIT-source_bits))&UCHAR_MAX);
        bufferhi >>= source_bits;
        bitc -= source_bits;
        return *this;
    }
};

template<class base_iterator>
bititerator<base_iterator,6> from6bit(base_iterator it) {return bititerator<base_iterator,6>(it);}
bititerator<std::istreambuf_iterator<char>,6> from6bitStart(std::istream& str) {return bititerator<std::istreambuf_iterator<char>,6>{std::istreambuf_iterator<char>{str}};}
bititerator<std::istreambuf_iterator<char>,6> from6bitEnd(std::istream& str) {return bititerator<std::istreambuf_iterator<char>,6>{std::istreambuf_iterator<char>{}};}

#include <fstream>
int main()
{
    std::ifstream file("binaryfile", std::ios::binary);
    auto end = from6bitEnd(file);
    for (auto iter = from6bitStart(file); iter != end; ++iter)
        std::cout << *iter;
}
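For the "both ways" part, the same shifting idea can be run in the writing direction. The following is only a sketch on plain vectors, assuming values are packed least significant bit first into 8-bit units. The names pack_bits and unpack_bits are made up and are not part of Mooing Duck's iterator.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: pack values of `bits` bits each (1..8) into octets,
// least significant bit first. The last octet is zero-padded if needed.
std::vector<unsigned char> pack_bits(const std::vector<unsigned char>& values, unsigned bits)
{
    std::vector<unsigned char> out;
    unsigned long buffer = 0;   // pending bits, oldest in the lowest positions
    unsigned count = 0;         // number of pending bits
    const unsigned long mask = (1ul << bits) - 1;
    for (unsigned char v : values) {
        buffer |= (v & mask) << count;
        count += bits;
        while (count >= 8) {    // emit every complete octet
            out.push_back(static_cast<unsigned char>(buffer & 0xFFu));
            buffer >>= 8;
            count -= 8;
        }
    }
    if (count > 0)              // flush the final, zero-padded octet
        out.push_back(static_cast<unsigned char>(buffer & 0xFFu));
    return out;
}

// Hypothetical inverse of pack_bits: split octets back into `bits`-bit values.
std::vector<unsigned char> unpack_bits(const std::vector<unsigned char>& octets, unsigned bits)
{
    std::vector<unsigned char> out;
    unsigned long buffer = 0;
    unsigned count = 0;
    const unsigned long mask = (1ul << bits) - 1;
    for (unsigned char o : octets) {
        buffer |= static_cast<unsigned long>(o & 0xFFu) << count;
        count += 8;
        while (count >= bits) { // emit every complete value
            out.push_back(static_cast<unsigned char>(buffer & mask));
            buffer >>= bits;
            count -= bits;
        }
    }
    return out;                 // leftover bits that do not fill a value are dropped
}
```

Note that the padding in the last octet can be wide enough to decode as one extra zero value (three 6-bit values occupy 18 bits but three octets hold 24), so a real format would need to carry the value count separately.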
  • "No program is" well that's wrong. You're right that serialization to the 8-bit based standards we use for modern files and networking would be hard, but it's a long way from impossible. – Mooing Duck Apr 08 '16 at 21:16
  • @MooingDuck Here's the opportunity to prove me wrong. How would you update the above code to be fully portable as per the C++ standard? – wally Apr 08 '16 at 21:18
  • This only reads from source based on fewer bits, not more, but the same idea can be flexed out to handle both: http://coliru.stacked-crooked.com/a/b569bbd878e68d6d I'm busy so that's untested. – Mooing Duck Apr 08 '16 at 22:39
  • @MooingDuck Wow. I stand corrected. One could write the program (and even the libraries) to be fully portable with regard to the number of bits per byte. – wally Apr 09 '16 at 03:09
  • Yeah. It's not pretty, it's not easy, and it sure ain't fast, but it can be done. – Mooing Duck Apr 09 '16 at 05:05