
I have this struct with 2 attributes (one char and one int), for a memory usage of 3 bytes:

struct Node {
    char data;
    int frequency;
};

I am trying to overload the operators << and >> for this struct, so that I can read and write it from and to a file using fstream. For the operator << I have:

  friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
    string data = string(1, e.data) + to_string(e.frequency);
    output << data.data();
    return output;
  };

which makes me wonder how much space this returns to the output (3 bytes, as expected? - 1 from the char and 2 from the int?)

When I want to save the struct to the file, I have this:

List<Node> inOrder = toEncode.inOrder();
for(int i=1; i<=inOrder.size(); i++) {
  output << inOrder.get(i)->getData();
}

where each node of the list inOrder and of the tree toEncode above is the struct listed before, and inOrder.get(i)->getData() returns it. output is the fstream.

Now, how I do the reading from the file? with the operator >>, what I understand is that it need take an unsigned char array with 3 elements as input, and take the first element (1 byte) and convert to char, and the 2 other elements and convert for an int. Is this correct? Do I can do that with this method:

  friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
    ...
  };

or I need to change the method signature (and parameters)? And for the file reading itself, taking into consideration that the characters I need to read are all on the first line of the file, what is the code to make the program read 3 characters at a time from this line and generate a struct from this data?

Update

What I have so far:

To write to the file, I implemented this operator <<:

  friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
    union unsigned_data data;
    data.c = e.data;

    union unsigned_frequency frequency;
    frequency.f = e.frequency;

    output << data.byte << frequency.byte;
    return output;
  };

used this way:

List<HuffmanNode> inOrder = toEncode.inOrder();
for(int i=1; i<=inOrder.size(); i++)
  output << inOrder.get(i)->getData();

This seems to work, but I can't be sure without a way to read from the file, for which I have this:

operator>>:

  friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
    union unsigned_data data;
    data.c = e.data;

    union unsigned_frequency frequency;
    frequency.f = e.frequency;

    input >> data.byte >> frequency.byte;
    return input;
  };

used this way:

string line;
getline(input, line);

HuffmanNode node;
stringstream ss(line);
long pos = ss.tellp();
do {
  ss >> node;
  toDecode.insert(node);
  ss.seekp (pos+3);
} while(!ss.eof());

This seems to get stuck in an infinite loop. Both operators use these unions:

union unsigned_data {
  char c;
  unsigned char byte;
};

union unsigned_frequency {
  int f;
  unsigned char byte[sizeof(int)];
};
Kleber Mota
  • size of `int` in common systems is 4 bytes (32 bits), are you sure of your "3 bytes" usage ? maybe I am misunderstanding something... anyway, you'll have far less problem simply using `fwrite` : `fwrite(&mystruct, sizeof(mystruct), 1 fs);` –  May 29 '22 at 22:38
  • *"how much space this returns to the output (3 bytes, as expected? - 1 from the char and 2 from the int?)"* -- You converted the `int` to a string. Are you saying that you expect a string like `"-123"` to be output as 2 bytes? Or are you saying that you expect that an `int` cannot hold a value above `99` nor below `-9`? – JaMiT May 29 '22 at 22:38
  • @JaMiT What I mean is, since `int` takes 2 or 4 bytes to be stored in memory, it should takes 2 or 4 elements from an `unsigned char` array to store it (adding to the 1 element for the `char` atribute). – Kleber Mota May 29 '22 at 23:13
  • @Sedenion but could I do that with `fstream` and the operator >>? – Kleber Mota May 29 '22 at 23:14
  • @KleberMota But you write the string representing the int. So "-2147483648" as its longest. And your struct takes 8 bytes. – Goswin von Brederlow May 29 '22 at 23:17
  • @GoswinvonBrederlow What should be the right way to do that then? And how the `struct` is taking 8 bytes? `char` = 1 byte and `int` = 2 or 4 bytes, right? – Kleber Mota May 29 '22 at 23:21
  • @KleberMota *"it should takes 2 or 4 elements from an unsigned char array to store it"* -- you have abused the pronoun "it". It should take 2 or 4 elements from an unsigned char array to store **the value** of a 2 or 4 byte integer. However, you are drawing conclusions as if "it" means "the string representation" (as produced by `to_string`) rather than "the value". You can use a single "it" to mean either of these alternatives, but not both. – JaMiT May 29 '22 at 23:24
  • @JaMiT sorry, that's my bad. But what would be the correct way to convert these 2 attributes into the `unsigned char` array then? – Kleber Mota May 29 '22 at 23:26
  • @KleberMota What Sedenion wrote, except do that for each member if you want to save space. – Goswin von Brederlow May 29 '22 at 23:27
  • @GoswinvonBrederlow Can't I do that with `fstream` and the operators << and >>? And what does the `1 fs` in the code mean? – Kleber Mota May 29 '22 at 23:29
  • @JaMiT But the title of this question was: **convert struct to unsigned char through overloaded operator << and >>**. how is this another question? – Kleber Mota May 29 '22 at 23:39
  • @KleberMota "*how the struct is taking 8 bytes? char = 1 byte and int = 2 or 4 bytes, right?*" - yes, but you are not accounting for **alignment padding** that may be present *between* the two fields. [Why isn't sizeof for a struct equal to the sum of sizeof of each member?](https://stackoverflow.com/questions/119123/) – Remy Lebeau May 30 '22 at 01:07
  • @RemyLebeau OK, but how do I do what I asked in the question then? – Kleber Mota May 30 '22 at 01:30
  • @KleberMota see the answer I just posted. – Remy Lebeau May 30 '22 at 01:58
  • @KleberMota Sorry, I sometimes ignore titles (I expect question bodies to stand on their own). You're right, you repeated the question in the title ("convert struct to unsigned char"). It just happens that this question does not match any of the ones in the question body ("how much space this returns to the output", "how I do the reading from the file" -- note that this is the reverse conversion, not mentioned in the title -- "Is this correct", "Do I can do that with [X] or I need [Y]"). Since I was focused on the question body, I had not counted the title among the questions asked. – JaMiT May 30 '22 at 14:35

1 Answer


how much space this returns to the output (3 bytes, as expected? - 1 from the char and 2 from the int?)

No. You are converting the values to std::string, so the output has a variable length that depends on the particular values (i.e., "123" takes up fewer characters than "1234567890"). What you describe applies to the binary storage of the values, not to their textual representation.
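To make the difference concrete, here is a minimal sketch comparing the number of characters the to_string-based operator<< from the question emits with the number of bytes the two members occupy. The helper names textLength, packedSize and checkSizes are made up for illustration, and the values assume an int of at least 32 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same layout as the struct in the question.
struct Node {
    char data;
    int frequency;
};

// Characters emitted by the to_string-based operator<<:
// one for the char plus one per digit (and sign) of the frequency.
std::size_t textLength(const Node& n) {
    return (std::string(1, n.data) + std::to_string(n.frequency)).size();
}

// Bytes needed when each member is written in binary, one after the other.
constexpr std::size_t packedSize() {
    return sizeof(char) + sizeof(int);
}

// Hypothetical self-check; call it from main().
void checkSizes() {
    assert(textLength(Node{'a', 7}) == 2);            // "a7"
    assert(textLength(Node{'a', 1234567890}) == 11);  // "a1234567890"
    assert(textLength(Node{'a', -123}) == 5);         // "a-123"
    assert(packedSize() == 1 + sizeof(int));
    assert(sizeof(Node) >= packedSize());             // padding may add bytes
}
```

On many desktop platforms sizeof(int) is 4 and sizeof(Node) is 8 because of alignment padding, but neither value is guaranteed by the language.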

Now, how I do the reading from the file? with the operator >>, what I understand is that it need take an unsigned char array with 3 elements as input, and take the first element (1 byte) and convert to char, and the 2 other elements and convert for an int. Is this correct?

No. operator<< and operator>> are primarily meant to be used for formatted (textual) I/O. Your operator<< is actually writing formatted output (though, you don't need to convert the values to std::strings first, you can write them as-is using relevant overloads of operator<<). You just need to write the formatted data in such a way that your operator>> can reverse it. For example:

friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
    output << int(e.data) << ' ' << e.frequency << ' ';
    return output;
}

friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
    int i;
    input >> i >> e.frequency;
    e.data = char(i);
    input.ignore();
    return input;
}

Alternatively:

friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
    output << e.data << e.frequency << '\n';
    return output;
}

friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
    e.data = input.get();
    input >> e.frequency;
    input.ignore();
    return input;
}

The formatting is really up to you, based on your particular needs.
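As a rough illustration of how the first (space-separated) format round-trips, the sketch below writes a few nodes to a stringstream and reads them back. The HuffmanNode here is a minimal stand-in with an assumed layout, and readAll and textRoundTripDemo are hypothetical helpers. The read loop tests the stream state returned by operator>> instead of eof(), so it stops as soon as an extraction fails.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Minimal stand-in for the question's HuffmanNode (assumed layout).
struct HuffmanNode {
    char data;
    int frequency;

    // Write the character as its numeric code so that spaces and
    // newlines stored in data survive the round trip.
    friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
        return output << int(e.data) << ' ' << e.frequency << ' ';
    }

    friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
        int code;
        if (input >> code >> e.frequency) {
            e.data = char(code);
        }
        return input;
    }
};

// Read nodes until an extraction fails (end of data or malformed input).
std::vector<HuffmanNode> readAll(std::istream& input) {
    std::vector<HuffmanNode> nodes;
    HuffmanNode node;
    while (input >> node) {
        nodes.push_back(node);
    }
    return nodes;
}

// Hypothetical self-check; call it from main().
void textRoundTripDemo() {
    std::stringstream buffer;
    buffer << HuffmanNode{'a', 3} << HuffmanNode{' ', 12};
    std::vector<HuffmanNode> nodes = readAll(buffer);
    assert(nodes.size() == 2);
    assert(nodes[0].data == 'a' && nodes[0].frequency == 3);
    assert(nodes[1].data == ' ' && nodes[1].frequency == 12);
}
```

The same readAll loop works with an std::ifstream, since it only relies on the std::istream interface.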

However, the operators can also be used to read and write binary data (just be sure to open the streams in binary mode), e.g.:

friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
    output.write(&e.data, sizeof(e.data));
    output.write(reinterpret_cast<const char*>(&e.frequency), sizeof(e.frequency));
    return output;
}

friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
    input.read(&e.data, sizeof(e.data));
    input.read(reinterpret_cast<char*>(&e.frequency), sizeof(e.frequency));
    return input;
}

This is more in line with what you were thinking of.
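For completeness, here is a sketch of how the binary operators might be used with files opened in binary mode. The HuffmanNode is again a minimal stand-in, and saveNodes, loadNodes and binaryRoundTripDemo are hypothetical helpers, not part of the question's code. Note that the raw bytes of an int depend on the platform's int size and byte order, so a file written this way is only safely read back by a program built for a compatible platform. The demo values assume an int of at least 32 bits.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <istream>
#include <ostream>
#include <string>
#include <vector>

// Minimal stand-in for the question's HuffmanNode (assumed layout).
struct HuffmanNode {
    char data;
    int frequency;

    friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
        output.write(&e.data, sizeof(e.data));
        output.write(reinterpret_cast<const char*>(&e.frequency), sizeof(e.frequency));
        return output;
    }

    friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
        input.read(&e.data, sizeof(e.data));
        input.read(reinterpret_cast<char*>(&e.frequency), sizeof(e.frequency));
        return input;
    }
};

// Write every node to `path` as raw bytes, member by member.
void saveNodes(const std::string& path, const std::vector<HuffmanNode>& nodes) {
    std::ofstream output(path, std::ios::binary);
    for (const HuffmanNode& node : nodes) {
        output << node;
    }
}

// Read nodes until a read fails (end of file or a truncated record).
std::vector<HuffmanNode> loadNodes(const std::string& path) {
    std::ifstream input(path, std::ios::binary);
    std::vector<HuffmanNode> nodes;
    HuffmanNode node;
    while (input >> node) {
        nodes.push_back(node);
    }
    return nodes;
}

// Hypothetical self-check; call it from main() with a writable path.
void binaryRoundTripDemo(const std::string& path) {
    saveNodes(path, {{'a', 3}, {'\n', 70000}, {'\0', -5}});
    std::vector<HuffmanNode> nodes = loadNodes(path);
    assert(nodes.size() == 3);
    assert(nodes[1].data == '\n' && nodes[1].frequency == 70000);
    assert(nodes[2].data == '\0' && nodes[2].frequency == -5);
    std::remove(path.c_str());  // clean up the demo file
}
```

Because each record has a fixed size, there is no need to read a line first or to seek manually: every successful `input >> node` consumes exactly one record.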

Remy Lebeau