  1. I have two text files that I read, and combine into one data structure.
  2. I then take the combined data structure and output it to the console using std::cout.
  3. I redirect stdout to a file: main.exe > my_log.txt

However, I am getting really weird output, and I am guessing it's something to do with the stream/stdout.

Here is what I am getting on output:

  1. The first characters in the stream are always ÿþ
  2. There is an extra, 'hidden' '\0' after every single character, even though I am only outputting integers.
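For what it's worth, ÿþ is how the bytes 0xFF 0xFE look when viewed as Latin-1, and those two bytes are the UTF-16 little-endian byte order mark. A minimal sketch for checking whether a file starts with that mark is below; the function names are made up for illustration, and the path passed in is whatever file you want to inspect.

```cpp
#include <fstream>
#include <string>

// Returns true if the byte sequence starts with the UTF-16 little-endian
// byte order mark, 0xFF 0xFE (displayed as "ÿþ" when read as Latin-1).
bool starts_with_utf16le_bom(const std::string& bytes) {
  return bytes.size() >= 2 &&
         static_cast<unsigned char>(bytes[0]) == 0xFF &&
         static_cast<unsigned char>(bytes[1]) == 0xFE;
}

// Reads the first two bytes of a file in binary mode and checks them.
// Returns false if the file cannot be opened or is shorter than two bytes.
bool file_has_utf16le_bom(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  char head[2] = {0, 0};
  in.read(head, 2);
  return in.gcount() == 2 && starts_with_utf16le_bom(std::string(head, 2));
}
```

Running `file_has_utf16le_bom` on both my_log.txt and the input files would show which of them are actually UTF-16.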

One of the input files may also contain the same weird characters, so I am thinking that reading this file might be changing something.
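If an input file does turn out to be UTF-16 LE, its bytes would need converting before `std::stoi` sees them. The sketch below assumes the content is pure ASCII (digits, commas, newlines) stored as UTF-16 LE, and that the bytes were read with the stream opened in `std::ios::binary`. The function name is hypothetical, and this is not a general UTF-16 decoder.

```cpp
#include <cstddef>
#include <string>

// Converts UTF-16 LE bytes to a narrow string, assuming every code unit is
// ASCII. A leading byte order mark (0xFF 0xFE) is skipped. Any code unit
// outside the ASCII range is replaced with '?'. A trailing odd byte is ignored.
std::string utf16le_ascii_to_narrow(const std::string& bytes) {
  std::size_t i = 0;
  if (bytes.size() >= 2 &&
      static_cast<unsigned char>(bytes[0]) == 0xFF &&
      static_cast<unsigned char>(bytes[1]) == 0xFE) {
    i = 2;  // skip the BOM
  }
  std::string out;
  for (; i + 1 < bytes.size(); i += 2) {
    const unsigned char lo = static_cast<unsigned char>(bytes[i]);
    const unsigned char hi = static_cast<unsigned char>(bytes[i + 1]);
    out.push_back(hi == 0 && lo < 0x80 ? static_cast<char>(lo) : '?');
  }
  return out;
}
```

The design choice here is to stay dependency-free. For real text, a proper conversion routine from the platform or a Unicode library would be the safer route.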

Any ideas on what the problem is?

Here is example code of the exact process I am doing (some checks removed):

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

struct MyStruct {
  int a;
  int b;
  int c;
};

std::vector<MyStruct> all_structs;

int main() {
  // I do the exact same process for the two input files
  std::ifstream my_file("path/to/file", std::ios::in);
  
  while (!my_file.eof()) {
    std::string line;
    std::getline(my_file, line);
    MyStruct tmp_struct;
    tmp_struct.a = std::stoi(line.substr(0, 4));
    // Repeat for fields b and c
    
    all_structs.push_back(tmp_struct);
  }

  // Now output to console
  // I have tried using std::cout, printf(), converting to char array using sprintf() and then
  // printing. All the same.
  for (auto& s : all_structs) {
    std::cout << s.a << "," << s.b << "," << s.c << std::endl;
  }
}
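Separately from the encoding issue, the comments point out that `while (!my_file.eof())` can run one extra iteration. A minimal sketch of the suggested alternative, where the result of `std::getline` drives the loop, is below. `read_lines` is a hypothetical helper that takes any `std::istream`, so it can be tried on a string stream as well as a file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collects every line from the stream. The loop body only runs when
// std::getline actually succeeded, so there is no extra pass after the
// last line has been read.
std::vector<std::string> read_lines(std::istream& in) {
  std::vector<std::string> lines;
  for (std::string line; std::getline(in, line);) {
    lines.push_back(line);
  }
  return lines;
}
```

With input such as "1234\n5678\n", the eof-based loop would attempt a third read and hand an empty string to `std::stoi`; this form stops after the second line.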

Edit: I appreciate all the comments and suggestions. Some of them propose closing this question as a duplicate. I would argue that the 'duplicate question' is the solution to this question, not the same question: the two are completely different, and a programmer who encounters this problem might not know to search for 'Powershell BOM UTF16'.

user2840470
  • Sounds like your input file is UTF-16 with a BOM at the start. – Eljay Dec 06 '21 at 19:24
  • @Eljay how do I change/check that. Not sure if it matters but I am on windows. – user2840470 Dec 06 '21 at 19:26
  • The output is UTF-16 with BOM. It's not your program doing that, it's the shell. You need to view it in an editor that is compatible with it, for instance notepad. If you want your output file to be a 1:1 binary representation, don't use Powershell for redirection, it handles redirection of streams from programs as text stream! – CherryDT Dec 06 '21 at 19:28
  • `while (!my_file.eof()) {` can be a cause of 1 extra input: [https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-i-e-while-stream-eof-cons](https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-i-e-while-stream-eof-cons) – drescherjm Dec 06 '21 at 19:28
  • Does this answer your question? [Changing PowerShell's default output encoding to UTF-8](https://stackoverflow.com/questions/40098771/changing-powershells-default-output-encoding-to-utf-8) – CherryDT Dec 06 '21 at 19:29
  • Well the way **I** check that is: I open the file in **Vim** and do a `:set fenc?` and `:set bomb?`. – Eljay Dec 06 '21 at 19:29
  • If you use cmd or git bash instead of powershell, you won't have this problem. – CherryDT Dec 06 '21 at 19:34
  • To eradicate _one_ of the problems (that @drescherjm mentioned above), change the `while (!my_file.eof()) { std::string line; std::getline(my_file, line); ...` part to `for(std::string line; std::getline(my_file, line);) { ...` – Ted Lyngmo Dec 06 '21 at 19:40
  • What does the file contain? – Captain Hatteras Dec 06 '21 at 20:31
