-1

I'm trying to merge two files that contain some numbers into a third file but I'm not getting the right result.

This is my code:

void merge(string input_file1, string input_file2, string output_file){
    fstream fs1; 
    fstream fs2; 
    fstream fs3;
    int n1, n2;  
    fs1.open(input_file1); 
    fs2.open(input_file2);
    fs3.open(output_file); 

    while(fs1 >> n1 && fs2 >> n2){
        if(n1 < n2){
            fs3 << n1 << " ";
            fs1 >> n1;
        }
        else{
            fs3 << n2 << " "; 
            fs2 >> n2;
        }
    }
    while(fs1 >> n1)
        fs3 << n1 << " ";
    
    while(fs2 >> n2) 
        fs3 << n2 << " ";
}

input:

input file1: 1 2 3 4 5 6 7

input file2: 34 56 77 78 88 90 100

output file: 1 3 5 7 88 90 100

Fareanor
  • 5,900
  • 2
  • 11
  • 37
sam
  • 79
  • 6
  • Have you considered reading both files into an stl container (std::list for example) and then calling `sort` on it? – Refugnic Eternium Nov 28 '22 at 13:44
  • have you used a debugger to see whats wrong in the code? – 463035818_is_not_an_ai Nov 28 '22 at 13:44
  • your code assumes that both files are sorted, but they are not. Your title also says they would be sorted, but your example input not – 463035818_is_not_an_ai Nov 28 '22 at 13:45
  • that's what I want to avoid as it will be some extra steps and code and that will make the code inefficient so I thought I'd do it straight from and to the file @RefugnicEternium – sam Nov 28 '22 at 13:45
  • The issue is, that you keep reading 'the next number' from both files, regardless of whether one was lower than the one you read last. You'll need some buffering and only read 'the other file' once you've found the right place of the number you've read last. – Refugnic Eternium Nov 28 '22 at 13:46
  • @463035818_is_not_a_number it doesn't matter still the same problem. I just edited the input and output take a look, please – sam Nov 28 '22 at 13:49
  • You read twice per loop, and as @Refugnic Eternium pointed out, also read when you did not write anything. I strongly suggest you print the values in each loop to get a simple feedback of what's going on. – Cedric Nov 28 '22 at 13:52
  • 1
    @sam *that's what I want to avoid as it will be some extra steps and code and that will make the code inefficient* -- Do you really think that writing a file in bits and pieces as you have done is efficient? Memory is much faster than file I/O -- reading those numbers in a container *one time*, calling sort *one time*, and then writing *one time*, will more than likely be faster than what you are attempting. – PaulMcKenzie Nov 28 '22 at 13:52
  • @PaulMcKenzie yeah probably that's the easiest way but I wanted to challenge my way of thinking and thought I could do it this way. Notice that the actual file is much bigger than that. it's probably hundreds of thousands of numbers but those two input files are just some testing files – sam Nov 28 '22 at 13:57
  • of course it does matter. If you put garbage in you can expect to get garbage out. Only if the preconditions are met (input is sorted) you can expect to get meaningful output – 463035818_is_not_an_ai Nov 28 '22 at 14:07
  • @sam -- What you are doing isn't the way to go about this. There are sorting algorithms that work with data that may not fit into memory. At some point, you must buffer, even a a few items, the data somewhere. [See this](https://stackoverflow.com/questions/20802396/how-external-merge-sort-algorithm-works). The idea of sorting huge amounts of data using a device that has limited memory has been already addressed (and solved) by creating external sorting algorithms. – PaulMcKenzie Nov 28 '22 at 14:08
  • Is there a reason your output does not contain numbers like 2, 4, 6, 34, 56, 77, 78? Or is that the wrong output you are getting currently and you want to get an output that actually contains all numbers? – Ranoiaetep Nov 28 '22 at 17:16
  • @Ranoiaetep I wanted the files to be sorted into one output file it didn't work at first due to a bug in my code but one of the guys here helped me figure it out – sam Nov 28 '22 at 17:58

3 Answers3

2

If n1 < n2, you do fs1 >> n1 twice (once immediately and then once in the loop condition) and discard both the first value and n2; if n1 >= n2, you do fs2 >> n2 twice and discard both its first value and n1.

You can't do any unconditional reading in the "selection loop", as you should only replace the smallest number.

Something like this (untested):

// Potentially read the first two numbers.
fs1 >> n1;
fs2 >> n2;
// This loop is entered only if there were two numbers.  
// Note that the last successfully read number is written only for the shorter input (if any).
while (fs1 && fs2)
{
    if (n1 < n2)
    {
        fs3 << n1 << ' ';
        fs1 >> n1;
    }
    else
    {
        fs3 << n2 << ' ';
        fs2 >> n2;
    }
}
// Write-then-read because the last value read has not been written yet.
while (fs1)
{
    fs3 << n1;
    fs1 >> n1;
}
while (fs2)
{
    fs3 << n2;
    fs2 >> n2;
}
molbdnilo
  • 64,751
  • 3
  • 43
  • 82
1

There is already a very good an accepted answer by user molbdnilo. Good.

Maybe, to be complete, I will add a "more modern" C++ solution.

Please see the very simple example:

#include <iostream>
#include <fstream>
#include <vector>
#include <iterator>
#include <algorithm>
using namespace std::string_literals;

void merge(const std::string& in1, const std::string& in2, const std::string& out) {
    if (std::ifstream ifs1{ in1 }, ifs2(in2); ifs1 and ifs2)
        if (std::ofstream ofs{ out }; ofs) {
            
            std::vector data{ std::istream_iterator<int>(ifs1), {} };
            data.insert(data.end(), std::istream_iterator<int>(ifs2), {});
            std::sort(data.begin(), data.end());
            std::copy(data.begin(), data.end(), std::ostream_iterator<int>(ofs, " "));
        }
        else std::cerr << "\nError: Could not open input files\n\n";
    else std::cerr << "\nError: Could not open output files\n\n";
}
int main() {
    merge("in1.txt"s, "in2.txt"s, "out.txt"s);
}
A M
  • 14,694
  • 5
  • 19
  • 44
  • thank you but I didn't learn about iterators stream iterators yet so this looks intimidating to me haha – sam Nov 28 '22 at 14:47
0

@A_M already gave a good solution utilizing C++ standard library, however you can further modernize them in C++20 by using views::istream and range version of algorithms:

#include <algorithm>
#include <fstream>
#include <filesystem>
#include <ranges>

void merge(const std::filesystem::path& in1, const std::filesystem::path& in2, const std::filesystem::path& out)
{
    auto in1_stream = std::ifstream(in1), in2_stream = std::ifstream(in2);
    auto out_stream = std::ofstream(out);
    std::ranges::merge(
            std::views::istream<int>(in1_stream), std::views::istream<int>(in2_stream),
            std::ostream_iterator<int>(out_stream, " ")
    );
}

int main()
{
    merge("input1.txt", "input2.txt", "output.txt");
}
Ranoiaetep
  • 5,872
  • 1
  • 14
  • 39