0

I'm looking for different solutions, including those in which is forbidden to use .NET libraries and those where I can use all of the advantages of them.

Here is the problem, I have two text files, textFile1 and textFile2. Each of them contains sorted integer numbers(this is the most important condition), like these displayed below :

textFile1      textFile2
0              1
2              3
4              5

I need to create 3rd text file, for example textFile3 by merging those two files, and expected result should be :

textFile3
0
1
2
3
4
5

My first idea was to copy those two text files line by line into two separate arrays and than use solution for merging two sorted arrays in new one, provided in this question. After that, I will copy those members of new array into textFile3, line by line.

Do you have any suggestion ? Maybe better approach ? Please write all of your ideas here, each of them will be helpful to me .

Community
  • 1
  • 1
Hank Mooody
  • 527
  • 11
  • 29
  • 2
    Can't you simply use the idea of merging in merge sort. – Abhishek Bansal Jun 09 '16 at 13:00
  • Is the resulting sequence complete or can it have gaps? – DerApe Jun 09 '16 at 13:00
  • 2
    well theres a bunch of examples of merging two arrays here on stack overflow.. it depends a little on how big the files are as to if thats feasible.. its up to you the coder to work out which is efficient.. we dont provide solutions, only coding assistance in fixing issues. – BugFinder Jun 09 '16 at 13:01
  • "memory consumption O(1)" would imply that the solution uses the same amount of memory regardless of how many elements there are. Is that really a requirement? – stuartd Jun 09 '16 at 13:06
  • @stuartd Yes, it is one of the requirements. – Hank Mooody Jun 09 '16 at 13:07
  • This sounds like a homework problem. – Moby Disk Jun 09 '16 at 13:18
  • You might be interested in my blog series about merging sorted sequences. See http://blog.mischel.com/2014/10/24/merging-sorted-sequences/ – Jim Mischel Jun 09 '16 at 13:37
  • @JimMischel thank you for your link, there are a lot of useful materials on your blog.However, I know how to merge two sorted arrays, question here is how to read files and merge them, should I copy them into arrays and after that merge them into one and copy values into 3rd one, or should I compare values directly from text files and record them into 3rd text file. – Hank Mooody Jun 09 '16 at 13:44

3 Answers3

2

Merging two files is a fairly simple modification to merging two arrays. The idea is to replace the array index increment with a read of the next line of the file. For example, the standard merge algorithm that I show in my blog (http://blog.mischel.com/2014/10/24/merging-sorted-sequences/) is:

while (not end of List A and not end of List B)
    if (List A current item <= List B current item)
        output List A current item
        advance List A index
    else
        output List B current item
        advance List B index

// At this point, one of the lists is empty.
// Output remaining items from the other
while (not end of List A)
    output List A current item
    advance List A index

while (not end of List B)
    output List B current item
    advance List B index

To make that merge files, you start by opening and reading the first line of each file. It gets kind of screwy, though, because you have to check for end of file. "Get the next line" is a bit ... odd.

int item1;
int item2;
bool eof1 = false;
bool eof2 = false;
string temp;
var file1 = File.OpenText(textFile1);
temp = file1.ReadLine();
if (temp == null)
    eof1 = true;
else
    item1 = int.Parse(temp);

// do the same thing for file2

Then we can do the standard merge:

while (!eof1 && !eof2)
{
    if (item1 <= item2)
    {
        outputFile.WriteLine(item1);
        // get next item from file1
        temp = file1.ReadLine();
        if (temp == null)
            eof1 = true;
        else
            item1 = int.Parse(temp);
    }
    else
    {
        // output item2 and get next line from file2
    }
}
// and the cleanup
while (!eof1)
{
    // output item1, and get next line from file1
}
while (!eof2)
{
    // output item2, and get next file from file2
}

The only thing different is that getting the next item is more involved than just incrementing an array index.

Jim Mischel
  • 131,090
  • 20
  • 188
  • 351
2

Merging two ordered sequences can easily be generalized and implemented as extension method like this:

public static class Algorithms
{
    public static IEnumerable<T> MergeOrdered<T>(this IEnumerable<T> seq1, IEnumerable<T> seq2, IComparer<T> comparer = null)       
    {
        if (comparer == null) comparer = Comparer<T>.Default;
        using (var e1 = seq1.GetEnumerator())
        using (var e2 = seq2.GetEnumerator())
        {
            bool more1 = e1.MoveNext(), more2 = e2.MoveNext();
            while (more1 && more2)
            {
                int compare = comparer.Compare(e1.Current, e2.Current);
                yield return compare < 0 ? e1.Current : e2.Current;
                if (compare <= 0) more1 = e1.MoveNext();
                if (compare >= 0) more2 = e2.MoveNext();
            }
            for (; more1; more1 = e1.MoveNext())
                yield return e1.Current;
            for (; more2; more2 = e2.MoveNext())
                yield return e2.Current;
        }
    }
}

Then the concrete task can be accomplished simply with:

static void Merge(string inputFile1, string inputFile2, string outputFile)
{
    Func<string, IEnumerable<KeyValuePair<int, string>>> readLines = file =>
        File.ReadLines(file).Select(line => 
            new KeyValuePair<int, string>(int.Parse(line), line));
    var inputLines1 = readLines(inputFile1);
    var inputLines2 = readLines(inputFile2);
    var comparer = Comparer<KeyValuePair<int, string>>.Create(
        (a, b) => a.Key.CompareTo(b.Key));
    var outputLines = inputLines1.MergeOrdered(inputLines2, comparer)
        .Select(item => item.Value);
    File.WriteAllLines(outputFile, outputLines);
}
Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343
1

They are both sorted lists and to avoid memory consumption, open a reader to both files. Read two line from both, compare ahead, write the sorted results and take action based on the current line of each file. E.g:Treat your sorted value in each file as a pointer and keep comparing and advancing from the lesser side until completion. This will ensure a small memory footprint that will perform better for large files than for smaller ones.

You can pinch an algorithm off of the web, here is one and another that even mentions 0(1). Ignore the fact it talks about arrays, your files are effectively sorted arrays so you don't need to duplicate that in memory.

Murray Foxcroft
  • 12,785
  • 7
  • 58
  • 86