0

I have a query with regard to one of the soln proposed for Algorithm for N-way merge

The soln proposed by member aioobe is as follows:

1. Create a priority queue

2. Iterate through each file f
    enqueue the pair (nextNumberIn(f), f) using the first value as priority key

3. While queue not empty
    1. dequeue head (m, f) of queue
    2. output m
    3. if f not depleted
        1. enqueue (nextNumberIn(f), f)

I did not understand steps 2 and 3 completely. Does step 2 need to read contents of all files into priority queue ? If that's the case, won't memory be a concern?

In step 3, i did not understand 3.3 (if f not deplete, enqueue). Could some one or the OP (aioobe) help me here to understand this. Many thanks.

Community
  • 1
  • 1
irappa
  • 749
  • 4
  • 11

1 Answers1

1

Step 2 reads only the first number from each file. It shouldn't be a memory problem unless you have a ton of files or very large numbers.

Step 3.3 reads the next number after m in the same file that m came from.

Keith Randall
  • 22,985
  • 2
  • 35
  • 54
  • Thanks for clarifying. But I do not think this soln work if the ranges of numbers overlap between the files. i.e. for example if the files F1 F2 and F3 have below content this method doesn't work. F1: 100 200 600 F2: 200 250 300 F3: 230 260 290 – irappa Jun 02 '13 at 06:25
  • @irappa sure the algorithm works like that. The first item from F1 is picked, then the second. (100,200)- afterwards the first from F2 (100,200,250) and so on. Maybe you want to show some code, maybe we can spot the bug. – Thomas Jungblut Jun 02 '13 at 09:28
  • If I want to merge the files F1 F2 and F3 into a single ordered list in file F4, what is the best approach. – irappa Jun 02 '13 at 13:55