1

I've been studying about the merge sort algorithm and I can't conclude how many arrays are actually created as part of the algorithm. Certain literature says that the entire original array is copied into a new array that's sorted. But that means only 2 arrays were created.

As far as I understand, the merge sort algorithm has 2 major steps. Splitting and Merging. I thought if you give merge sort an array of 100 slots, it actually creates new arrays as it splits down the middle from 100 > 50/50 > 25/25 > 12/13 > 6/5 > 3/3 > 1/1. After all those splits at this point it has 12 or 13 arrays in memory. And then when finally it copies all these into new arrays from 1 > 2 > 3 or 4 > 6 > 12 > 25 > 50 > 100.

That's another 8 arrays (roughly speaking).

But in text books I read things like this:

You may wonder where all these subarrays are located in memory. In the algorithm, a workspace array of the same size as the original array is created. The subarrays are stored in sections of the workspace array. This means that subarrays in the original array are copied to appropriate places in the workspace array. After each merge, the workspace array is copied back into the original array.

-- Data Structures and Algorithms in Java By Robert Lafore

Doublespeed
  • 1,153
  • 4
  • 16
  • 29
  • You can implement merge sort in place: http://stackoverflow.com/q/2571049/3788176. Or not. So the number of arrays created depends upon the implementation. – Andy Turner Mar 22 '16 at 16:51
  • 1
    These comment provided don't give enough detail. Of course it depends on the implementation. But let's say it's not an in place merge sort (Rocket Science). It's the basic merge sort in CS101, then how many array get created?? – Doublespeed Mar 22 '16 at 16:53

3 Answers3

1

Even if you "split" your array you don't have to create 2 new arrays. You always have the same number of slots (100 here, even if it is 50*2 or 25*4...)

You just move the data at the correct place in the array (after first split the 50 first slots are the first array and the 50 last slots are the second array )

You don't have to store arrays, just indices of where they start in the big array.

Fabich
  • 2,768
  • 3
  • 30
  • 44
1

Consider a pair of final stages of the typical merge sort implementation. We need two arrays - main and auxiliary one, and merge sub-arrays from aux array to the main one.

before some merge stage (| denotes beginning of each array piece (starting index)) - we are merging A subarray with B subarray to fill the first half of destination array, and merging C subarray with D subarray to fill the second half of destination array:

Auxiliary array:  |AAAA|BBBB|CCCC|DDDD
Main array:       ..................

after merge (I use arbitrary ordering)

Auxiliary array:  ...............
Main array:       ABBABBAADCDDCCDC

after copy, before final merge

Auxiliary array:  |ABBABBAA|DCDDCCDC
Main array:      ................
MBo
  • 77,366
  • 5
  • 53
  • 86
  • So it's really only 2 arrays? I'm referring to the recursive approach to merge sort. So the same array is passed to the recursive invocations or Merge-sort and Merge()? – Doublespeed Mar 22 '16 at 17:30
  • Yes, only 2 arrays. Look at the first code here: https://en.wikipedia.org/wiki/Merge_sort#Top-down_implementation - arrays A and B (there destination array of merge phase is aux B) – MBo Mar 22 '16 at 22:00
0

The text book is referring to reasonably efficient implementations of merge where a one time allocation of a workspace array the same size as the original is done. After that point, then all of the sub-arrays to be merged are portions of either the workspace array or the original array, accessed via indices (or pointers).

A top down merge sort uses recursion to generate a dynamic stack of pairs of indices to represent the sub-arrays. A bottom up merge sort skips the recursion, and uses iteration to generate the indices, starting with a sub-array size of 1.

rcgldr
  • 27,407
  • 3
  • 36
  • 61