2

I have noticed many of the backtracking problems have two ways of solving.

One is to return "whatever's the required list", vs passing-through the "result" of every call and appending to it. What is the downside of returning?(Is it less memory/time efficient) ? Example: To print all possible permutations, what makes this solution inefficient vs the second one? Sorry if this isn't the right forum to ask this, I couldn't find a better place.

public List<List<Integer>> perm(int[] nums){
    List<List<Integer>> result = new ArrayList<List<Integer>>();
    if(nums.length == 0){
        result.add(new ArrayList<Integer>());
        return result;        
    }
    for(int i= 0;i<nums.length;i++){
        int first = nums[i];
        int[] remnums = new int[nums.length-1];
        int j = 0;
        for(int cur : nums){
            if(cur != first){
                remnums[j] = cur;j++;
            }
        }
        List<List<Integer>> tmp = perm(remnums);

        for(List<Integer> t : tmp){
            t.add(0,first);

        }
        result.addAll(tmp);
    }
    return result;
}

2nd approach ---

  public List<List<Integer>> permute(int[] nums) {
   List<List<Integer>> list = new ArrayList<>();
   // Arrays.sort(nums); // not necessary
   backtrack(list, new ArrayList<>(), nums);
   return list;
}

private void backtrack(List<List<Integer>> list, List<Integer> tempList, int [] nums){
   if(tempList.size() == nums.length){
      list.add(new ArrayList<>(tempList));
   } else{
      for(int i = 0; i < nums.length; i++){ 
         if(tempList.contains(nums[i])) continue; // element already exists, skip
         tempList.add(nums[i]);
         backtrack(list, tempList, nums);
         tempList.remove(tempList.size() - 1);
      }
   }
} 
Oguz
  • 1,867
  • 1
  • 17
  • 24
code4fun
  • 2,661
  • 9
  • 25
  • 39
  • https://codereview.stackexchange.com ? – Naman Sep 25 '17 at 01:46
  • ok.what about cs.stackexchange.com ? – code4fun Sep 25 '17 at 01:52
  • Preferably codereview. – Naman Sep 25 '17 at 01:52
  • 1
    created https://codereview.stackexchange.com/questions/176455/two-approaches-to-print-all-permutations-returning-versus-passing-through-the – code4fun Sep 25 '17 at 01:54
  • 2
    I'm voting to close this question as off-topic because this is now moved to https://codereview.stackexchange.com/questions/176455/two-approaches-to-print-all-permutations-returning-versus-passing-through-the – Naman Sep 25 '17 at 01:54
  • This isn't code that needs optimizing, but a choice between two strategies that the asker is trying to understand better. I don't see why this would be off-topic here or better suited on codereview. – m69's been on strike for years Sep 26 '17 at 00:31
  • Have you actually timed your code? That `contains` check in the second example looks like it's going to cause a performance problem. – Jim Mischel Sep 26 '17 at 13:16
  • I havent timed it.But i am more interested in the overall idea of "returning a list" (in general in backtracking porblems)which seems more intuitive to me but maybe less efficient than the other one . – code4fun Sep 27 '17 at 02:22
  • So your real question is what is the different between backtracking and recursive? Not that particular problem right? – Pham Trung Sep 27 '17 at 07:29

2 Answers2

4

I think it’s best to start any discussion of these implementations by talking about the memory usage involved. The number of permutations of an n-element sequence is n!, and since each one has length n, the amount of space required to simply store all the results is going to be O(n · n!), which is a pretty spectacular amount of memory. With n = 14, you’re in territory where you’re risking your result list being so large that Java arrays’ 32-bit indices aren’t big enough to hold your result.

I’m mentioning this because when you’re talking about efficiency at the level you’re doing here - where you have very similar approaches that differ only in whether you use an outparameter or whether you return values - it usually means that performance is absolutely critical or where you’ve identified this as a bottleneck somewhere. If that’s the case, it’s probably worth taking a step back to pause and consider whether the real bottleneck is the efficiency of this detail, or whether it’s the efficiency of generating and holding a monster list in memory.

For example, are you searching for a permutation with a certain property - say, an order in which to do some tasks that meets certain deadlines? Or, are you looking for a permutation minimizing some quantity? In either case, you don’t need to do a two-pass system of generating all possible permutations and then testing each one. In the first case, you can use recursive backtracking in a way where you store only a single permutation at a time, at each point seeing if it has the property you want and stopping as soon as you find one. In the second, you can generate permutations one at a time, only storing the best permutation you’ve found at each point. Both of these approaches dramatically drop the memory requirement, and I suspect that the reduced memory pressure would significantly improve performance.

Plus, if you really are doing this, chances are that your n is sufficiently small that booth approaches won’t take too long to complete, simply because there isn’t that much work to do. In that case, you might just want to say “it doesn’t matter” because your bottlenecks are going to be elsewhere. Maybe this is a bottleneck because you have something like n = 10, or because this is in a tight loop and the inefficiency really does become an issue, but again, in those cases a bigger-picture review of what you’re doing might be in order.

If you’ve committed to storing all the permutations in memory - and you’ve decided that you want to generate them recursively, rather than using an iterative algorithm like C++’s next_permutation, which uses less memory and is likely much faster - then there are still bigger things you can do to get efficiency wins than decide between whether to return the result or use an outparameter.

Notice, for example, in the first function you’ve written, that you’re prepending elements to a bunch of ArrayLists. ArrayLists aren’t optimized for that case - it’s much faster to append to the end of an ArrayList than the beginning - and that change alone probably would give you a bigger performance boost than switching from returning values to passing down an outparameter.

But let’s say that you are absolutely determined to get to the heart of the question and decide between using outparameters versus returning values, and that you’ve already determined that this is likely to be an important source of efficiency and that you absolutely must store all the results in memory.

This boils down to the number of allocations made and how fast the JVM does allocations and garbage collection.

In both approaches, you will be ending up with a list of n! permutations. The memory needed to store those lists is the same in both cases and you can’t avoid it.

In the case where you use an outparameter, you have one final list where results will be stored and one auxiliary list where you store a small, temporary partial list. The only allocations you’re doing are building lists that ultimately get added into the result. This is about as close to the minimal number of allocations as you’re going to get.

The function that returns a list of the results will generate a large number of smaller intermediary lists holding permutations of smaller lists. Those lists aren’t needed once the recursive calls above them finish using them, and so there are a number of allocations made above the baseline.

However, the JVM is usually very good at allocating and reclaiming objects that “die young,” and generational collectors are often optimized for this case. As a result, this might not actually be all that much of a problem - you’d have to profile things to find out.

To summarize:

  • Before you even start asking this question, check whether you even need to. Is it really necessary to hold a list of all permutations in memory? If you do, is the cost of doing so really a bottleneck in your program, or are there bigger fish to fry? And are there other optimization’s or improvements to these functions that you’d make before looking at this detail?

  • The outparameter version of the code does fewer unnecessary allocations, and therefore may be a bit faster than the second. However, JVMs are optimized for short object lifetimes, and so the second approach might not be all that much slower.

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • Thanks a lot for such a detailed response ! Actually this was asked to me in an interview and the interviewer was bent on getting towards the second solution.If we ignore the JVM optimimzations ,could you tell theoretically compare them.Is the memory still the same (even though we have some extra small lists everytime),what about time complexity (you can ignore the difference in checking if element already exists). – code4fun Sep 29 '17 at 18:52
  • @code4fun Ignoring JVM optimizations, the second solution makes fewer memory allocations than the first, so it's probably a better solution route to pick. I do think that the intuition behind the first solution is a bit easier to understand, though. – templatetypedef Sep 29 '17 at 18:54
  • Even I find the first one more intuitive.Is there an "order of magnitude" time difference between the two due to the memory allocations (I think you are talking about the extra time taken to copy the "result array from the called func" to the new array to be returned ,is that right? – code4fun Sep 30 '17 at 02:27
0

Both solutions are ineffective as they use recursion (which is a huge memory hog), I suggest converting your first solution to an iterable one which does not rely on stacks (also a memory problem but not going to cause stackoverflow), such as this one: https://stackoverflow.com/a/11471673/5991334

highstakes
  • 1,499
  • 9
  • 14
  • 1
    I doubt you could trigger a stack overflow with this code, since if that happens you’d already be creating such a massive result list that you’d blow out main memory. And the main memory hog here is the space for the resulting list, rather than the stack depth. However, using a non-recursive approach may be much faster here for other reasons. – templatetypedef Sep 28 '17 at 01:11