What's the performance penalty that I can expect if I'm using Lists over Arrays to solve the Longest Increasing Subsequence problem?

Will the dynamic nature of Lists improve average performance because we're not dealing with sizes we won't actually use?

PS: Any tips on improving performance while still maintaining some readability?

    public static int Run(int[] nums)
    {
        var length = nums.Length;

        List<List<int>> candidates = new List<List<int>>();

        candidates.Add(new List<int> { nums[0] });

        for (int i = 1; i < length; i++)
        {
            var valueFromArray = nums[i];

            var potentialReplacements = candidates.Where(t => t[t.Count-1] > valueFromArray);

            foreach (var collection in potentialReplacements)
            {
                var collectionCount = collection.Count;

                if ((collectionCount > 1 && collection[collectionCount - 2] < valueFromArray) || collectionCount == 1)
                {
                    collection.RemoveAt(collectionCount - 1);
                    collection.Add(valueFromArray);
                }
            }

            if (!candidates.Any(t => t[t.Count - 1] >= valueFromArray))
            {
                var newList = new List<int>();

                foreach(var value in candidates[candidates.Count - 1])
                {
                    newList.Add(value);
                }

                newList.Add(valueFromArray);
                candidates.Add(newList);
            }
        }

        return candidates[candidates.Count - 1].Count;
    }
devadviser
  • It is trivial to write the algorithm both ways; do so, measure the performance of both, and then you will have empirical evidence of which is better. – Eric Lippert Oct 21 '17 at 22:13
  • On the other hand, why are you interested in such a small improvement when your solution is far from efficient? I recommend reading https://stackoverflow.com/questions/2631726/how-to-determine-the-longest-increasing-subsequence-using-dynamic-programming or https://stackoverflow.com/questions/6129682/longest-increasing-subsequenceonlogn – Raudel Ravelo Oct 24 '17 at 04:13
  • It's basically the O(N log N) solution, just without the binary search (see the sketch after these comments). Also, the question is not about improving this particular implementation. It's more about figuring out whether generic Lists are preferable to arrays for this type of problem (they eliminate problems with array boundaries, limit memory consumption, and offer built-in readability). – devadviser Oct 24 '17 at 10:39
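
For reference, here is a minimal sketch of the binary-search variant those comments refer to. This is an illustration added for context, not code from the question; LisLength is a made-up name, and it assumes using System.Collections.Generic:

    static int LisLength(int[] nums)
    {
        // tails[k] holds the smallest tail value of any strictly
        // increasing subsequence of length k + 1 seen so far.
        var tails = new List<int>();
        foreach (var x in nums)
        {
            // BinarySearch returns the bitwise complement of the
            // insertion point when x is not present in the list.
            int i = tails.BinarySearch(x);
            if (i < 0) i = ~i;
            if (i == tails.Count)
                tails.Add(x);   // x extends the longest subsequence found
            else
                tails[i] = x;   // x lowers the tail of a shorter one
        }
        return tails.Count;
    }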

1 Answer

Depending on the solution, the results may vary. Arrays are faster than lists of the same size. How much faster? Let's take a look at the C# program below. It is a simple O(n^2) solution: I coded one version with arrays only and another with lists only, ran each 100 times on arrays of 1000 random values, and recorded the timings for both. Then I print how the array version's running time compares, on average, with the list version's. I'm getting over a 50% improvement on my computer. Notice that this solution always uses arrays and lists of the same sizes. That means I never created an array bigger than the size the lists grow to in the list version. Once you start creating arrays with a maximum size that may never be filled, the comparison stops being fair. C# code below:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace hashExample
{
    class Program
    {
        // O(n^2) LIS over a plain array: dp[i] is the length of the
        // longest increasing subsequence that ends at index i.
        static int RunArray(int[] array)
        {
            int[] dp = new int[array.Length];
            dp[0] = 1;
            for (int i = 1; i < array.Length; i++)
            {
                dp[i] = 1;
                for (int j = 0; j < i; j++)
                    if (array[i] > array[j] && dp[i] < dp[j] + 1)
                            dp[i] = dp[j] + 1;
            }
            return dp.Max();
        }

        // The same O(n^2) algorithm, backed by List<int> instead of int[].
        static int RunList(List<int> list)
        {
            // Pre-sizing the list keeps the comparison fair: no mid-run
            // capacity growth that the array version would not pay for.
            List<int> dp = new List<int>(list.Count);
            dp.Add(1);
            for (int i = 1; i < list.Count; i++)
            {
                dp.Add(1);
                for (int j = 0; j < i; j++)
                    if (list[i] > list[j] && dp[i] < dp[j] + 1)
                        dp[i] = dp[j] + 1;
            }
            return dp.Max();
        }

        static void Main(string[] args)
        {
            int arrayLen = 1000;
            Random r = new Random();
            List<double> values = new List<double>();
            Stopwatch clock = new Stopwatch();
            Console.WriteLine("Running...");
            for (int i = 0; i < 100; i++)
            {
                List<int> list = new List<int>();
                int[] array = new int[arrayLen];
                for (int j = 0; j < arrayLen;j++)
                {
                    int e = r.Next();
                    array[j] = e;
                    list.Add(e);
                }

                // Time the array version...
                clock.Restart();
                RunArray(array);
                clock.Stop();
                double timeArray = clock.ElapsedMilliseconds;

                // ...then the list version on the same data.
                clock.Restart();
                RunList(list);
                clock.Stop();
                double timeList = clock.ElapsedMilliseconds;

                // Record the per-run ratio of array time to list time.
                values.Add(timeArray / timeList);
            }
            Console.WriteLine("Arrays are " + Math.Round(values.Average()*100,1) + "% faster");
            Console.WriteLine("Done");
        }
    }
}
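
One detail worth spelling out about why the comparison above is fair: a List<int> that grows one Add at a time reallocates and copies its backing array as the capacity roughly doubles, and that is exactly the overhead new List<int>(list.Count) in RunList avoids. A small illustration of the difference (added here for context; the variable names are mine):

    // Grows from the default capacity: the backing array is
    // reallocated and copied several times along the way.
    var grown = new List<int>();
    for (int i = 0; i < 1000; i++)
        grown.Add(i);

    // Pre-sized: one allocation up front, no reallocation in the loop.
    var sized = new List<int>(1000);
    for (int i = 0; i < 1000; i++)
        sized.Add(i);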
Raudel Ravelo
  • Thanks for your response. That kind of performance improvement is counterbalanced by the readability (and ease of use) of Lists. However, my question regarding 'using array sizes you're never going to use' is really about the varying size and number of the 'candidate' array used in that particular implementation. – devadviser Oct 23 '17 at 13:17
  • For your particular implementation, I would say that even though the candidate arrays would each be of size num.Length, you can create them all at the beginning and then keep track of the actual size of each, so you don't waste time scanning entries that are still zero (see the sketch after these comments). But as @EricLippert commented, change it and try it out. – Raudel Ravelo Oct 24 '17 at 04:09
  • Sorry, but you're off the mark. That's exactly what I was getting at: we're allocating memory that we're never really going to use, because we have to create arrays big enough for the worst-case path. – devadviser Oct 24 '17 at 10:41
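
For completeness, here is a sketch of the pattern that exchange describes, applied to the candidates structure from the question: pre-allocate worst-case-sized rows once and track each row's logical length by hand. This is only the bookkeeping, not the full algorithm, and the names rowLength and rowCount are mine:

    int n = nums.Length;
    int[][] candidates = new int[n][];  // worst case: up to n candidate rows
    int[] rowLength = new int[n];       // logical length of each row
    int rowCount = 0;

    // "Adding" a candidate allocates a worst-case-sized row once...
    candidates[rowCount] = new int[n];
    // ...and appending just bumps the logical length, which is
    // essentially the bookkeeping List<T> performs internally.
    candidates[rowCount][rowLength[rowCount]++] = nums[0];
    rowCount++;

This also makes the memory trade-off devadviser points out concrete: every row is allocated at size n even if only a few of its slots are ever used.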