3

Can anyone advice on what to choose in this situation: I have 100-500(they dynamic, means on every request their number always diffrent) elements which contains element name, type, id. Currently I using multidimensional array

public static Object[,] Item_data = new Object[500, 3];

And then I set data to array:

int found_items = 0;
        foreach (Object m in queryCollection)
        {
            Item_data[found_items, 0] = m[0];

            Item_data[found_items, 1] = m[1];

            Item_data[found_items, 2] = m[0];
            found_plans++;
        }

And I have 8 other same structure arrays which fills different data, it cost around 0.8-1.5 seconds, problem is I need to sort these arrays ASC, DESC by id, by name and by type, if i do manually using another loop to sort data it cost time, so I noticed about List(T) it has sort feature, but its much slower according to these topics:

Performance of Arrays vs. Lists

https://jacksondunstan.com/articles/3058

https://softwareengineering.stackexchange.com/questions/221892/should-i-use-a-list-or-an-array

Is it worth to use List(T) in this situation? Or can anyone recommend something else?

JonZ
  • 139
  • 2
  • 17
  • 3
    Have you benchmarked it? Also relevant read: https://ericlippert.com/2012/12/17/performance-rant/ – UnholySheep Jul 11 '18 at 13:17
  • 1
    If or if not a solution fits for you highly depends on your data and your surrounding. Thus relying on other topics that handle *general* performance-aspects isn´t neccessarily meaningul in your *special* case. At least the third link shows you that it really depends on your use-case which data-structure to use. Apart from this it´s usually not that important to even think about this, as the *actual* performance-bottlenecks are often somewhere else. – MakePeaceGreatAgain Jul 11 '18 at 13:21
  • 1
    For starters, using `Object` for data that probably has a primitive type (like `int`) makes for really inefficient access. And multidimensional arrays make for inconvenient access. Start off with what's obvious in the language first (that is, classes with properties, that you sort using the built-in `.OrderBy`). Only when that's not fast enough should you start thinking about fancy stuff, like, say, sorting the data as it comes in using an online sorting algorithm or a B-tree, instead of afterwards. Also, never forget that databases are a thing, and a lot of clever people have optimized them. – Jeroen Mostert Jul 11 '18 at 13:25
  • @JeroenMostert I use object array because its stores type like RegistryKind,Windows Security and etc, so object was best solution. And I dont have access to sort databases since its very limited, so I have sort on software instead of database part. – JonZ Jul 11 '18 at 13:40
  • What sort code are you using for your blazing fast array? – MineR Jul 11 '18 at 13:41
  • @MineR I don't, Thats why I'm here trying to read community advices on how to do it better – JonZ Jul 11 '18 at 13:51
  • Write literally any code to sort it, and then come back. Sorting 500 items will take like 50ms. – MineR Jul 11 '18 at 14:09
  • If your question is "how do I sort a mulitdimensional array?", then ask that. – MineR Jul 11 '18 at 14:12
  • My point was to analize how todo faster and better than my current code. – JonZ Jul 11 '18 at 14:16
  • To do faster than what? You haven't sorted the list yet. Populating an multi dimensional array with 500x50 elements takes about no time. Read the first link in the comments. You're thinking there's a bottleneck where none exists. It's probably slowish with populating because of the data you're extracting, not the data structure you're using. – MineR Jul 11 '18 at 14:39

2 Answers2

0

If you need fast access to the structure, have you tried using Dictionary<> ? You can sort it using System.Linq

https://msdn.microsoft.com/en-us/library/xfhwa508%28v=vs.110%29.aspx

Bukk94
  • 419
  • 1
  • 6
  • 23
  • Does it do same speed with inserting new values, removing values at specific index id and using type string,int,double, etc.? – JonZ Jul 11 '18 at 13:41
  • They have both the same complexity when adding new items. But Dictionary has much better performance when removing or finding an item (because they are using the key). If you have unique IDs, you can use them as a key. Then finding or removing a specific item will be much faster. – Bukk94 Jul 11 '18 at 13:49
  • what diffrence between Dictionary and List in speed?, seems that dictionary preparation is much simplier. – JonZ Jul 11 '18 at 14:04
  • I don't have actual benchmarks but List is mostly O(n) whereas Dictionary has mostly O(1) complexities (we are talking about seconds vs milliseconds). – Bukk94 Jul 11 '18 at 14:10
  • Ok I think I got all I needed. Thanks. Topic marked as answer. I could give rep+, but reputation too low. – JonZ Jul 11 '18 at 14:14
0

List<T> internally implements T[], just that its flexible and internally keeps expanding when Capacity is over. Since for the all the important operations, it internally does the array operations, just that it provides the convenience for the dynamic expansion.

As per your question:

I have 100-500(they dynamic, means on every request their number always different)

Now ideally a new List<T>(100 / 200) is a better and most optimum choice, since it would dynamically expand for including more data, though please note there's no concept of Multi dimensional List, like Multi dimensional Array. Though you can have something similar to Jagged array T[][], using something like List<List<T>>, but there's no replacement of T[,] using List<T>. Multidimensional array works well for a matrix form of data with defined lower and upper bounds.

This is related to memory optimization, when it comes to various operations, List<T> exposes Array operations (In place sorting), List<T> and T[] also exposes the Linq extension API for IEnumerable<T>, which are not as efficient as in place sorting, due to extra memory allocation and they are done on generic interface instead of specific data structure.

Now regarding various use cases:

  • If it's all about enumeration / data processing sequentially, then both T[] and List<T> are efficient, in fact they provide a binary search option, which makes search for sorted data as O(LogN), much better than O(N), in fact you may consider SortedList<TK,TV>, which is by default sorted, but just that its IDictionary internally, similar is the SortedDictionary<TK,TV>, for very fast element search there's no replacement of IDictionary<TK,TV>, it is O(1).

All the above points are mere theory best way to find the right fit with a combination of memory and performance is to test your use cases with variety of data structures and you will be able to get the right fit for all use cases with minimal compromise

Mrinal Kamboj
  • 11,300
  • 5
  • 40
  • 74
  • Microsoft shows that list can create object, this way I could throw multidimensional arrays to thrash bin and use List https://msdn.microsoft.com/en-us/library/6sh2ey19(v=vs.110).aspx – JonZ Jul 11 '18 at 13:57
  • As I mentioned you need to have a non multi dimensional array use case for `List` to work, even more important is trying various use cases vis a vis popular data structures, like `Dictionary` – Mrinal Kamboj Jul 12 '18 at 11:16