0

I have a question about fundamentals in data structures.
I understand that array's access time is faster than a linked list. O(1)- array vs O(N) -linked list
But a linked list beats an array in removing an element since there is no shifting needing O(N)- array vs O(1) -linked list
So my understanding is that if the majority of operations on the data is delete then using a linked list is preferable.
But if the use case is:

  • delete elements but not too frequently
  • access ALL elements

Is there a clear winner? In a general case I understand that the downside of using the list is that I access each node which could be on a separate page while an array has better locality.
But is this a theoretical or an actual concern that I should have?
And is the mixed-type i.e. create a linked list from an array (using extra fields) good idea?
Also does my question depend on the language? I assume that shifting elements in array has the same cost in all languages (at least asymptotically)

Jim
  • 18,826
  • 34
  • 135
  • 254
  • "Is there a clear winner?" No. When in doubt, try different options and time them. It might be useful in the case of linked lists to use some custom allocator that keeps your elements close together – Niklas B. Mar 20 '14 at 18:27
  • @NiklasB.:`use some custom allocator that keeps your elements close together` I have never heard about this techique. How can I find info about an implemention of this? – Jim Mar 20 '14 at 18:30
  • For example http://warp.povusers.org/FSBAllocator/ – Niklas B. Mar 20 '14 at 18:33
  • @NiklasB.:But this assumes a language that you can actually allocate memory yourself.Is this approach language dependent?Is there something similar for languages where there is no memory management (interpretted languages) – Jim Mar 20 '14 at 18:42
  • You can always use indices into some array instead of pointers for a language where you have no pointers, so yes, it's universal. But without raw pointers this might introduce a slight overhead – Niklas B. Mar 20 '14 at 18:44
  • @NiklasB.:Are you talking then about a linked list implemented via an array?(last part of the OP)? – Jim Mar 20 '14 at 18:44
  • "mixed-type i.e. create a linked list from an array (using extra fields)" it is pretty unclear to me what you mean by that, so I can't tell – Niklas B. Mar 20 '14 at 18:53
  • @NiklasB.:I mean an array of objects that has as a next field the index of the next object in the same array. – Jim Mar 20 '14 at 18:56

2 Answers2

2

Singly-linked lists are very useful and can be better performance-wise relative to arrays if you are doing a lot of insertions/deletions, as opposed to pure referencing.

I haven't seen a good use for doubly-linked lists for decades. I suppose there are some.

In terms of performance, never make decisions without understanding relative performance of your particular situation. It's fairly common to see people asking about things that, comparatively speaking, are like getting a haircut to lose weight.

Before writing an app, I first ask if it should be compute-bound or IO-bound. If IO-bound I try to make sure it actually is, by avoiding inefficiencies in IO, and keeping the processing straightforward. If it should be compute-bound then I look at what its inner loop is likely to be, and try to make that swift.

Regardless, no matter how much I try, there will be (sometimes big) opportunities to make it go faster, and to find them I use this technique. Whatever you do, don't just try to think it out or go back to your class notes. Your problem is different from anyone else's, and so is the solution.

Community
  • 1
  • 1
Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
  • `a lot of insertions/deletions, as opposed to pure referencing.` I assume you mean that updating the list is a lot more frequent than accessing it? – Jim Mar 20 '14 at 19:06
  • @Jim: Well sometimes I don't know until I've written the program what fraction of time is spent modifying/growing the arrays. When I find out it is big [*as in this example*](http://stackoverflow.com/a/927773/23771), then I change the data structure. I guess I'm trying to say - be willing to change the data structure when the performance diagnostic says you should. – Mike Dunlavey Mar 20 '14 at 19:09
0

The problem with a list is not just the fragmentation, but mostly the data dependency. If you access every Nth element in array you don't have locality, but the accesses may still go to memory in parallel since you know the address. In a list it depends on the data being retrieved, and therefore traversing a list effectively serializes your memory accesses, causing it to be much slower in practice. This of course is orthogonal to asymptotic complexities, and would harm you regardless of the size.

Leeor
  • 19,260
  • 5
  • 56
  • 87
  • But the (2) in the use case refers to accessing *ALL* elements not the `Nth` element.To be clear doing a `for(i...size)` over the array or the list – Jim Mar 20 '14 at 18:40
  • @Jim, I didn't say N'th element, but *every* N'th element, as an example where an array could get non-sequential and still be superior. The same problem occurs of course when traversing the entire array/list – Leeor Mar 20 '14 at 19:14