How can we optimize insertion on ArrayList?

Question

In fact this is a interview question asked a few days ago.

The interviewer wants me to express the difference between ArrayList and LinkedList, and asked to optimize the insertion operation on ArrayList, in other words, to re-implement add(int index, E element) and of course the complexity of get(int index) operation can be sacrificed.

My answer was to separate the array into k sub-arrays and update a counting array representing the number of elements already in the corresponding sub-array. And the memory of every sub-array is allocated dynamically with an expected initial size. When I need to insert a data into the ArrayList, I can locate a sub-array first, and do the operation within a small array. And if insertions are not too frequent or the indexes are uniform distributed, the time complexity of inserting can be O(log(k) + n/k + k) in average, where log(k) means we should locate the sub-array first with binary searching on the counting array's sum array, n/k is for data movement or even memory re-allocation, and k stands for the updating of the sum array.

I'm sure there are better solutions. I do need some suggestions, thanks!

score 1 · Answer 1 · edited Dec 12 '17 at 17:43

1

One of the solutions could be:

add(int index, E element) always add element to the end of array (you have to also store the index where this element should be added) - complexity O(1)
get(int index) has to restore correct order of array (if some elements were added after the last invocation) - knowing the positions in which each element should be, you can restore correct order in O(n)

edited Dec 12 '17 at 17:43

DwB

37,124
11
56
82

answered Jan 23 '17 at 14:01

arvaladze

64
4

The `get(index)` is more likely to be O(n log n) because it will involve a sort. – DwB Dec 12 '17 at 17:51

score 0 · Answer 2 · answered Nov 27 '14 at 02:54

You can implement it in a balanced binary tree, so that both add() and get() cost O(logn)

An example implementation will look like (hand-crafted here, will not compile, corner cases not covered):

class Node {
  int subTreeSize;
  Node left,right;
  Element e;
  // all i 0-indexed
  Node get(int i) {
    if (i >= subTreeSize) {
      return null;
    }
    if (left != null) {
      if(left.subTreeSize > i) {
        return left.get(i);
      } else {
        i -= left.subTreeSize;
      }
    }
    if (i == 0) {
      return this;
    }
    return right.get(i-1);
  }

  // add e to the last of the subtree
  void append(Element e) {
    if(right == null){
      right = new Node(e);
    } else {
      right.append(e);
      right = right.balance();
    }
    subTreeSize += 1;
  }

  // add e to i-th position
  void add(int i, Element e) {
    if (left != null) {
      if(left.subTreeSize > i) {
        add(i,left);
        left=left.balance();
      } else {
        i -= left.subTreeSize;
      }
    }
    if (i == 0) {
      if (left == null){
        left = new Node(e);
      } else {
        left.append(e);
        left = left.balance();
      }
    } else {
      if (right  == null) {
        // also in this case i == 1
        right = new Node(e);
      } else {
        right.add(i-1, e);
        right = right.balance();
      }
    }
    subTreeSize += 1;
  }
  // the common balance operation used in balance tree like AVL or RB
  // usually just left or right rotation
  Node balance() {
    ...
  }
}

public class Tree {
  Node root;
  public Element get(int i) {
    return root.get(i).e;
  }
  public void add(int i, Element e) {
    if (root == null) {
      root = new Node(e);
    } else {
      root.add(i,e);
      root = root.balance();
    }
  }
}

Bernhard Barker · Answer 3 · 2017-12-12T17:43:49.263

A variant of an order statistic tree would allow you to add and get by index in O(log n).

The basic idea is as follows:

Have each node store the size of the subtree rooted at that node.
The index of a node will correspond to its position in the in-order traversal of the tree.

This means that the ordering of the nodes is determined based on where in the tree they appear - this is not the way a binary search tree typically works, where the nodes' elements have some ordering that's not dependent on where in the tree it appears (e.g. f is greater than a in a regular BST ordered lexicographically, but in our case f may be smaller or greater than a, since it's ordered based on the index of f and a).

To add or get, we start at the root and recursively go through the tree, determining whether our insert or lookup position is to the left or right based on the target index and the subtree sizes.

More specifically, we have the following recursive definitions:
(with some added complexity for null nodes and actually inserting the node)

node.add(index, element):
   if index <= left.subtreeSize
      left.add(index, element)
   else
      // anything to the right is after left subtree and current node, so those must be excluded
      right.add(index - left.subtreeSize - 1, element)

node.get(index, element):
   if index == left.subtreeSize
      return node
   if index < left.subtreeSize
      return left.get(index)
   else
      return right.get(index - left.subtreeSize - 1)

To understand this better, the following example tree might be helpful:

Values:                   Indices (in-order pos):   Subtree sizes:
    a                         5                         8
   / \                       / \                       / \
  b   g                     1   6                     5   2
 / \   \                   / \   \                   / \   \
f   c   h                 0   3   7                 1   3   1
   / \                       / \                       / \
  e   d                     2   4                     1   1

If we want to insert a new node at position 5, for example, it will be inserted to the right of d.

Below is a small test program to demonstrate this (creating the tree shown above).

Note that balancing will still need to be done to achieve O(log n) running time per operation.

class Test
{
   static class Node<T>
   {
      Node<T> left, right;
      T data;
      int subtreeCount;
      Node(T data) { this.data = data; subtreeCount = 1; }

      public String toString(int spaces, char leftRight)
      {
         return String.format("%" + spaces + "s%c: %s\n", "", leftRight, data.toString())
                 + (left != null ? left.toString(spaces+3, 'L') : "")
                 + (right != null ? right.toString(spaces+3, 'R') : "");
      }

      int subtreeSize(Node<T> node)
      {
         if (node == null)
            return 0;
         return node.subtreeCount;
      }

      // combined add and get into 1 function for simplicity
      // if data is null, it is an get, otherwise it's an add
      private T addGet(int index, T data)
      {
         if (data != null)
            subtreeCount++;
         if (index == subtreeSize(left) && data == null)
            return this.data;
         if (index <= subtreeSize(left))
         {
            if (left == null && data != null)
               return (left = new Node<>(data)).data;
            else
               return left.addGet(index, data);
         }
         else if (right == null && data != null)
            return (right = new Node<>(data)).data;
         else
            return right.addGet(index-subtreeSize(left)-1, data);
      }
   }

   static class TreeArray<T>
   {
      private Node<T> root;
      public int size() { return (root == null ? 0 : root.subtreeCount); }

      void add(int index, T data)
      {
         if (index < 0 || index > size())
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size());
         if (root == null)
            root = new Node<>(data);
         else
            root.addGet(index, data);
      }

      T get(int index)
      {
         if (index < 0 || index >= size())
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size());
         return root.addGet(index, null);
      }

      @Override
      public String toString() { return root == null ? "Empty" : root.toString(1, 'X'); }
   }

   public static void main(String[] args)
   {
      TreeArray<String> tree = new TreeArray<>();
      tree.add(0, "a");
      tree.add(0, "b");
      tree.add(1, "c");
      tree.add(2, "d");
      tree.add(1, "e");
      tree.add(0, "f");
      tree.add(6, "g");
      tree.add(7, "h");
      System.out.println("Tree view:");
      System.out.print(tree);
      System.out.println("Elements in order:");
      for (int i = 0; i < tree.size(); i++)
         System.out.println(i + ": " + tree.get(i));
   }
}

This outputs:

Tree view:
 X: a
    L: b
       L: f
       R: c
          L: e
          R: d
    R: g
       R: h
Elements in order:
0: f
1: b
2: e
3: c
4: d
5: a
6: g
7: h

Live demo.

score -1 · Answer 4 · answered Apr 06 '13 at 09:21

LinkedList is a linked-list with access\insert\remove requires O(n), linked-lists support sequential access O(n).

ArrayList is an array with insert\remove requires O(2n), but access requires O(1), arrays support random access O(1).

to find a more optimal hybrid structure, you can start with this:

template <T>
public class LinkedArrayList
{
    LinkedList<ArrayList<T>> list;

    public LinkedArrayList ()
    {
        list = new LinkedList<ArrayList<T>> ();
    }

    // ..
}

You'll have to balance segments (arrays) in the list between access complexity, and insert\remove complexity

How can we optimize insertion on ArrayList?

4 Answers4

Linked