2

I have to store the sorted data in a data structure. The data structure I want to use is heap or binary search tree. But I am confused which one would better serve the requirement i.e. fast and efficient searching.

----MORE DETAILS---

I am designing an application that receive data from a source(say a data grid) and then store it into a data structure. The data that comes from data GRID station is in the form of sorted digits. The sorted data can be in ascending or descending order.

now I have to search the data. and the process should be efficient and fast.

Mohamed Salah
  • 959
  • 10
  • 40
user3297557
  • 119
  • 2
  • 3
  • 11
  • 1
    Might help: http://stackoverflow.com/questions/6147242/heap-vs-binary-search-tree – Alex Johnson Feb 11 '14 at 14:19
  • i already checked that.. It is about the storing of data in sorted form. my requirement is efficient search. which one would be better when it comes to search some specific data in datastructure. – user3297557 Feb 11 '14 at 14:21
  • both are good options, use whichever is easy to implement. if you are implementing BST then look for AVL tree also (BST is easy to implement and use than heap according to me) – Nachiket Kate Feb 11 '14 at 14:21
  • @user3297557 Raman is correct below. For searching, I would use BST because they have better efficiency for arbitrary elements. Although if you need a max or min heap is better – Alex Johnson Feb 11 '14 at 14:24
  • A heap is not a good data structure to use if you want to search for arbitrary items. – Jim Mischel Feb 11 '14 at 14:25
  • 1
    There are many data structures, if you can spare memory you can also use Trie data structure. Can you just post your exact requirement? Based on your requirement we can choose the data structures – Karthik Surianarayanan Feb 11 '14 at 14:32
  • @KarthikSurianarayanan Exactly. Apart from searching there is also insertion, removing, etc. You need to consider them all – Michał Feb 11 '14 at 14:33
  • The thing is i want to know whether the data is string (or) something??? – Karthik Surianarayanan Feb 11 '14 at 14:38
  • If the data is coming in sorted and you don't need to update (i.e. add or remove) items, then you can simply store the data in an array and binary search it. That would be fast and efficient (in terms of memory use). – Jim Mischel Feb 12 '14 at 19:09

4 Answers4

7

A heap will only let you search quickly for the minimum element (find it in O(1) time, remove it in O(log n) time). If you design it the other way, it will let you find the maximum, but you don't get both. To search for arbitrary elements quickly (in O(log n) time), you'll want the binary search tree.

Raman Shah
  • 467
  • 2
  • 12
  • anyways data is sorted so that he can traverse through the list i.e. depends on his implementation – Nachiket Kate Feb 11 '14 at 14:24
  • 1
    Good answer. I'd supplement it though by saying you may want to look at a balanced binary tree, depending on what you know about the data. That is, if your data is in order, rebalancing will maintain the efficiency of the tree. – wmorrison365 Feb 11 '14 at 15:02
  • For more info see Skienna at http://www.algorist.com/ or search on "skiena algorithm design manual pdf" for downloadable first edition. – wmorrison365 Feb 11 '14 at 15:03
3

For efficient searching, one would definitely prefer a binary search tree.

To search for a value in a heap may require that you search the entire tree - you can't guarantee that some value may not appear on either the left or right subtree (unless one of the children is already greater than the target value, but this isn't guaranteed to happen).

So searching in a heap takes O(n), where-as it takes O(log n) in a (self-balancing) binary search tree.

A heap is only really preferred if you're primarily interested in finding and/or removing the minimum / maximum, along with insertions.

Either can be constructed in O(n) if you're given already-sorted data.


You mentioned a sorted data structure, but in the "more details" in your question I don't really see that a sorted data structure is required (it doesn't matter too much that that's the form in which your data is already in), but it really depends on exactly what type of queries you will do.

If you're only going to search for exact values, you don't really need a sorted data structure, and can use a hash table instead, which supports expected O(1) lookups.

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
2

Let me make a list of potential data structures and we'll elaborate:

  • Binary search tree - it contains sorted data so adding new elements is costly (O(log n) I think). When you search through it you can use the binary search which is O(log n). IT is memory efficient and it doesn't need much additional memory.
  • Hash table (http://en.wikipedia.org/wiki/Hash_table) - every element is stored with a Hash. You can get element by providing the hash. Your elements don't need to be sortable, they only need to provide hashing method. Accessing elements is O(1) which I suppose is pretty decent one :)

I myself usually use hashtables but it depends on what exactly you need to store and how often you add or delete elements.

Check this also: Advantages of Binary Search Trees over Hash Tables

So in my opinion out of heap and binary search list, use Hash table.

Community
  • 1
  • 1
Michał
  • 2,202
  • 2
  • 17
  • 33
  • 1
    A BST requires O(n) memory, so does most other structures, including a hash table - so I'm not sure what you mean by "doesn't need additional memory". Lookup in a hash table takes O(1) (not O(n)), but a hash table isn't a sorted data structure, so presumably doesn't conform to OP's requirements. – Bernhard Barker Feb 11 '14 at 14:34
  • @Dukeling You're right, don't know how I was writing this :) – Michał Feb 11 '14 at 14:38
1

I would go with hash table with separate chaining with an AVLTree (I assume collision occurs). It will work better than O(logn) where n is number of items. After getting the index with hash function, m items will be in this index where m is less than or equal to n. (It is usually much smaller, but never more). O(1) for hashing and O(logm) for searching in AVLTree. This is faster than binary search for sorted data.

Less-Owl-4025
  • 33
  • 1
  • 8