
I need to create a data structure based on a linked list and an array, with constant memory usage.

As input I get two numbers: N and M.

N represents the maximum capacity of a disk-on-key, and M represents the maximum capacity of a computer hard drive, so that M > N.

So I need to create a program that 'moves' information from the hard drive to the disk-on-key. The program needs to implement the following methods:

  1. Insert(data) - inserts the data into the disk-on-key; if it is full, the least important data(*) is removed first. Worst-case run time: O(1).
  2. delete(data) - deletes the given data from the disk-on-key. Worst-case run time: O(1).

(*) The user can change a file's importance.

The maximum amount of memory usage is O(M).

What I've done so far:

I've created an array [1...M] which will 'hold' the computer data, and a doubly linked list which will hold the disk-on-key data. (The idea is: each time data is added to the disk-on-key it is also added to the linked list, and I can go directly to the data using the array as index/key storage.)

My computer data fields:

node firstNode;
node currentNode; 
node[] dataCollection; // Will hold computer hard-drive data
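For reference, a minimal sketch of the `node` class these fields assume (the exact fields are my guesses, since the class itself isn't shown):

```java
// Sketch of a doubly linked node consistent with the fields above.
// The 'data' and 'importance' fields are illustrative assumptions.
class node {
    int data;        // the stored value / file id
    int importance;  // user-changeable priority
    node prev, next; // doubly linked list pointers
}
```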

So I wanted to create a method that replaces the least important data with the data I want to add (so I'll be able to use it in Insert). My code for replace:

public void replace(int leastImportantData, int insertData){
    node leastImportant = dataCollection[leastImportantData];
    node inserted = dataCollection[insertData];
    // Splice the inserted node into the least important node's place in the list.
    if (leastImportant.prev != null) leastImportant.prev.next = inserted;
    if (leastImportant.next != null) leastImportant.next.prev = inserted;
    inserted.prev = leastImportant.prev;
    inserted.next = leastImportant.next;
    numOfReplacements++;
}

So, my main problem is finding the least important data given these two 'combined' data structures while still keeping a run time of O(1), especially when the user decides to change a file's importance.

  • Say we start with {4,3,2,1} (numbers represent importance); the least important data would be 1. Suddenly, the user decides to change the last file's importance to 5; we get {4,3,2,5}, and the least important data is 2.

Any ideas?

assylias
StationaryTraveller
  • What is the range of "Importance"? That is, what is the highest and lowest Importance that can be assigned to a datum? – RBarryYoung Apr 06 '13 at 18:16
  • Why do you feel you need O(1), when O(log N) is almost certainly adequate and doing better than that is virtually impossible? – Hot Licks Apr 06 '13 at 18:16
  • These questions may not be necessary if he paid more attention to his design. Using a SkipList can help produce a very simple and clean solution for this problem. – Igwe Kalu Apr 06 '13 at 18:35
  • If you could make `insert` O(1) then you would be able to sort in O(N): First insert N items. Nothing needs to be removed yet. Then insert N items whose value is "infinity". Each time you add an "infinity" item, one of the original items will be removed - append that item to the output. Tada, you have done the impossible: Comparison-based sort faster than O(N log N). – Raymond Chen Apr 06 '13 at 18:43
  • @IGwe: but the requirement is not for a "simple and clean solution", it is for O(1) inserts and O(1) deletes, which skip lists do not ensure. – RBarryYoung Apr 06 '13 at 18:46
  • @RaymondChen Sorting in O(N) is only impossible in the most general case. There are *many* circumstances and constraints where Sorting in O(N) is not only possible, but the SOP solution. Since the sorting that we are talking about is on "Importance" which sounds a lot like priority, and Priority typically has a relatively small radix, sorting in O(N) may well be possible. (Of course we won't know for sure until the OP answers my question ...) – RBarryYoung Apr 06 '13 at 19:36
  • @RBarryYoung True, but we have no information about the nature of the key. If "importance" is a finite set, then a linked list per importance would work. – Raymond Chen Apr 06 '13 at 19:42
  • @RaymondChen Yes, exactly, a Priority Queue. – RBarryYoung Apr 06 '13 at 20:14
  • @RBarryYoung how's the range of 'importance' matter? – StationaryTraveller Apr 07 '13 at 07:40
  • @IGwe what do you mean by skip list? And how does it help to solve this problem? – StationaryTraveller Apr 07 '13 at 07:41
  • @Raymond Chen Importance is finite; there might be two files (or more) with the same importance, though the idea of a priority queue sounds good. The problem comes when there's a switch (or a new importance set by the user). – StationaryTraveller Apr 07 '13 at 07:43
  • @TheAlchemist priority queues do not have constant insert complexity – Osama Javed Apr 07 '13 at 08:55
  • @OsamaJaved I am the one who implied that Priority Queues have constant Insert/Delete. And you are correct, I was wrong to imply that, because, as with Sorting, in the general case it's O(Log N) for Inserts/Deletes. I always get the general concept of Priority Queues confused with the very specific implementation that almost every operating system uses for process scheduling, but whose name I can never remember. Virtually all OS's use a process scheduling data structure and algorithm that has Priority, O(1) Inserts and Deletes and dequeues the highest process for scheduling. – RBarryYoung Apr 07 '13 at 12:49
  • @TheAlchemist Answer the question and I'll be happy to show you: Because if the range/radix of Priority is restricted then it can definitely be done in O(1); otherwise it probably cannot. – RBarryYoung Apr 07 '13 at 12:50
  • @RBarryYoung OS schedulers use different structures based on scheduler types. I think they start off from FIFO Queues and then get more complicated with buckets for different levels of scheduling priorities .. I studied those a couple of years ago so do not remember the specifics :) – Osama Javed Apr 07 '13 at 14:01
  • @OsamaJaved Actually, they almost all use an array of lists. The low level details may vary but the high level structure and approach are virtually identical. It's been considered a solved problem for almost forty years. – RBarryYoung Apr 08 '13 at 00:54
  • @RBarryYoung Let me see if I got you: let's say the user can change the importance by +1 (or +c) each time. Does that make the problem solvable by fitting the file into the correct place in a priority queue each time the user changes the importance? – StationaryTraveller Apr 08 '13 at 17:37
  • @TheAlchemist Yes. The key to keeping this O(1) is that the total number of distinct values that Importance can take is small enough to fit into an integer as a bit offset/number (so 64 is probably the practical upper limit for this trick). – RBarryYoung Apr 08 '13 at 19:56
  • @RBarryYoung Sounds Interesting, Can you show me implementation example? – StationaryTraveller Apr 09 '13 at 10:55
  • @TheAlchemist Still waiting for you to answer my question. – RBarryYoung Apr 09 '13 at 23:34

2 Answers


To be able to figure out the least important data, you need an ordering on your list.

So, first thought: does the ordering of the linked list matter at all? You appear to be getting the data out based on the index and not by traversing the list. (If that order matters, the rest of my answer might not be very helpful. :D)

This means you can potentially insert items into the list so that they are sorted in order of least priority, which would allow you to obtain the item to delete with constant performance, as long as you have a reference to the head of the list.

This unfortunately causes the complexity of insert to go up. To fix that, you can keep a map from each priority to the last (and potentially the first) file in the linked list with that priority.

Using this map, you should be able to instantly figure out where a new file needs to be inserted, thereby getting constant performance.

So if you have 3 files with P(A) = 1, P(B) = 3, P(C) = 3, your map would look like 1->(A,A), 3->(B,C), saying that if you want to insert another file with priority 1 it should go after A, and if you want to insert a file with priority 2 it should go before B.

Obviously I am assuming a finite number of possible priorities here, and that there are no gaps between the used priorities (gaps would require searching).
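The idea can be sketched as an array of buckets, one per priority level. This is a minimal sketch, assuming priorities are small integers 0..P-1; the class and method names are illustrative, not from the question:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// One bucket (deque of file ids) per priority level, assuming priorities 0..P-1.
// When the range P is a small constant, scanning for the lowest non-empty
// bucket is effectively O(1).
class PriorityBuckets {
    private final Deque<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    PriorityBuckets(int priorityRange) {
        buckets = new Deque[priorityRange];
        for (int p = 0; p < priorityRange; p++) {
            buckets[p] = new ArrayDeque<>();
        }
    }

    // O(1): append to the bucket for this priority.
    void insert(int fileId, int priority) {
        buckets[priority].addLast(fileId);
    }

    // Removes and returns a file with the lowest priority.
    // O(P) scan, i.e. O(1) when the priority range is a constant.
    int removeLeastImportant() {
        for (Deque<Integer> bucket : buckets) {
            if (!bucket.isEmpty()) {
                return bucket.removeFirst();
            }
        }
        throw new IllegalStateException("no files stored");
    }

    // Changing importance = remove from the old bucket, append to the new one.
    // (The remove here is O(n); keeping a direct node handle per file, as with
    // the doubly linked list in the question, would make it O(1).)
    void changePriority(int fileId, int oldPriority, int newPriority) {
        buckets[oldPriority].remove(fileId);
        buckets[newPriority].addLast(fileId);
    }
}
```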

Hope this helps.

Osama Javed
  • Using the SkipList implementation as I suggested, there would be no need for a finite range of priority. – Igwe Kalu Apr 06 '13 at 18:25
  • @IGwe skip list does not have constant complexity – Osama Javed Apr 07 '13 at 03:06
  • I need to design the structure. Basically I can design a 'smart' input organizer, but then again, once the disk-on-key is full I need a way to point at the index with the least importance. – StationaryTraveller Apr 07 '13 at 07:51
  • @theAlchemist In the approach mentioned above, the head of the list will always have the lowest priority, and you can use the map to insert new items with their priorities. When you need to change the priority of a node, you will need to remove it from the list and reinsert it at the correct position. All these operations should have constant complexity. The only issue might be when priorities are e.g. 1, 10, 100, with large gaps between them. – Osama Javed Apr 07 '13 at 08:49
  • @TheAlchemist if you are unsure what i am proposing let me know i will elaborate with an example – Osama Javed Apr 07 '13 at 08:49
  • @TheAlchemist I think the 'smart' input organizer you are suggesting is the same thing as the map that I used ? – Osama Javed Apr 07 '13 at 08:50
  • @Osama, my aim is not to solve TheAlchemist's problem for him. But I do hope I can inspire him to find a very good solution. That said, TheAlchemist could then try to cook up a SkipList implementation that satisfies his constraint or go for a completely different approach. – Igwe Kalu Apr 07 '13 at 12:23
  • @IGwe I guess you could take inspiration from skip list as my answer does but any implementation of a standard skip list wont be usable in this context because of insertion complexities. – Osama Javed Apr 07 '13 at 14:03
  • @IGwe You are right, I need to use some kind of dynamic data structure. I just think the skip list isn't the best choice, but you might just have given me an idea :-) – StationaryTraveller Apr 08 '13 at 17:29

My suggestions for solving your problems are these:

  • Define the node class to implement the Comparable interface.

  • Implement the data collection as a skip list - you can do this using ConcurrentSkipListMap, for example.

Using a skip list implementation ensures that entries are ordered by their importance.

According to the Java API doc - This class (ConcurrentSkipListMap) implements a concurrent variant of SkipLists providing expected average log(n) time cost for the containsKey, get, put and remove operations and their variants.

In general, the implementation allow item lookup with efficiency comparable to balanced binary search trees (that is, with number of probes proportional to log n instead of n).
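For illustration, a minimal sketch using ConcurrentSkipListMap, keyed by importance. Note this assumes importances are unique; a real solution would have to map each importance to a list of files. The file names are made up:

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
    public static void main(String[] args) {
        // Importance -> file name; the skip list keeps the keys sorted.
        ConcurrentSkipListMap<Integer, String> diskOnKey = new ConcurrentSkipListMap<>();
        diskOnKey.put(4, "a.txt");
        diskOnKey.put(3, "b.txt");
        diskOnKey.put(2, "c.txt");
        diskOnKey.put(1, "d.txt");

        // firstEntry() yields the entry with the smallest key, i.e. the least
        // important file - but put/remove cost O(log n), not O(1).
        System.out.println(diskOnKey.firstEntry().getValue()); // d.txt

        // The user raises d.txt's importance to 5: remove and re-insert.
        diskOnKey.remove(1);
        diskOnKey.put(5, "d.txt");
        System.out.println(diskOnKey.firstKey()); // 2
    }
}
```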

Finally, look [here] for more on skip lists and how to write your own implementation.

Igwe Kalu
  • It looks like a great solution for O(log n) insert/delete. Using a doubly linked list and an array (or any kind of structure to save the indexes) is much more helpful for a standard insert/delete in O(1), since I can just peek at the given array index and either insert or delete the data. – StationaryTraveller Apr 07 '13 at 07:48