1

I have a really big vector that stores 100000 different values,ranging from 0 to 50000. They represent the cylinders on a hard disk,and I want to sort this vector according to three different algorithms used for disk scheduling. So far,I read those 100000 values from a file,store them into a vector and then sort them according to the desired algorithm(FCFS,SCAN,SSTF).The problem is,it takes too long,because I'm doing it in the least creative way possible:

public static Vector<Integer> sortSSTF(Vector<Integer> array){

  Vector<Integer> positions = new Vector<Integer>(array);

  Vector<Integer> return_array = new Vector<Integer>();

  int current_pos = 0,minimum,final_pos;   

  while(positions.size() > 0){
        minimum = 999999;
        final_pos = current_pos;

        for(int i=0 ; i < positions.size() ; i++){
            //do some math 
        }
    }
    return_array.add(final_pos);
    current_pos = final_pos;
    positions.removeElement(final_pos);
  }
return return_array;
}

My function takes a vector as a parameter,makes a copy of it,does some math to find the desired element from the copied array and store him in the other array,that should be ordered according to the selected algorithm.But in a array with N elements,it is taking N! iterations to complete,which is way too much,since the code should do that at least 10 times.

My question is, how can I make this sorting more efficient?

Grzegorz Piwowarek
  • 13,172
  • 8
  • 62
  • 93
Rodrigo Vasconcelos
  • 1,270
  • 2
  • 14
  • 26
  • 7
    1. Why are you using `Vector` instead of `ArrayList`? 2. Have you heard of `Collections.sort();`? – jlordo May 15 '13 at 18:26
  • 4
    "100000 different values,ranging from 0 to 50000" [they cannot all be different](http://en.wikipedia.org/wiki/Pigeonhole_principle): if you have 100K items and only 50K different values, at least one value will be repeated. – Sergey Kalinichenko May 15 '13 at 18:32
  • you shouldn't use vector if you don't work on c++ – 0x90 May 15 '13 at 18:32
  • 1
    http://stackoverflow.com/questions/1386275/why-is-java-vector-class-considered-obsolete-or-deprecated – 0x90 May 15 '13 at 18:36
  • 1
    The `//do some math` part may be crucial: the nested loop pins your algorithm at `O(N^2)`, so unless "some math" is optimized out to, say, `O(LogN)`, you are not going to see any improvement. – Sergey Kalinichenko May 15 '13 at 18:38
  • Does the "math to find the desired element" always return the same result for the same input? And is the purpose of the sort to return the array in the order of that result? – parsifal May 15 '13 at 18:38
  • The title of your question seems misleading. Do you want a better way to benchmark sorting algorithms over huge vectors or do you want a better sorting algorithm? I think you want the first. Please clarify. In addition, if the algorithms you want to benchmark are O(n!) then you have to critically reduce the size of the vector. There's nothing to do about that, if this is what you want. Please clarify the title and what you really want to do. Finally, your code has more opening braces than closing braces. Please correct that. – naitoon May 15 '13 at 19:27
  • If you call `Collections.sort(positions)` before you enter the `while` loop, you won't need the `for` loop. But if you correctly implement SSTF and provide `current_pos` (or, more correctly, `currentPos`) as an argument, you will need a `for` loop to find the initial indices of the next lowest and next highest cylinders in `positions`. See my answer for details. – rob May 15 '13 at 19:49
  • Looking at SSTF's definition, I think you need a PriorityQueue instead of an array or vector. A linear scan of the whole list for each value seems unlikely since it's too costly, specially for a real-time system like a disk. I can be wrong. – naitoon May 15 '13 at 20:00
  • 1
    @naitoon I may be missing something, but I think there is one underlying problem with using PriorityQueue. It assumes a stable sort, as though current_position never changes. But it changes every time you remove the next item and seek to that position. Since the shortest seek (i.e., cylinder closest to current_position) changes every time the head moves, you'd have to rebuild the PriorityQueue with a new (or modified) Comparator each time you remove the next item from the queue. – rob May 16 '13 at 20:38
  • 1
    @rob you're right, I missed that. Designing an algorithm for updating the queue in that case seems like a fun challenge: +1 for your comment. – naitoon May 16 '13 at 23:31

1 Answers1

3

Java already has built-in methods to sort a List very quickly; see Collections.sort.

Vector is old and incurs a performance penalty due to its synchronization overhead. Use a List implementation (for example, ArrayList) instead.

That said, based on the content of your question, it sounds like you're instead having difficulty implementing the Shortest Seek Time First algorithm.

See related question Shortest seek time first algorithm using Comparator.

I don't think you can implement the SSTF or SCAN algorithm if you don't also supply the current position of the head as an argument to your sorting method. Assuming the initial value of current_postion is always 0 will just give you a list sorted in ascending order, in which case your method would look like this:

public static List<Integer> sortSSTF(List<Integer> cylinders) {
    List<Integer> result = new ArrayList<Integer>(cylinders);
    Collections.sort(result);
    return result;
}

But that won't necessarily be a correct Shortest Seek Time First ordering if it's ever possible for current_pos > 0 when you first enter the method. Your algorithm will then probably look something like this:

  1. Collections.sort(positions);
  2. find the indices in positions that contain the nextLowest and nextHighest positions relative to current_pos (or currentPos, if following Java naming conventions)
  3. whichever position is closer, remove that position from positions and add it to return_array (If it was nextLowest, also decrement nextLowestIndex. If it was nextHighest, increment nextHighestIndex)
  4. repeat step 3 until positions is empty
  5. return return_array.

Of course, you'll also need to check for nextLowestIndex < 0 and nextHighestIndex >= positions.size() in step 3.

Note that you don't need the for loop inside of your while loop--but you would use that loop in step 2, before you enter the while loop.

Community
  • 1
  • 1
rob
  • 6,147
  • 2
  • 37
  • 56
  • 1
    Thanks for taking your time to help me.I wasn't aware of `Collections.sort()`,it really helped the performance a lot.I used vector because I originally wrote the code in c++,so it seemed right at the time.Using a `ListArray` and the sort() method,the code runs through in a few minutes,instead of a few hours! – Rodrigo Vasconcelos May 16 '13 at 20:11