0

I am trying to get this average running time working fast with processing a big text with searching for Term (say, you see string "Alice" for 5 times then Alice will be stored in pair like Pair("Alice", 5), if another string like "Bob" is seen again for 5 times then Pair("Alice", 5) will go before Pair("Bob",5) because bob is alphabetically smaller than alice and pair with greater occurrence will rank in later position).

So I got the implementation of quicksort all working but it was just too slow for Average time processing. Can someone help me find the reasoning behind this slow processing (by the way, Trend tr already has a bunch of pair data load up with a big .txt file ). I am just trying to quickly process the data and figure out the Nth rank pair in the arraylist.

//method for runtime testing for getNthPopular()
// Average time for getNthPopular();
Trends tr = new StudentTrends();
//...more code 
int n = tr.numEntries(); //number of pairs 
startTime = System.nanoTime();
for (int i = 0; i < n; i++)
    //get the Nth popular string term in the textfile 
    tr.getNthPopular(i);
    endTime = System.nanoTime();
    double test3 = (double)((endTime - startTime)/1000000.0)/(double)n;
    System.out.println("Test 3: " + test3 + " milliseconds /getNthPopular");
//................runtime testing method ends...................

//Here is my implementation of sorting in getNthPopular

@Override
public String getNthPopular(int n) { 
    //first quickSort for Occurrence (descending)
    quickSort(0, rankList.size()-1);
    return rankList.get(n).first;
}

/**
 * quickSorting all occurrence values of each pair in ArrayList rankList
 * @param start starting position of the same occurrence
 * @param end ending position of the same occurrence
 * */
private void quickSort(int start, int end) {
    if (start >= end) return; 
    int pivotIndex = partition(rankList,start, end);
    quickSort(start,pivotIndex-1);
    quickSort(pivotIndex+1, end); 
}

/**
 * partition occurrence (Integer) values based on pivot chosen in two half
 * @param pL arrayList of Google Trends pairs 
 * @param start starting position of the same occurrence
 * @param end ending position of the same occurrence
 * @return the pivotIndex value to recursively dividing into subgroups 
 * */
private int partition(
        ArrayList<Pair<String, Integer>> pL, int start, int end) {
    //By convention, pick the last item of the list 
    //as the pivot
    int occurrencePivot = pL.get(end).second; // occurrence

    int pivotIndex = start; 

    for (int i = start; i < end; i++) {
        //sorting by descending order by using ">=" for occurrence
        if (pL.get(i).second >= occurrencePivot) {
            //sorting by ascending order alphabetically 
            //for the same order in occurrence
            if (pL.get(i).second == occurrencePivot &&
                    pL.get(i).first.toLowerCase().compareTo
                    (pL.get(end).first.toLowerCase()) > 0) {
                pL = swap(pL, i, pivotIndex);
                break; 
            }
            //replacing ith value with 
            //pivotIndex value 
            //swap(pL(i), pL(pIndex)) ends here
            pL = swap(pL, i, pivotIndex);

            pivotIndex = pivotIndex + 1; 
        }
    }
    //swap(pL(pivotIndex), pL(end)) ends here
    pL = swap(pL, pivotIndex, end);

    return pivotIndex;
}

/**
 * swapping pairs in the arrayList of Google Trend pair
 * @param pL arrayList of Google Trends pairs 
 * @param index1 position index1 of first pair 
 * @param index2 position index2 of second pair 
 * @return the arraylist of Google Trend Pairs after swapped
 * */
private static ArrayList<Pair<String, Integer>> swap(
        ArrayList<Pair<String, Integer>> pL, 
        int index1, int index2){
    Pair<String, Integer> tempPair = pL.get(index1);

    pL.set(index1, pL.get(index2));
    pL.set(index2, tempPair); 
    return pL;
}
Nate Lee
  • 17
  • 7
  • Note: you could use the existing [`Collections.swap(pL, i, pivotIndex);`](http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#swap-java.util.List-int-int-). – Joop Eggen Oct 24 '16 at 21:30
  • I mean I am not allowed to use it in my program. This is a project that I working on for my data structure course. I wish I could :) – Nate Lee Oct 24 '16 at 21:33
  • Okay, then a bit of spice: a swap would also do `pL.set(index2, pL.set(index1, pL.get(index2)));` as set returns the old value. ;) – Joop Eggen Oct 24 '16 at 21:36
  • Why do you repeatedly call quicksort in 'getNthPopular`? Once sorted should suffice? – Joop Eggen Oct 24 '16 at 21:39
  • So @ Joop Eggen do you think swap is the main reasoning slowing down the sorting processing? I thought it also had something to do with my quicksort implementation but I recursion and loop are the only two ways that I could think of in terms of implementing the sorting. I gets sort of frustrating that it took forever to process something. :( – Nate Lee Oct 24 '16 at 21:41
  • Store `pl.get(i)` in a local variable, though when warmed up the JVM will probably catch this optimization too. Also `toLowerCase` is unfortunate; better to store only lowercased names. – Joop Eggen Oct 24 '16 at 21:41
  • Well, unfortunately I did implement pL.set(index2, pL.set(index1, pL.get(index2))); in my swap method but did not seem to speed up the process in any major way. Still nothing shows up. I wonder if it had something to do with my implementation in partition method. – Nate Lee Oct 24 '16 at 21:45
  • To the algorithm I will not say anything; that is your project to think about, but it is not something complicated. Once already sorted? swapping i with pivotIndex == i? – Joop Eggen Oct 24 '16 at 21:47
  • If you *really* want to understand the slow spots in your code, you need to profile it. [This link](http://stackoverflow.com/questions/948549/open-source-java-profilers) has information on a number of free java profilers. – StvnBrkdll Oct 24 '16 at 22:04
  • toLowerCase is to convert all string Term to lower case since Upper case letter in ASCII code is less than lower case, So I have to convert string to lowercase (in case of "Yak" vs "abc", "Yak" is less than "abc" resulted by compareTo method, but I want alphabetical order). – Nate Lee Oct 24 '16 at 22:07
  • The main problem here is this is not a Quicksort at all. The partition method is nothing like a conventional Quicksort,, – user207421 Oct 24 '16 at 22:21
  • @EJP Are you referring to this: the if statement logic under the partition method or you are referring to the entire partition methodology. If that were the case I would try using other implementation of quicksort. – Nate Lee Oct 24 '16 at 22:26
  • I am referrimg to the code inside the partition method. It is unlike any Quicksort implementation I have ever seen in nearly 40 years. – user207421 Oct 24 '16 at 22:36

0 Answers0