0

I don't have much experience with java and right now I'm struggling to get my code working. If anyone could help me to get my code working properly I would really appreciate that.

I'm basically using a text input file to find some keys, I made an array with these names:

Azevedo, Ana", "Silva, Rui", "Boussebough, Imane", "Terracina, Giorgio,", "Lefebvre, Peter", "Houghten, Sher", "Revesz, Peter

please note that Lefebvre, Peter is not in the text input file I'm using

here are my main problems with the code:

1) it gives strange results: Binary Search, Target: Lefebvre, Peter Index: 1030 Comparision count: 11

When there is no Lefebvre, Peter in the input file

2) Also interpolation search gives very high comparison counts, from 50 to 139

I don't think the following binarySearch(sortedArray, toFind); in public static int interpolationSearch(String[] sortedArray, String toFind) is right but cant find any other way around.

Ok, hope somebody can help me, here is my program:

 public class findstrings {

static List<String> keys = new ArrayList<String>();
static int binarySearchComparisionCount = 0;
static int interpolationSearchComparisionCount = 0;

public static void main (String args[]) {
    try {
        keys = readFile("ds17sasg2data.txt");
        String [] keysArr = keys.toArray(new String[keys.size()]);

        doQuickSort(keysArr, 0, keys.size() - 1);
        String arr[] = {"Azevedo, Ana", "Silva, Rui", "Boussebough, Imane", "Terracina, Giorgio,",  "Lefebvre, Peter", "Houghten, Sher", "Revesz, Peter"};

        for(String str: arr) {
            System.out.println("Binary Search, Target: " +str + " Index: " + binarySearch(keysArr, str) + " Comparision count: " + binarySearchComparisionCount);
            System.out.println("Interpolation Search, Target: " +str + " Index: " + interpolationSearch(keysArr, str) + " Comparision count: " + interpolationSearchComparisionCount);
            binarySearchComparisionCount = 0;
            interpolationSearchComparisionCount = 0;
        }


    } catch (Exception e) {
        e.printStackTrace();
    }
}

public static List<String> readFile(String filename)
        throws Exception
{
    String line = null;
    List<String> records = new ArrayList<String>();

    // wrap a BufferedReader around FileReader
    BufferedReader bufferedReader = new BufferedReader(new FileReader(filename));

    // use the readLine method of the BufferedReader to read one line at a time.
    // the readLine method returns null when there is nothing else to read.
    while ((line = bufferedReader.readLine()) != null)
    {
        records.add(line.trim());
    }

    // close the BufferedReader when we're done
    bufferedReader.close();
    return records;
}

private static int binarySearch(String[] sortedArray, String target) {
    return binarySearch(sortedArray, target, 0, sortedArray.length - 1);
}

private static int binarySearch(String[] sortedArray, String target, int start, int end) {
    if (start > end)
        return start;
    int mid = (start + end) / 2;
    int c = target.compareTo(sortedArray[mid]);
    ++binarySearchComparisionCount;
    return (c == 0) ? mid : (c < 0) ?
            binarySearch(sortedArray, target, start, mid - 1) :
            binarySearch(sortedArray, target, mid + 1, end);
}


static BigDecimal dist(String str1, String str2) {
    int maxlen = str1.length();
    if (str1.length() < str2.length()) maxlen = str2.length();
    BigDecimal d = BigDecimal.ZERO;
    for (int i=0; i<maxlen; i++) {
        int dist;
        if ( i < str1.length() && i < str2.length() ) {
            dist = str1.charAt(i) - str2.charAt(i);
        }
        else if ( i < str1.length() ) {
            dist = str1.charAt(i);
        }
        else {
            dist = -str2.charAt(i);
        }
        d = d.add(new BigDecimal(dist * Math.pow(2, - i*8)));
    }
    return d;
}


public static int interpolationSearch(String[] sortedArray, String toFind) {
    int low = 0;
    int high = sortedArray.length - 1;
    int mid;
    while (sortedArray[low].compareTo(toFind) <= 0 && sortedArray[high].compareTo(toFind) >= 0) {
        if (sortedArray[high].equals(sortedArray[low]))
            return (low + high) / 2;
        // out of range is possible here
        double value = new Double(dist(toFind, sortedArray[low]).doubleValue()) * (high - low);
        mid = (int)(low + (value) / new Double(dist(sortedArray[high], sortedArray[low]).doubleValue()).intValue());
        ++interpolationSearchComparisionCount;
        if (sortedArray[mid].compareTo(toFind) < 0)
            low = mid + 1;
        else if (sortedArray[mid].compareTo(toFind) > 0)
            high = mid - 1;
        else
            return mid;
    }
    if (sortedArray[low].equals(toFind))
        return low;
        // not found
    else
        return binarySearch(sortedArray, toFind);
}

public static void doQuickSort (String[] array) {
    doQuickSort(array, 0, array.length - 1);
}

public static void doQuickSort (String[] array, int lower, int higher){
    int i=lower;
    int j=higher;
    String pivot=array[lower+(higher-lower)/2];
    while (i<=j){
        while (array[i].compareToIgnoreCase(pivot)<0){
            i++;
        }
        while (array[j].compareToIgnoreCase(pivot)>0){
            j--;
        }
        if (i<=j){
            String t=array[i];
            array[i]=array[j];
            array[j]=t;
            i++;
            j--;
        }
    }
    if (lower<j)
        doQuickSort (array, lower, j);
    if (i<higher)
        doQuickSort (array, i, higher);
}

}

Cosa Ramirez
  • 13
  • 1
  • 8
  • This question seems to be more about fixing your code, which belongs over on Code Review https://codereview.stackexchange.com/ . Please read Stack Overflow's page about How to create a Minimal, Complete, and Verifiable example https://stackoverflow.com/help/mcve to learn how you can change your question to be allowed on this site. – ZomoXYZ Nov 10 '17 at 21:01
  • ok thanks, I will try that – Cosa Ramirez Nov 10 '17 at 21:13

1 Answers1

0

There seems to be some problem in your quickSort() program.

Checkout the standard Arrays.sort() method's output with quickSort() method's output. Once I use the standards sorting, binary search and interpolation search's indexes match.

As I do not know which keys are present in the input file, so I am using the same array as input (but not sorted).

UPDATE: I am not sure if the implementation of dist() method is consistent with data distribution but in general interpolation search's worst time complexity is O(n) which means you can have n comparison for n data points. This bad performance can be either due to non-uniform data distribution or incorrect or bad implementation of distance calculation between data points.

ISSUE: Binary search method had an issue where it wasn't returning -1 when the element was not present in the keysArr, I have fixed the methodbinarySearch(String[] sortedArray, String target, int start, int end)`. After the fix, following keys are not found and have index = -1

  • Terracina, Giorgio, (NOTE: there is an additional comma at end of search word)
  • Lefebvre, Peter
  • Houghten, Sher (NOTE: Houghten, Sheridan exists but not Houghten, Sher)

import java.io.BufferedReader;
import java.io.FileReader;
import java.math.BigDecimal;
import java.util.*;

public class findstrings {

    static List<String> keys = new ArrayList<String>();
    static int binarySearchComparisionCount = 0;
    static int interpolationSearchComparisionCount = 0;

    public static void main(String args[]) {
        try {
            keys = readFile("C:\\workspace\\OCA\\datadfb.txt");
            String[] keysArr = keys.toArray(new String[keys.size()]);

            doQuickSort(keysArr, 0, keys.size() - 1);

            String arr[] = { "Azevedo, Ana", "Silva, Rui", "Boussebough, Imane", "Terracina, Giorgio,",
                    "Lefebvre, Peter", "Houghten, Sher", "Revesz, Peter" };

            System.out.println();

            System.out.printf("Total Elements : %d\n\n", keysArr.length);

            int binaryComplexity = (int) (Math.log(keysArr.length) / Math.log(2));
            int interpolationComplexity = (int) (Math.log(Math.log(keysArr.length) / Math.log(2)) / Math.log(2));
            for (String str : arr) {
                System.out.printf(
                        "Binary Search,        Target: %-20s Index: %-6d Comparision count: %-5d Complexity logn      : %d\n",
                        str, binarySearch(keysArr, str), binarySearchComparisionCount, binaryComplexity);
                System.out.printf(
                        "Interpolation Search, Target: %-20s Index: %-6d Comparision count: %-5d Complexity log(logn) : %d\n",
                        str, interpolationSearch(keysArr, str), interpolationSearchComparisionCount,
                        interpolationComplexity);
                System.out.println();
                binarySearchComparisionCount = 0;
                interpolationSearchComparisionCount = 0;
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static List<String> readFile(String filename) throws Exception {
        String line = null;
        List<String> records = new ArrayList<String>();

        // wrap a BufferedReader around FileReader
        BufferedReader bufferedReader = new BufferedReader(new FileReader(filename));

        // use the readLine method of the BufferedReader to read one line at a time.
        // the readLine method returns null when there is nothing else to read.
        while ((line = bufferedReader.readLine()) != null) {
            records.add(line.trim());
        }

        // close the BufferedReader when we're done
        bufferedReader.close();
        return records;
    }

    private static int binarySearch(String[] sortedArray, String target) {
        return binarySearch(sortedArray, target, 0, sortedArray.length - 1);
    }

    private static int binarySearch(String[] sortedArray, String target, int start, int end) {
        if (start <= end) {
            int mid = -1;
            mid = (start + end) / 2;
            int c = target.compareTo(sortedArray[mid]);
            ++binarySearchComparisionCount;
            return (c == 0) ? mid
                    : (c < 0) ? binarySearch(sortedArray, target, start, mid - 1)
                            : binarySearch(sortedArray, target, mid + 1, end);
        } else {
            return -1;
        }
    }

    static BigDecimal dist(String str1, String str2) {
        int maxlen = str1.length();
        if (str1.length() < str2.length())
            maxlen = str2.length();
        BigDecimal d = BigDecimal.ZERO;
        for (int i = 0; i < maxlen; i++) {
            int dist;
            if (i < str1.length() && i < str2.length()) {
                dist = str1.charAt(i) - str2.charAt(i);
            } else if (i < str1.length()) {
                dist = str1.charAt(i);
            } else {
                dist = -str2.charAt(i);
            }
            d = d.add(new BigDecimal(dist * Math.pow(2, -i * 8)));
        }
        return d;
    }

    public static int interpolationSearch(String[] sortedArray, String toFind) {
        int low = 0;
        int high = sortedArray.length - 1;
        int mid;
        while (sortedArray[low].compareTo(toFind) <= 0 && sortedArray[high].compareTo(toFind) >= 0) {
            if (sortedArray[high].equals(sortedArray[low]))
                return (low + high) / 2;
            // out of range is possible here
            double value = new Double(dist(toFind, sortedArray[low]).doubleValue()) * (high - low);
            mid = (int) (low
                    + (value) / new Double(dist(sortedArray[high], sortedArray[low]).doubleValue()).intValue());
            ++interpolationSearchComparisionCount;
            if (sortedArray[mid].compareTo(toFind) < 0)
                low = mid + 1;
            else if (sortedArray[mid].compareTo(toFind) > 0)
                high = mid - 1;
            else
                return mid;
        }
        if (sortedArray[low].equals(toFind))
            return low;
        // not found
        else
            return binarySearch(sortedArray, toFind);
    }

    public static void doQuickSort(String[] array) {
        doQuickSort(array, 0, array.length - 1);
    }

    public static void doQuickSort(String[] array, int lower, int higher) {
        int i = lower;
        int j = higher;
        String pivot = array[lower + (higher - lower) / 2];
        while (i <= j) {
            while (array[i].compareToIgnoreCase(pivot) < 0) {
                i++;
            }
            while (array[j].compareToIgnoreCase(pivot) > 0) {
                j--;
            }
            if (i <= j) {
                String t = array[i];
                array[i] = array[j];
                array[j] = t;
                i++;
                j--;
            }
        }
        if (lower < j)
            doQuickSort(array, lower, j);
        if (i < higher)
            doQuickSort(array, i, higher);
    }
}

Sample Run:

Total Elements : 2095

Binary Search,        Target: Azevedo, Ana         Index: 142    Comparision count: 9     Complexity logn      : 11
Interpolation Search, Target: Azevedo, Ana         Index: 142    Comparision count: 119   Complexity log(logn) : 3

Binary Search,        Target: Silva, Rui           Index: 1742   Comparision count: 8     Complexity logn      : 11
Interpolation Search, Target: Silva, Rui           Index: 1742   Comparision count: 139   Complexity log(logn) : 3

Binary Search,        Target: Boussebough, Imane   Index: 249    Comparision count: 11    Complexity logn      : 11
Interpolation Search, Target: Boussebough, Imane   Index: 249    Comparision count: 114   Complexity log(logn) : 3

Binary Search,        Target: Terracina, Giorgio,  Index: -1     Comparision count: 11    Complexity logn      : 11
Interpolation Search, Target: Terracina, Giorgio,  Index: -1     Comparision count: 51    Complexity log(logn) : 3

Binary Search,        Target: Lefebvre, Peter      Index: -1     Comparision count: 11    Complexity logn      : 11
Interpolation Search, Target: Lefebvre, Peter      Index: -1     Comparision count: 62    Complexity log(logn) : 3

Binary Search,        Target: Houghten, Sher       Index: -1     Comparision count: 12    Complexity logn      : 11
Interpolation Search, Target: Houghten, Sher       Index: -1     Comparision count: 77    Complexity log(logn) : 3

Binary Search,        Target: Revesz, Peter        Index: 1554   Comparision count: 7     Complexity logn      : 11
Interpolation Search, Target: Revesz, Peter        Index: 1554   Comparision count: 18    Complexity log(logn) : 3
JRG
  • 4,037
  • 3
  • 23
  • 34
  • alright thanks a lot for your help!! I will try that to see if it works. – Cosa Ramirez Nov 10 '17 at 21:26
  • np, please accept the answer and vote up if it helps to resolve your problem! – JRG Nov 10 '17 at 23:53
  • I keep getting the same problems, interpolation search gives high comparison counts – Cosa Ramirez Nov 11 '17 at 18:37
  • Add your file content and results from console so I understand what you mean by high comparisons. Also as I mentioned did you fix your sort logic? If yes then print the sorted array before you start search to ensure your array is really sorted. Do note that Binary Search ONLY works for sorted array and Interpolation Search performs well or can be used ONLY when the data distribution is uniform. – JRG Nov 11 '17 at 18:40
  • ok, this is the output Im getting. I tried your suggestion but didnt work, I dont know what Im doing wrong, anyway here is the output get and after you can find my file content. – Cosa Ramirez Nov 11 '17 at 18:46
  • Binary Search, Target: Azevedo, Ana Index: 142 Comparision count: 9 Interpolation Search, Target: Azevedo, Ana Index: 142 Comparision count: 143 Binary Search, Target: Silva, Rui Index: 1742 Comparision count: 8 Interpolation Search, Target: Silva, Rui Index: 1742 Comparision count: 36 Binary Search, Target: Boussebough, Imane Index: 249 Comparision count: 11 Interpolation Search, Target: Boussebough, Imane Index: 249 Comparision count: 123 – Cosa Ramirez Nov 11 '17 at 18:48
  • Binary Search, Target: Terracina, Giorgio, Index: 1857 Comparision count: 11 Interpolation Search, Target: Terracina, Giorgio, Index: 1857 Comparision count: 115 Binary Search, Target: Lefebvre, Peter Index: 1030 Comparision count: 11 Interpolation Search, Target: Lefebvre, Peter Index: 1030 Comparision count: 91 Binary Search, Target: Houghten, Sher Index: 784 Comparision count: 12 Interpolation Search, Target: Houghten, Sher Index: 784 Comparision count: 103 – Cosa Ramirez Nov 11 '17 at 18:49
  • Binary Search, Target: Revesz, Peter Index: 1554 Comparision count: 7 Interpolation Search, Target: Revesz, Peter Index: 1554 Comparision count: 107 – Cosa Ramirez Nov 11 '17 at 18:49
  • is there a way to add a file on this website? I dont want to make the comment section very unorganized – Cosa Ramirez Nov 11 '17 at 18:50
  • You can add it to the main question post or you can upload the code and file to github and share me the link – JRG Nov 11 '17 at 19:36
  • ok thanks a lot for helping me in this learning experience, hope I can figure out what Im doing wrong. here is the link – Cosa Ramirez Nov 11 '17 at 20:03
  • I have updated the post with latest results and yes i can see high comparison. Interpolation Search worst time complexity is O(n) which could be due to un-uniform data distribution or bad implementation of distance calculation. I am not sure which is your case so can't recommend on reduction of comparison. – JRG Nov 11 '17 at 22:08
  • yeah I cannot figure it out either, but why Im getting the result:: Binary Search, Target: Lefebvre, Peter Index: 1030 Comparision count: 11 When there is no Lefebvre, Peter in the input file ? – Cosa Ramirez Nov 11 '17 at 22:24
  • found issue in binary search, fixed it and now all the elements which doesn't exists in main list have index=-1 for both the searches. Check the post with ISSUE keyword. – JRG Nov 11 '17 at 22:47
  • Check my binary search method above and you will understand the issue, hope this helps! – JRG Nov 11 '17 at 22:50
  • Do you know if there is a method to convert from UTF8 to ASCII after loading the data from the file? – Cosa Ramirez Nov 18 '17 at 02:15
  • Check if this helps https://stackoverflow.com/questions/285228/how-to-convert-utf-8-to-us-ascii-in-java – JRG Nov 18 '17 at 02:35
  • thanks, do you know if it can be implemented in the program with helped me with? – Cosa Ramirez Nov 18 '17 at 03:18
  • It can be implemented, I will recommend you post it as a separate question so viewers of your this question doesn’t get confused with the original question. – JRG Nov 18 '17 at 15:30
  • I posted in another question but it seems like nobody knows how to do it. do you have any idea? – Cosa Ramirez Nov 18 '17 at 21:39