
Here is my input file

I am reading in a .txt file and I keep getting a StringIndexOutOfBoundsException. I am trying to skip duplicate words and keep the array sorted as I add words to it. I thought the problem was that I try to sort and search the array when it has no words, or only one word, in it. The line marked with ** is the problem line. It is line 129.

import java.io.*;
import java.util.Scanner;
import java.util.regex.*;

public class BuildDict 
{

    static String dict[] = new String[20];
    static int index = 0;

    public static void main(String args [])
    {
        readIn();
        print();
    }

    public static void readIn()
    {
        File inFile = new File("carol.txt");
        try
        {
            Scanner scan = new Scanner(inFile);

            while(scan.hasNext())
            {
                String word = scan.next();
                if(!Character.isUpperCase(word.charAt(0)))
                {
                    checkRegex(word);   
                }
            }

            scan.close();
        }
        catch(IOException e) 
        {
            System.out.println("Error");
        }
    }

    public static void addToDict(String word)
    {
            if(index == dict.length)
            {
                String newAr[] = new String[dict.length*2];
                for(int i = 0; i < index; i++)
                {
                    newAr[i] = dict[i];
                }


                if(dict.length < 2)
                {
                    newAr[index] = word;
                    index++;
                }
                else
                {
                    bubbleSort(word);
                    if(!wordHasDuplicate(word))
                    {
                        newAr[index] = word;
                        index++;
                    }
                }

                dict = newAr;
            }
            else
            {
                dict[index] = word;
                index++;
            }




    }

    public static void checkRegex(String word)
    {

        String regex = ("[^A-Za-z]");
        Pattern check = Pattern.compile(regex);
        Matcher regexMatcher = check.matcher(word);

        if(!regexMatcher.find())
        {
            addToDict(word);
        }
    }

    public static void print()
    {
        try 
        {
            FileWriter outFile = new FileWriter("dict.txt");

            for(int i = 0; i < index; i++)
            {
                outFile.write(dict[i]);
                outFile.write(" \n ");
            }

            outFile.close();
        } 

        catch (IOException e) 
        {
            System.out.println("Error ");
        }
    }

    public static void bubbleSort(String word)
    {
        boolean swap = true;
        String temp;
        int wordBeforeIndex = 0;
        String wordBefore;

        while(swap) 
        {
            swap = false;

            wordBefore = dict[wordBeforeIndex];
            for(int i = 0; (i < word.length()) && (i < wordBefore.length()); i++)
            {
                **if(word.charAt(i) < wordBefore.charAt(i))**
                {
                    temp = wordBefore;
                    dict[wordBeforeIndex] = word;
                    dict[wordBeforeIndex++] = temp;
                    wordBeforeIndex++;
                    swap = true;
                }
            }
        }
    }

    public static boolean wordHasDuplicate(String word)
    {
        int low = 0;
        int high = dict.length - 1;
        int mid = low + (high - low) /2;

        while (low <= high && dict[mid] != word)
        {
            if (word.compareTo(dict[mid]) < 0)
            {
                low = mid + 1;
            }
            else
            {
                high = mid + 1;
            }
        }
        return true;





    }
}

Error is shown below:

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 2
at java.lang.String.charAt(String.java:658)
at BuildDict.bubbleSort(BuildDict.java:129)
at BuildDict.addToDict(BuildDict.java:60)
at BuildDict.checkRegex(BuildDict.java:90)
at BuildDict.readIn(BuildDict.java:30)
at BuildDict.main(BuildDict.java:14)
Nick
  • It might help if you told us at which line you get this exception. – Ben van Gompel Oct 22 '15 at 22:25
  • If your text file has a blank line (i.e. zero length) `word.charAt(0)` will explode because there is no char 0 – Bohemian Oct 22 '15 at 22:26
  • Oh sorry I get it on line 129 – Nick Oct 22 '15 at 22:26
  • and what is line 129? – Balwinder Singh Oct 22 '15 at 22:27
  • Its reading in all the words correctly and saves them to the array. I thought it would check the first letter of that word – Nick Oct 22 '15 at 22:28
  • hahah sorry about that. Its this line – Nick Oct 22 '15 at 22:28
  • Always post the full stack trace. And before doing so - read it yourself; carefully. You see, those messages tell you ALL you need to know to fix the problem yourself. – GhostCat Oct 22 '15 at 22:28
  • if(word.charAt(i) < wordBefore.charAt(i)) – Nick Oct 22 '15 at 22:28
  • Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 2 at java.lang.String.charAt(String.java:658) at BuildDict.bubbleSort(BuildDict.java:129) at BuildDict.addToDict(BuildDict.java:60) at BuildDict.checkRegex(BuildDict.java:90) at BuildDict.readIn(BuildDict.java:30) at BuildDict.main(BuildDict.java:14) – Nick Oct 22 '15 at 22:29
  • Idk why its a string index out of bounds tho – Nick Oct 22 '15 at 22:29
  • do you know for sure that wordBefore cant be shorter than word? because you seem to assume that at line 128 and before (for loop) – Philipp Murry Oct 22 '15 at 22:29
  • @Nick When people point out things that are missing from your question, you should edit the question to add the missing bit. When you dump large amounts of text into a comment, it's unformatted and basically unreadable. – azurefrog Oct 22 '15 at 22:32
  • No so what I did was changed it so the for loop ran twice and got the same message and then changed it so the for loop ran once and it went into an infinite loop – Nick Oct 22 '15 at 22:32

1 Answer


Check the length of wordBefore as a second condition of your for loop:

  for(int i = 0; (i < word.length()) && (i < wordBefore.length()); i++)
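
To see why the second condition matters, here is a minimal sketch (not the asker's code) that walks two strings only up to the shorter length. The class name `BoundedCompareSketch` and the helper `firstDifference` are made up for illustration; the sample words are assumptions too.

```java
// Hypothetical helper class, for illustration only.
final class BoundedCompareSketch {

    // Returns the index of the first position where a and b differ,
    // or -1 if no difference is found within the shorter string.
    static int firstDifference(String a, String b) {
        // Bounding by the shorter length keeps charAt(i) valid for both strings.
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "ca" has no index 2, so an unbounded loop over "carol" would throw here.
        System.out.println(firstDifference("carol", "ca"));  // prints -1
        System.out.println(firstDifference("carol", "cat")); // prints 2
    }
}
```

With only `i < word.length()` as the condition, the first call would ask for `"ca".charAt(2)`, which is the "String index out of range: 2" from the stack trace.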
Paul Ostrowski
  • I was just about to post the same answer. Plus 1 – Balwinder Singh Oct 22 '15 at 22:35
  • I changed that and I don't get an exception anymore, thank you. But Its now entering an infinite loop – Nick Oct 22 '15 at 22:40
  • I can't run it locally to find your issue because I don't have your source file, but if you're running in eclipse, run it in debug mode, and after its taking too long, hit the pause button and look at the call stack to see where you are. You should be able to figure at least which loop is taking forever, and fix it. – Paul Ostrowski Oct 22 '15 at 23:03
  • Also, looking at your bubbleSort method, I think you should compare the words themselves, instead of letter [i] of each word within a loop, because your current code will sort "AZ" after "ZA", because the second letter's position "Z" is higher than the "A". so just do this: – Paul Ostrowski Oct 22 '15 at 23:10
  • if (word.compareTo(wordBefore) < 0) – Paul Ostrowski Oct 22 '15 at 23:11
  • And your wordHasDuplicate always returns 'true'; you probably only want to return true when word.compareTo(dict[mid]) == 0 – Paul Ostrowski Oct 22 '15 at 23:14
  • I'm sure you're doing this to understand some fundamentals, but most of this functionality is already available to you in a collection object, such as an ArrayList. You can call contains to see if a String is already present, and you can use a comparator to sort the collection; see http://stackoverflow.com/questions/6957631/sort-java-collection – Paul Ostrowski Oct 22 '15 at 23:18
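
Putting the comments above together, one possible shape for the duplicate check and sorted insert is sketched below. It is not the asker's program: the class `SortedDictSketch` and the methods `search` and `addIfAbsent` are hypothetical names, an `ArrayList` replaces the hand-grown array, and whole words are compared with `compareTo` rather than letter by letter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: keeps a list of words sorted and free of duplicates.
final class SortedDictSketch {

    // Binary search over a sorted list.
    // Returns the index of word if present, otherwise -(insertionPoint) - 1.
    static int search(List<String> sorted, String word) {
        int low = 0;
        int high = sorted.size() - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // recomputed on every pass
            int cmp = word.compareTo(sorted.get(mid));
            if (cmp == 0) {
                return mid;      // found: this is the only "duplicate" case
            } else if (cmp < 0) {
                high = mid - 1;  // word sorts before mid, search the left half
            } else {
                low = mid + 1;   // word sorts after mid, search the right half
            }
        }
        return -(low + 1);       // not found; low is where word belongs
    }

    // Inserts word at its sorted position unless it is already present.
    static boolean addIfAbsent(List<String> sorted, String word) {
        int pos = search(sorted, word);
        if (pos >= 0) {
            return false;
        }
        sorted.add(-pos - 1, word);
        return true;
    }

    public static void main(String[] args) {
        List<String> dict = new ArrayList<>();
        for (String w : new String[] {"the", "carol", "the", "a"}) {
            addIfAbsent(dict, w);
        }
        System.out.println(dict); // prints [a, carol, the]
    }
}
```

Because `mid` is recomputed inside the loop and the range shrinks on every pass, the search always terminates, which is the property the infinite loop mentioned in the comments was missing. `Collections.binarySearch` from the standard library follows the same return convention and could replace `search` here.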