
I can search for a string in my text file. However, I want to sort the data in this ArrayList and implement an algorithm. Is it possible to read from a text file and store the values (Strings) in a String[] array?

Also, is it possible to separate the Strings? So instead of my array having:

[Alice was beginning to get very tired of sitting by her sister on the, bank, and of having nothing to do:]

is it possible to have an array like:

["Alice", "was", "beginning", "to", "get", ...]

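    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Scanner;

    public class Main // class name is a placeholder
    {
        public static void main(String[] args) throws IOException
        {
            // Read the search term from standard input.
            Scanner scan = new Scanner(System.in);
            String stringSearch = scan.nextLine();

            // Load every line of the file into a list; the reader is closed automatically.
            List<String> words = new ArrayList<String>();
            try (BufferedReader reader = new BufferedReader(new FileReader("File1.txt"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    words.add(line);
                }
            }

            // Loop by index: indexOf() would report the first equal line for duplicate lines.
            for (int index = 0; index < words.size(); index++)
            {
                if (words.get(index).contains(stringSearch))
                {
                    System.out.println("Got a match at line " + index);
                }
            }

            //Collections.sort(words);
            //for (String str: words)
            //      System.out.println(str);

            int size = words.size();
            System.out.println("There are " + size + " lines of text in this text file.");

            System.out.println(words);
        }
    }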
user1883386

2 Answers


"Also is it possible to separate the Strings?" Yes, you can split a string on whitespace and common punctuation like this:

 String[] strSplit;
 String str = "This is test for split";
 strSplit = str.split("[\\s,;!?\"]+");

See String API
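As a rough sketch of what that split does to the sentence from the question (the class and method names `SplitDemo` and `splitWords` are made up for illustration, not part of the original answer):

```java
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative only.
class SplitDemo {

    // Splits on runs of whitespace and the punctuation listed in the regex above.
    static String[] splitWords(String line) {
        return line.split("[\\s,;!?\"]+");
    }

    public static void main(String[] args) {
        String line = "Alice was beginning to get very tired of sitting by her "
                + "sister on the, bank, and of having nothing to do:";
        String[] words = splitWords(line);
        System.out.println(Arrays.toString(words));
        System.out.println(words.length + " words"); // prints "21 words"
    }
}
```

Note that the last element is "do:" with the colon still attached, because ':' is not in the character class. Any punctuation you want stripped has to be listed there.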

You can also read a text file word by word with a Scanner, whose next() returns one whitespace-delimited token at a time.

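 // Requires java.io.* and java.util.Scanner.
 // try-with-resources closes the Scanner and avoids using a null Scanner if the file is missing.
 try (Scanner scan = new Scanner(new BufferedReader(new FileReader("Your File Path")))) {
     while (scan.hasNext()) {
         System.out.println(scan.next());
     }
 } catch (FileNotFoundException e) {
     e.printStackTrace();
 }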

See Scanner API
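To get from word-by-word reading to the sorted String[] the question asks about, something along these lines might work. It is a minimal sketch: `WordReader` and `readWords` are hypothetical names, and the demo reads from a StringReader so it runs without a file (passing `new FileReader("File1.txt")` instead should behave the same way).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class; not part of the original answer.
class WordReader {

    // Collects every whitespace-delimited token from the source into an array.
    static String[] readWords(Readable source) {
        List<String> words = new ArrayList<>();
        try (Scanner scan = new Scanner(source)) {
            while (scan.hasNext()) {
                words.add(scan.next());
            }
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Stand-in for a file; swap in a FileReader to read real text.
        String[] words = readWords(new StringReader("Alice was beginning\nto get very tired"));
        Arrays.sort(words); // natural String order: uppercase sorts before lowercase
        System.out.println(Arrays.toString(words));
        // prints [Alice, beginning, get, tired, to, very, was]
    }
}
```

Scanner's default delimiter is whitespace only, so punctuation stays attached to the tokens ("the," rather than "the").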

Smit
  • What about commas etc between words? Your split doesn't cater for punctuation, which the example sentence contains (so it's not even theoretical) – Bohemian Jan 09 '13 at 00:33
  • You could list all necessary stop-chars in split like this: str.split("[\\s,.!?]+") – Archer Jan 09 '13 at 00:45

To split a line into an array of words, use this:

String[] words = sentence.split("[^\\w']+");

The regex [^\w']+ means "one or more characters that are neither word characters nor apostrophes".

This keeps words with embedded apostrophes like "can't" intact and skips over all other punctuation.
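One thing to watch for with split(): if the input begins with a delimiter (an opening quote, for instance), the resulting array starts with an empty string. A small sketch of one way to drop it, using hypothetical names `ApostropheSplit` and `words`:

```java
import java.util.Arrays;

// Hypothetical wrapper around the split shown above; names are illustrative.
class ApostropheSplit {

    // Splits on anything that is not a word character or an apostrophe,
    // then removes the empty first element produced by a leading delimiter.
    static String[] words(String sentence) {
        String[] parts = sentence.split("[^\\w']+");
        if (parts.length > 0 && parts[0].isEmpty()) {
            return Arrays.copyOfRange(parts, 1, parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        String sentence = "\"Hello,\" she said: can't stop!";
        System.out.println(Arrays.toString(words(sentence)));
        // prints [Hello, she, said, can't, stop]
    }
}
```

Trailing empty strings are already discarded by split(), so only the leading one needs handling.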

Edit:

A comment raised the edge case of parsing a quoted word such as 'this' as this.
split() alone can't handle that, so first remove the wrapping quotes:

String[] words = input.replaceAll("(^|[^\\w'])'([\\w']+)'([^\\w']|$)", "$1$2$3").split("[^\\w']+");

Here's some test code with edge and corner cases:

import java.util.Arrays;

public class SplitTest { // class name is a placeholder
    public static void main(String[] args) throws Exception {
        String input = "'I', ie \"me\", can't extract 'can't' or 'can't'";
        String[] words = input.replaceAll("(^|[^\\w'])'([\\w']+)'([^\\w']|$)", "$1$2$3").split("[^\\w']+");
        System.out.println(Arrays.toString(words));
    }
}

Output:

[I, ie, me, can't, extract, can't, or, can't]
Bohemian
  • What if I have to get the string which is in single quote like 'here'? – Smit Jan 09 '13 at 00:53
  • @smit That edge case can't be catered for using `split()`, because split specifies what's *between* the words, and this would have to check what's *around* the words. You would have to first remove such apostrophes. See edited answer. – Bohemian Jan 09 '13 at 02:57
  • I asked out of curiosity; I never intended to attack. I really wanted to know whether it could be done using regex alone. **My apologies**. – Smit Jan 09 '13 at 16:32
  • @Bohemian well aware winter bash is over - but [in the quest of proving a high performing answer, is this useful or not?](http://stackoverflow.com/a/14243580/1389394) :) – bonCodigo Jan 09 '13 at 18:34