I'm new to parsing (and new to Java), so I just want to be sure I'm conceptualizing this correctly.

I've written a program that allows the user to paste text into a text field, then click the "parse data" button. This button parses the text for two pieces of data: a name and a number.

The text data is generally pasted by the user in this form:

john 48915
beth 10431
frank 10112
anne 34887
taserface 90090
bill 56448

I'm using the regular expression "^\d+\t.*\d+$" to detect the pattern, and after the pattern is confirmed, my parse data code does the following:

    Scanner parser = new Scanner(inputText);
    parser.useDelimiter("\\n");
    while (parser.hasNext()) {
        String nextToken = parser.next();
        String name = nextToken.trim();
        // how do I get the number?
    }

You'll notice the \n delimiter, which parses data at the new line character. This breaks the data up into rows, but does not break each row into two separate data points. I need to grab both the name and the number separately.

I believe I should be using a space delimiter, but I'm not sure if I should be doing this in one or two different steps. The confusion, I believe, stems from my limited understanding of how the Scanner does its work. But after a review of the Java documentation, I'm still not really sure.

Current output:

john 48915

beth 10431

frank 10112

etc.

Expected output:

john

48915

beth

10431

etc.

Should I be doing two different loops of parsing, or can I get the work done in the same pass?

3 Answers

Your problem is that you are using \n as the delimiter. As a result, the scanner splits the input only at line breaks, and not also at the whitespace within each line as you expect.

One solution is simply to delete the following line, so that the Scanner falls back to its default whitespace delimiter: parser.useDelimiter("\\n");
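As a rough sketch of that first option, the loop below relies on the Scanner's default whitespace delimiter and reads the name and the number as two consecutive tokens. The class name DefaultDelimiterSketch and the helper parse are made up for illustration, and the code assumes every name is a single word followed by a number that fits in an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DefaultDelimiterSketch {

    // Hypothetical helper: returns name, number, name, number, ... in input order.
    static List<String> parse(String inputText) {
        List<String> tokens = new ArrayList<>();
        // No useDelimiter call: the default delimiter is any run of whitespace,
        // so spaces, tabs and line breaks all separate tokens.
        try (Scanner parser = new Scanner(inputText)) {
            while (parser.hasNext()) {
                String name = parser.next();
                if (!parser.hasNextInt()) {
                    break; // assumption: stop at the first name without a number
                }
                int number = parser.nextInt();
                tokens.add(name);
                tokens.add(String.valueOf(number));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String inputText = "john 48915\nbeth 10431\nfrank 10112";
        for (String token : parse(inputText)) {
            System.out.println(token);
        }
    }
}
```

Note that nextInt() converts the token to an int, so a number with leading zeros would lose them. If the number should stay text, calling next() a second time instead is probably the safer choice.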


A solution that would also work is the following:

    try (Scanner parser = new Scanner(inputText)) {
        while (parser.hasNextLine()) {
            String nextLine = parser.nextLine();
            String[] strings = nextLine.split("\\s");
            // Here you can use any pattern to split the line
            String name = strings[0];
            String number = strings[1];
            System.out.printf("%s%n%s%n", name, number);
        }
    }

This leads to the following output:

john
48915
beth
10431
frank
10112
anne
34887
taserface
90090
bill
56448

This solution gives you more control over the individual lines and over how each one is split into a name and a number.
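To illustrate what that extra control could look like, here is a minimal sketch that checks each line against a pattern and pulls the name and the number out of capture groups in the same pass. The pattern, the class name LinePatternSketch and the helper parseLine are my own assumptions (a single-word name, whitespace, then digits), not something taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinePatternSketch {

    // Assumed line shape: optional padding, a name without whitespace,
    // at least one whitespace character, then digits only.
    private static final Pattern LINE = Pattern.compile("\\s*(\\S+)\\s+(\\d+)\\s*");

    // Hypothetical helper: returns {name, number}, or null if the line does not match.
    static String[] parseLine(String line) {
        Matcher matcher = LINE.matcher(line);
        if (!matcher.matches()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String inputText = "john 48915\nbeth 10431\n\nnot-a-record";
        // \R matches any line break sequence (Java 8 and later).
        for (String line : inputText.split("\\R")) {
            String[] parts = parseLine(line);
            if (parts == null) {
                continue; // skip blank or malformed lines
            }
            System.out.printf("%s%n%s%n", parts[0], parts[1]);
        }
    }
}
```

Compared with a plain split, this skips blank or malformed lines instead of failing with an ArrayIndexOutOfBoundsException on strings[1].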

WUUUGI

Here is an example implementation for your case. Splitting the whole string with a regular expression gives you more control and makes it easy to change the delimiters later:

public class StringSplitExample {

    public static void main(String[] args) {
        String content = "john 48915\n"
                       + "beth 10431\n"
                       + "frank 10112\n"
                       + "anne 34887\n"
                       + "taserface 90090\n"
                       + "bill 56448";

        // Split on a newline or any other single whitespace character;
        // change this pattern if the delimiters change.
        String[] dataset = content.split("\\n|\\s");

        for (String value : dataset) {
            System.out.println(value);
        }
    }
}

The output of the above snippet is:

john
48915
beth
10431
frank
10112
anne
34887
taserface
90090
bill
56448
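The flat array alternates between names and numbers, so if each name needs to stay attached to its number, one possible follow-up is to walk the tokens two at a time. This is only a sketch: the class name PairTokensSketch and the helper toMap are invented for illustration, and it assumes every name is one word, every number fits in an int, and names are unique (a repeated name would overwrite the earlier entry).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PairTokensSketch {

    // Hypothetical helper: maps each name to the number that follows it.
    static Map<String, Integer> toMap(String content) {
        // "\\s+" treats any run of spaces, tabs or line breaks as one delimiter.
        String[] tokens = content.trim().split("\\s+");
        Map<String, Integer> result = new LinkedHashMap<>(); // keeps input order
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            // Integer.parseInt throws NumberFormatException if the token is not a number.
            result.put(tokens[i], Integer.parseInt(tokens[i + 1]));
        }
        return result;
    }

    public static void main(String[] args) {
        String content = "john 48915\nbeth 10431\nfrank 10112";
        for (Map.Entry<String, Integer> entry : toMap(content).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```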
Rahul R.

You can achieve this with the String split method. Below is a sample program together with the output you want.

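I am assuming the user cannot move to the next line while filling in the form without entering whitespace, so every value is separated from the next by a whitespace character.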

public class ParseLineText {
    public static void main(String[] args) {
        String textData = "john 48915 " +
                          "beth 10431 " +
                          "frank 10112 " +
                          "anne 34887 " +
                          "taserface 90090 " +
                          "bill 56448 ";
        String[] data = textData.split("\\s");
        for (String text : data) {
            System.out.println(text);
        }
    }
}
Output:
john
48915
beth
10431
frank
10112
anne
34887
taserface
90090
bill
56448
Madhusudana