How to parse a line of text with two separate pieces of data?

Question

I'm new to parsing (and new to Java), so I just want to be sure I'm conceptualizing this correctly.

I've written a program that allows the user to paste text into a text field, then click the "parse data" button. This button parses the text for two pieces of data: a name and a number.

The text data is generally pasted by the user in this form:

john 48915
beth 10431
frank 10112
anne 34887
taserface 90090
bill 56448

I'm using the regular expression "^\d+\t.*\d+$" to detect the pattern, and after the pattern is confirmed, my parse data code does the following:

    Scanner parser = new Scanner(inputText);
    parser.useDelimiter("\\n");
    while (parser.hasNext()) {
        String nextToken = parser.next();
        String name = nextToken.trim();
        // how do I get the number?
    }

You'll notice the \n delimiter, which parses data at the new line character. This breaks the data up into rows, but does not break each row into two separate data points. I need to grab both the name and the number separately.

I believe I should be using a space delimiter, but I'm not sure if I should be doing this in one or two different steps. The confusion, I believe, stems from my limited understanding of how the Scanner does its work. But after a review of the Java documentation, I'm still not really sure.

Current output:

john 48915

beth 10431

frank 10112

etc.

Expected output:

john

48915

beth

10431

etc.

Should I be doing two different loops of parsing, or can I get the work done in the same pass?

Why not just use the `next()` method that is already delimited by whitespace? — GBlodgett, Mar 28 '19 at 17:20
I'd get rid of `parser.useDelimiter("\\n");`. You're shooting yourself in the foot with this. — Hovercraft Full Of Eels, Mar 28 '19 at 17:21
Forgive my ignorance, is a newline considered whitespace by the next() method? — Umbrella_Programmer, Mar 28 '19 at 17:25
whitespace char is whitespace char. I am surprised you didnt check that at the beginning :) — Antoniossss, Mar 28 '19 at 17:29
There are so many ways to skin this cat, and you should experiment using several of them, including reading in each line via `.nextLine()` and then splitting the line, vs using nested Scanner objects, one to read each line from the file and the other to parse each line obtained, .... — Hovercraft Full Of Eels, Mar 28 '19 at 17:33
... oh and also you could [parse the file using the Java 8 streams API](https://stackoverflow.com/questions/34884439/how-to-effectively-parse-text-files-with-java-stream-api). — Hovercraft Full Of Eels, Mar 28 '19 at 17:37
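To make the nested-Scanner suggestion from the comments concrete, here is a minimal sketch: an outer Scanner reads line by line and an inner Scanner splits each line on its default whitespace delimiter. The class name `NestedScannerExample` and the `parse` helper are hypothetical, and the sketch assumes each row is a single-word name followed by an integer; rows that do not fit are skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class NestedScannerExample {

    // Hypothetical helper: returns one "name=number" entry per well-formed row.
    static List<String> parse(String inputText) {
        List<String> rows = new ArrayList<>();
        // Outer Scanner: walks the input one line at a time.
        try (Scanner lines = new Scanner(inputText)) {
            while (lines.hasNextLine()) {
                String line = lines.nextLine();
                // Inner Scanner: splits a single line on whitespace (the default delimiter).
                try (Scanner fields = new Scanner(line)) {
                    if (!fields.hasNext()) {
                        continue; // blank line
                    }
                    String name = fields.next();
                    if (fields.hasNextInt()) {
                        rows.add(name + "=" + fields.nextInt());
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : parse("john 48915\nbeth 10431\nfrank 10112")) {
            System.out.println(row);
        }
    }
}
```

Keeping the line boundary in the outer loop means a malformed row only affects itself rather than shifting every token after it.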

score 2 · Accepted Answer · answered Mar 28 '19 at 18:02

Your problem is that you are using \n as the delimiter. As a result, the scanner splits the input only at line breaks, and not also at the other whitespace as you were expecting.

One solution is simply to delete the following line: parser.useDelimiter("\\n");
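With the `useDelimiter` call removed, `Scanner` falls back to its default delimiter, which matches any run of whitespace, newlines included. A minimal sketch of that approach follows. The class name `DefaultDelimiterExample` and the `parse` helper are hypothetical, and it assumes every name is a single token followed by an integer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DefaultDelimiterExample {

    // Hypothetical helper: returns the tokens in reading order
    // (name, number, name, number, ...).
    static List<String> parse(String inputText) {
        List<String> tokens = new ArrayList<>();
        // No useDelimiter call: spaces, tabs, and newlines all separate tokens.
        try (Scanner parser = new Scanner(inputText)) {
            while (parser.hasNext()) {
                String name = parser.next();
                if (!parser.hasNextInt()) {
                    break; // stop if a name is not followed by an integer
                }
                int number = parser.nextInt();
                tokens.add(name);
                tokens.add(String.valueOf(number));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : parse("john 48915\nbeth 10431\nfrank 10112")) {
            System.out.println(token);
        }
    }
}
```

Both values are read in the same pass, so no second loop is needed.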

A solution that would also work is the following:

    try (Scanner parser = new Scanner(inputText)) {
        while (parser.hasNextLine()) {
            String nextLine = parser.nextLine().trim();
            if (nextLine.isEmpty()) {
                continue; // skip blank lines
            }
            // Split on any run of whitespace; any other pattern works here too
            String[] strings = nextLine.split("\\s+");
            String name = strings[0];
            String number = strings[1];
            System.out.printf("%s%n%s%n", name, number);
        }
    }

This leads to the following output:

    john
    48915
    beth
    10431
    frank
    10112
    anne
    34887
    taserface
    90090
    bill
    56448

This solution gives you more control over each line and over how the name and number are parsed.
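As one illustration of that extra control, each line can be checked against a pattern with capture groups before the fields are used. This is only a sketch: the class name `LinePatternExample`, the `parseLine` helper, and the assumed row format (one non-whitespace name, whitespace, then digits) are not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinePatternExample {

    // Assumed row format: a non-whitespace name, whitespace, then digits only.
    private static final Pattern ROW = Pattern.compile("^(\\S+)\\s+(\\d+)$");

    // Hypothetical helper: returns {name, number}, or null when the line does not match.
    static String[] parseLine(String line) {
        Matcher matcher = ROW.matcher(line.trim());
        if (!matcher.matches()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String inputText = "john 48915\nbeth 10431\nnot a valid row";
        // \R matches any line break sequence (Java 8 and later).
        for (String line : inputText.split("\\R")) {
            String[] row = parseLine(line);
            if (row == null) {
                System.out.println("skipped: " + line);
            } else {
                System.out.printf("%s%n%s%n", row[0], row[1]);
            }
        }
    }
}
```

This way the validation and the extraction happen in one step, and rows that do not fit the format are reported instead of causing an `ArrayIndexOutOfBoundsException`.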

This was the exact conclusion I came to after a while of tinkering. Thanks very much for confirming and for taking the time — Umbrella_Programmer, Mar 28 '19 at 18:52

Rahul R. · Answer 2 · 2019-03-28T17:50:38.553

Here is an example implementation for your case, which offers more control and the flexibility to accommodate a change of delimiters:

public class StringSplitExample {

    public static void main(String[] args) {
        String content = "john 48915\n"
                         + "beth 10431\n"
                         + "frank 10112\n"
                         + "anne 34887\n"
                         + "taserface 90090\n"
                         + "bill 56448";

        // Split on a newline or on any other single whitespace character
        String[] dataset = content.split("\\n|\\s");

        for (String value : dataset) {
            System.out.println(value);
        }
    }
}

The above code snippet produces the following output:

john
48915
beth
10431
frank
10112
anne
34887
taserface
90090
bill
56448
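The flat array above interleaves names and numbers, so code that needs them paired has to step through it two elements at a time. A sketch of that, assuming the tokens strictly alternate between a name and an integer; the class name `PairedSplitExample` and the `toPairs` helper are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PairedSplitExample {

    // Hypothetical helper: maps each name to its number, keeping input order.
    // Integer.parseInt throws NumberFormatException if a number token is not an integer.
    static Map<String, Integer> toPairs(String content) {
        // \s+ treats any run of spaces, tabs, or line breaks as one separator.
        String[] dataset = content.trim().split("\\s+");
        Map<String, Integer> pairs = new LinkedHashMap<>();
        // Even indexes hold names, odd indexes hold numbers.
        for (int i = 0; i + 1 < dataset.length; i += 2) {
            pairs.put(dataset[i], Integer.parseInt(dataset[i + 1]));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String content = "john 48915\nbeth 10431\nfrank 10112";
        toPairs(content).forEach((name, number) ->
                System.out.println(name + " -> " + number));
    }
}
```

Note that a map keeps only the last number for a repeated name; a list of pairs would be needed if duplicates matter.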

Madhusudana · Answer 3 · 2019-03-28T18:28:13.243

You can achieve this with the String split method. Below is a sample program that produces the output you want.

I am assuming the user cannot move to the next line while filling in the form without entering whitespace.

public class ParseLineText {
    public static void main(String[] args) {
        String textData = "john 48915 " +
                          "beth 10431 " +
                          "frank 10112 " +
                          "anne 34887 " +
                          "taserface 90090 " +
                          "bill 56448 ";
        String[] data = textData.split("\\s");
        for (String text : data) {
            System.out.println(text);
        }
    }
}

Output:
john
48915
beth
10431
frank
10112
anne
34887
taserface
90090
bill
56448
