0

I have a text file. Each line in the file represents a record having 'n' number of columns delimited by a | (pipe) character. The column-values are of type int, string, date, timestamp, etc. Empty string and spaces are also possible as column values.

I am validating only the count of the column-values and validation of data type is not required.

Sample valid records of 5 columns each:

1234|xyz|abc|2016-04-08 11:12:40|234
1235|efgh|abc|2016-04-09 11:25:40|
1236|efghij| ||

Validation code:

boolean valid = true;
String line = buffReader.readLine();
String[] tokens = null;
while (line != null){
    tokens = line.split("\\|");
    if ((tokens.length==4 || tokens.length==5) && countPipes(line)==4){

    } else {
        valid = false;
        break;
    }
    line = buffReader.readLine();
}

private int countPipes(String line){
    int count = 0;
    count = line.length() - line.replace("|", "").length();
    return count;
}

I feel that the code can be better. Can someone let know how i can improve this code?

Marco99
  • 1,639
  • 1
  • 19
  • 32
  • Are you allowing for escaped `|` in the column values? If yes, your validation needs to handle this. – dpr Feb 22 '17 at 08:50
  • @RealSkeptic Thanks for pointing out the bug. I have corrected it. – Marco99 Feb 22 '17 at 09:09
  • @dpr The column values will not have "|" for sure. – Marco99 Feb 22 '17 at 09:11
  • So now you have a `countPipes` method? You should [edit] the question and add that. If it counts pipes, there is no reason to actually split the line at all. – RealSkeptic Feb 22 '17 at 09:11
  • @RealSkeptic Added the countPipes method too. – Marco99 Feb 22 '17 at 09:19
  • @RealSkeptic Added the logic. Courtesy: http://stackoverflow.com/questions/275944/java-how-do-i-count-the-number-of-occurrences-of-a-char-in-a-string Thanks! – Marco99 Feb 22 '17 at 09:35

2 Answers2

1

Well, you can simply check that there are four pipes in the line. If there are exactly four pipes, then there are five columns, which may be empty (which you allow).

while (line != null) {
    if ( countPipes(line) != 4 ) {
        valid = false;
        break;
    }
    line = buffReader.readLine();
}

Now you don't need to split the line at all.

A note about splitting, though. If you use the split with two parameters, and use a negative number, the split will contain entries for the empty elements as well. Here is a demonstration:

public class Test {

    public static void main(String[] args) throws IOException {
        String line = "A|B|||";

        String[] zeroSplit = line.split("\\|");
        String[] negativeSplit = line.split("\\|",-1);

        System.out.println( "When split without parameter: " + zeroSplit.length );
        System.out.println( "When split with negative parameter: " + negativeSplit.length );
    }
}

The output here is:

When split without parameter: 2
When split with negative parameter: 5

So in this case, you can check that your split is exactly of length 5, and get the same result.

while (line != null) {
    if ( line.split("\\|",-1).length != 5 ) {
        valid = false;
        break;
    }
    line = buffReader.readLine();
}
RealSkeptic
  • 33,993
  • 7
  • 53
  • 79
0

First of all and most important: You got a serious error in your if statement, as you are using a single = (assignment operator) instead of the comparison operator ==!!

Regarding code cleanup: It's only a slight adjustment and there are probably better ways to validate this but it's the first thing that came into my mind:

boolean valid = true;
String line = buffReader.readLine();
while (valid && (line != null)){
    String[] tokens = line.split("\\|");
    valid = !(tokens.length == 4 || tokens.length == 5);
    line = buffReader.readLine();
}
dpr
  • 10,591
  • 3
  • 41
  • 71