I'm wondering how I can turn my UK postcode validator into a US postcode (ZIP code) validator. Currently my program reads postcodes from a text file and checks whether they are valid UK postcodes. This works well, but I would like to read in US postcodes instead and validate those. Below is my current program.

package postcodesort;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedList;
import java.util.Queue;




public class PostCodeSort 
{
Queue<String> postcodeStack = new LinkedList<String>();

public static void main(String[] args) throws IOException 
{
    ZipCodeValidator zipCodeValidator = new ZipCodeValidator();

    // try-with-resources closes the reader even if reading fails, and avoids
    // a NullPointerException on close when the file cannot be opened
    try (BufferedReader br = new BufferedReader(new FileReader("usvalidcodes.txt"))) {
        String str;
        while ((str = br.readLine()) != null) 
        {
            if (zipCodeValidator.isValid(str)) {
                System.out.println(str + " is valid");
            }
            else {
                System.out.println(str + " is not valid");
            }
        }
    }
    catch (IOException ex) 
    {
        // Report the problem instead of silently swallowing it
        System.err.println("Could not read usvalidcodes.txt: " + ex.getMessage());
    }

}
}

And here is the part of the code that does the validation via a regex.

package postcodesort;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
*
* @author ec1312017
*/
public class ZipCodeValidator {
private static String regex = "^[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}$";
private static Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

public boolean isValid(String zipCode) {
    Matcher matcher = pattern.matcher(zipCode);
    return matcher.matches();
}
}

I have also included a small selection of the data within the text file to be read in.

"01","35005","AL","ADAMSVILLE",86.959727,33.588437,10616,0.002627

"05","72001","AR","ADONA",92.903325,35.046956,494,0.00021

"06","90804","CA","SIGNAL HILL",118.155187,33.782993,36092,0.001213

Any help is appreciated and please feel free to ask any questions.

danbb
  • You'll need to amend the regular expression check to suit the US postal format rather than the UK one. If you post the US format with a few examples, it'll be easier to help you find the right regex to handle it. – Dave Jun 06 '16 at 17:15
  • Yes, as Dave says, we need examples. If the ZIP code is only 5 numbers, it is different than if it also includes the extension thing. – Laurel Jun 06 '16 at 17:16
  • I've posted some examples of what I'll be reading in. So I want it to read that it has 5 numbers and then it reads the first two numbers. So like this "01","35006","AL". That's all I want read and validated as there are 10,000 entries. – danbb Jun 06 '16 at 17:21

2 Answers

A simple solution is to just change the regex. The related question "regex for zip-code" suggests the regex should be ^\d{5}(?:[-\s]\d{4})?$, which accepts a 5-digit ZIP code with an optional 4-digit extension.

So the class looks like this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ZipCodeValidator {

// Backslashes must be doubled inside a Java string literal
private static String regex = "^\\d{5}(?:[-\\s]\\d{4})?$";
private static Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

public boolean isValid(String zipCode) {
    Matcher matcher = pattern.matcher(zipCode);
    return matcher.matches();
}
}

This only checks the format; it wouldn't validate against the list of zip codes you provided, but it's a lot simpler to implement :)
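
To apply this pattern to the data lines shown in the question, the ZIP field first has to be pulled out of each line. Here is a minimal sketch, assuming every line has the same comma-separated layout as the samples and that no quoted field contains a comma. The names `ZipFieldDemo`, `extractZip` and `hasValidZip` are hypothetical and not part of the original program.

```java
import java.util.regex.Pattern;

class ZipFieldDemo {
    // Five digits, optionally followed by a hyphen or whitespace and four more digits
    private static final Pattern ZIP = Pattern.compile("^\\d{5}(?:[-\\s]\\d{4})?$");

    // Returns the second comma-separated field with its quotes removed,
    // or null if the line has fewer than two fields (assumed layout)
    static String extractZip(String line) {
        String[] fields = line.split(",");
        if (fields.length < 2) {
            return null;
        }
        return fields[1].trim().replace("\"", "");
    }

    // True when the line has a second field that looks like a ZIP code
    static boolean hasValidZip(String line) {
        String zip = extractZip(line);
        return zip != null && ZIP.matcher(zip).matches();
    }

    public static void main(String[] args) {
        String line = "\"01\",\"35005\",\"AL\",\"ADAMSVILLE\",86.959727,33.588437,10616,0.002627";
        System.out.println(extractZip(line) + " valid: " + hasValidZip(line));
    }
}
```

Splitting on commas is only safe here because none of the sample fields contain a comma; a proper CSV parser would be the more robust choice for messier data.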

Matthew
  • Would the regex work for this form of data? "01","35005","AL","ADAMSVILLE",86.959727,33.588437,10616,0.002627. – danbb Jun 06 '16 at 17:24
  • Ultimately I want the program to read the entry and validate the postcode if it has 5 numbers. In this case it would be "35005"? After it has been validated I would then want to assign that entry to a specific state which will be taken from the first two numbers. In that case "01". But that's for another day :) – danbb Jun 06 '16 at 17:27
  • How could I validate using my list of zip codes as I don't want to remove 10,000 pieces of data from my list of postcodes. – danbb Jun 06 '16 at 17:33
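
One possible way to validate against the existing list, rather than by format alone, is to collect the ZIP field of every data line into a set and then test membership. This is only a sketch: it assumes the data file itself is the list of known ZIP codes and that the ZIP code is always the second comma-separated field. `KnownZipDemo` and `loadKnownZips` are hypothetical names. In a real program the lines could come from `Files.readAllLines`; here they are hard-coded so the example is self-contained.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KnownZipDemo {
    // Collects the second field of each data line into a set of known ZIP codes.
    // Lines with fewer than two fields (for example blank lines) are skipped.
    static Set<String> loadKnownZips(List<String> lines) {
        Set<String> zips = new HashSet<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            if (fields.length >= 2) {
                zips.add(fields[1].trim().replace("\"", ""));
            }
        }
        return zips;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "\"01\",\"35005\",\"AL\",\"ADAMSVILLE\",86.959727,33.588437,10616,0.002627",
            "",
            "\"05\",\"72001\",\"AR\",\"ADONA\",92.903325,35.046956,494,0.00021");
        Set<String> known = loadKnownZips(lines);
        System.out.println("35005 known: " + known.contains("35005"));
        System.out.println("99999 known: " + known.contains("99999"));
    }
}
```

A `HashSet` keeps each lookup cheap, so 10,000 entries are not a problem.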

Usually, I would suggest you use a well-established address validating service because parsing and matching addresses is pretty complicated: there are lots of little ways to mess up big.

But your use case is really specific: your lines of data always have the same layout, and you only want a validator that checks that there is a 5-digit postal code in a specific spot on the line.

So, here I've made an edit to your regex validator code so that it matches the data you say you're getting:

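import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ZipCodeValidator {
    // Quotes and backslashes must be escaped inside a Java string literal.
    // Layout: "NN","NNNNN","XX","NAME",number,number,integer,number
    private static String regex = "^\"\\d{2}\",\"\\d{5}\",\"\\w{2}\",\"[\\w ]+\",\\d*\\.?\\d*,\\d*\\.?\\d*,\\d*,\\d*\\.?\\d*$";
    private static Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

    public boolean isValid(String entireLineFromFile) {
        Matcher matcher = pattern.matcher(entireLineFromFile);
        return matcher.matches();
    }
}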

You can play with the regex on this example on regexr.com. If you hover over characters in the regular expression, a popup will tell you what each part means.
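
If the leading two-digit code and the ZIP code are needed after validation, the same whole-line pattern can capture them with groups. This is a rough sketch rather than a definitive implementation: it assumes the line layout from the question's samples, and `LineFieldsDemo` and `parse` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineFieldsDemo {
    // Whole-line pattern with capture groups around the first two fields:
    // group 1 = leading two-digit code, group 2 = five-digit ZIP code
    private static final Pattern LINE = Pattern.compile(
        "^\"(\\d{2})\",\"(\\d{5})\",\"\\w{2}\",\"[\\w ]+\","
        + "\\d*\\.?\\d*,\\d*\\.?\\d*,\\d*,\\d*\\.?\\d*$");

    // Returns {twoDigitCode, zip}, or null when the line does not match
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String line = "\"06\",\"90804\",\"CA\",\"SIGNAL HILL\",118.155187,33.782993,36092,0.001213";
        String[] parts = parse(line);
        if (parts != null) {
            System.out.println("code=" + parts[0] + " zip=" + parts[1]);
        } else {
            System.out.println("line is not valid");
        }
    }
}
```

Returning `null` for a non-matching line keeps validation and extraction in one call, so the caller does not have to run the pattern twice.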

(Full disclosure: I worked for SmartyStreets, an address validator and autocompletion company.)

Joseph Hansen