1

I have the following regex but it is not satisfying my requirements.

"(?i)\b(?:p(?:ost)?\.?\s*[o0](?:ffice)?\.?\s*b(?:[o0]x)?|b[o0]x)"
123 post office 
123 post office box 
post office
po box
po 12 box
35 po box
PO.Box
p.o.box 

The examples above are failing with my current regex.

Alan Moore
  • Possible duplicate http://stackoverflow.com/questions/5680050/po-box-regular-expression-validation – Ekk Nov 08 '11 at 06:38

3 Answers

2

For PO boxes, you'll discover that it's not possible to cover all the cases. That's probably not what you want to hear, but them's the breaks. It becomes evident as soon as you start Googling for a solution: I looked into this, and there are many solutions out there. I don't much care for any of the ones I've seen.

So you have to go back to the rules and standards for what a PO box address looks like; Wikipedia describes them. The standard forms include PO Box, P.O. Box, Postal Office Box, P Office Box, Postal Box, and Post Box, to name a few, and those are what you base your rules on when writing a regex to decide whether an address is a PO box.

With that said, here is my solution. It's simple because it has to be: there are too many odd ways users will type a PO box. You have to assume that a PO box address starts with p, postal, or some other word beginning with p, which is why the pattern is anchored with ^. That way you don't flag addresses that merely contain "po" somewhere after a street number, as in "123 po ...". I hope this makes sense.

/^p+(ostal|ost|\.| )*o*(ffice|\.| )*(box)*/i

The above can be tested on http://www.rubular.com. You'll need to remove the opening and closing forward slashes and put the i (the case-insensitivity flag) in the text field to the right of the closing forward slash.
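For anyone who would rather check the pattern locally than on Rubular, here is a minimal Python sketch. Python is my choice rather than anything the answer uses, `matched_prefix` is a hypothetical helper name, and "Park Avenue" is a made-up input added to probe for false positives.

```python
import re

# Pattern copied from the answer above; the trailing /i becomes re.IGNORECASE.
PO_BOX = re.compile(r"^p+(ostal|ost|\.| )*o*(ffice|\.| )*(box)*", re.IGNORECASE)

def matched_prefix(address):
    """Return the text the pattern consumed at the start of address, or None."""
    m = PO_BOX.match(address)
    return m.group(0) if m else None

samples = ["post office box", "po box", "PO.Box", "p.o.box", "35 po box", "Park Avenue"]
for s in samples:
    print(repr(s), "->", repr(matched_prefix(s)))
```

Because everything after `p+` is optional, the pattern succeeds on any input that starts with p: "Park Avenue" matches with just "P" consumed. "35 po box" is rejected because of the `^` anchor. If that is too loose for your data, comparing the matched prefix against the whole input, or requiring the `box` part, may be worth trying.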

Steve Nguyen
1

If you know the input is a post office box, try filtering out any text first and just using the number from it. Or lowercase the string, strip the letters p, o, s, t, f, i, c, e, b and x, and if there are any letters left over, it's no good.
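Both ideas can be sketched in a few lines of Python. This is only an illustration: the function names `only_po_box_letters` and `box_number` are hypothetical, and the sample inputs are made up.

```python
import re

# The ten letters that can appear in "post office box".
PO_BOX_LETTERS = set("postficebx")

def only_po_box_letters(text):
    """True if every letter in text is one of p, o, s, t, f, i, c, e, b, x."""
    leftover = [ch for ch in text.lower() if ch.isalpha() and ch not in PO_BOX_LETTERS]
    return not leftover

def box_number(text):
    """Return the first run of digits in text, or None if there is none."""
    m = re.search(r"\d+", text)
    return m.group(0) if m else None

print(only_po_box_letters("P.O. Box 123"))     # every letter is in the set
print(only_po_box_letters("123 Main Street"))  # 'm', 'a', 'n', 'r' are left over
print(box_number("PO Box 123"))
```

The letter check is a coarse filter: it does not look at word order, so an input like "best fit" also passes because it uses only those ten letters.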

Tim
0

I'm not sure how accurate it would be to filter "post office" without a "box" following it. The regex below satisfies all conditions except "123 post office" and "post office" in your list.

@"\bp*[o0]*(st)*(al)*\.*\s*[o0]*(ffice)*\.*\s*b+[o0]?x+\b"

And if you change "b+[o0]?x+" to "(b+[o0]?x+)*", it would filter those too.
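Here is a rough Python sketch for checking both versions against the examples from the question. It makes a few assumptions: the `@"..."` string wrapper is dropped, matching is case-insensitive (as the `(?i)` in the question implies), and the names `STRICT`, `LOOSE` and `hits` plus the "123 Main Street" input are mine, added for illustration.

```python
import re

# Pattern from the answer, compiled case-insensitively (an assumption).
STRICT = re.compile(
    r"\bp*[o0]*(st)*(al)*\.*\s*[o0]*(ffice)*\.*\s*b+[o0]?x+\b", re.IGNORECASE)
# Variant with the "box" part wrapped in (...)* so that it becomes optional.
LOOSE = re.compile(
    r"\bp*[o0]*(st)*(al)*\.*\s*[o0]*(ffice)*\.*\s*(b+[o0]?x+)*\b", re.IGNORECASE)

# Examples from the question, with trailing spaces removed.
examples = ["123 post office", "123 post office box", "post office", "po box",
            "po 12 box", "35 po box", "PO.Box", "p.o.box"]

def hits(pattern, text):
    """True if pattern matches anywhere in text."""
    return pattern.search(text) is not None

for e in examples:
    print(repr(e), hits(STRICT, e), hits(LOOSE, e))

# A non-PO-box address, to see how permissive each version is.
print(hits(STRICT, "123 Main Street"), hits(LOOSE, "123 Main Street"))
```

One thing to watch with the second version: once the "box" part is optional, every piece of the pattern is optional, so it can match the empty string at any word boundary. In this sketch `LOOSE` reports a hit for "123 Main Street" as well, so it probably needs at least one required piece before it is usable as a filter.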

Hope this helps.

Seyeong Jeong