Hello fellow coders :) I need help with my C# code for extracting the city name and UK postcode, which are always in UPPERCASE, from an address string. For example:

Input

15 Arnott Blk C Quadrant  MOTHERWELL ML1 3TQ North Lanarkshire 
Flat 3 2A Klea Avenue LONDON SW4 9JA London 
12 Parish Close Dawley TELFORD TF4 3ER Shropshire 
76 Admiralty Close  WEST DRAYTON UB7 9NJ West Drayton 
56 Glenburn Avenue  MOTHERWELL ML1 5EF North Lanarkshire 
25 Thirleby Road  EDGWARE HA8 0HF Edgware 
21 Prideaux Place Friars Place Lane LONDON W3 7AS London 
1 Arnold Road  STAINES-UPON-THAMES TW18 1LY Surrey 
6 A Queen Street  BRIDGWATER TA6 3DA Somerset Flat 
8-B Lynn Court Mitcham Lane LONDON SW16 6LL London 
35 Weirside Gardens  WEST DRAYTON UB7 7TL  
473 Rochfords Gardens  SLOUGH SL2 5XF Berkshire 
155 Strawberry Fields  ADDLESTONE KT15 1FJ Surrey

Output

MOTHERWELL ML1 3TQ
LONDON SW4 9JA
TELFORD TF4 3ER
WEST DRAYTON UB7 9NJ
MOTHERWELL ML1 5EF
EDGWARE HA8 0HF
LONDON W3 7AS
STAINES-UPON-THAMES TW18 1LY
BRIDGWATER TA6 3DA
LONDON SW16 6LL
WEST DRAYTON UB7 7TL
SLOUGH SL2 5XF
ADDLESTONE KT15 1FJ

The parts I need to extract are always in UPPERCASE. Any help or tips are greatly appreciated, thanks.

gdmplt
  • Is this data in columns? i.e. are those tabs between the street and city? – Kevin Burdett Feb 11 '16 at 05:30
  • Please be more specific about why those columns are chosen. Is it as simple as taking the fourth, fifth and sixth item when splitting by spaces? If so, it's nothing to do with uppercase. If it's not, then can you explain why `15`, `C` and `8-B` are not included in the output? – Rob Feb 11 '16 at 05:39
  • Check this: [Regex to restrict UPPERCASE only](http://stackoverflow.com/questions/17861316/regex-to-restrict-uppercase-only) as you are only looking for uppercase letters. – Abdulkarim Kanaan Feb 11 '16 at 05:44
  • This is in one column only; each line is a single address string. All I need is the city and postcode. For example, from the string "8-B Lynn Court Mitcham Lane LONDON SW16 6LL London" I need to extract "LONDON SW16 6LL", the uppercase city and postcode only. @Rob: 15, C and 8-B are not needed, as they are just house and street numbers. – gdmplt Feb 11 '16 at 05:59

2 Answers

 using System.Text.RegularExpressions;

and

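string FindCityAndCode(string input)
{
   // Uppercase city words (letters, hyphens, spaces), then two uppercase/digit
   // tokens for the postcode, then a trailing space.
   Regex regCode = new Regex("([-A-Z ]{2,} [A-Z0-9]{2,} [A-Z0-9]{2,}) ");

   var m = regCode.Match(input);

   if (m.Success)
       // The city class also accepts spaces, so trim any leading ones.
       return m.Groups[1].Value.Trim();

   return string.Empty;
}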

The regular expression matches a run of uppercase letters A-Z, hyphens and spaces, followed by a space, then at least two characters from A-Z or 0-9, another space, then two or more characters from A-Z or 0-9, and finally a trailing space.

If the expression matches, we return the captured group; otherwise we return an empty string.

I am not too familiar with UK postcodes, so the {2,} quantifiers may need to be tightened, for example to {2,3} to allow only two or three characters.
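To illustrate the kind of tightening mentioned above, here is a minimal sketch rather than a definitive pattern. It assumes (this is my reading of the format, not something given in the question) that the first half of a postcode is one or two letters, a digit and an optional further letter or digit, and that the second half is a digit followed by two letters. The class name `CityCodeSketch` and the pattern itself are hypothetical. It uses word boundaries instead of a trailing space, so the match also works at the end of the string and never starts with spaces.

```csharp
using System.Text.RegularExpressions;

public static class CityCodeSketch
{
    // Group 1: one or more uppercase words (hyphens allowed), single-space separated.
    // Group 2: assumed first half of the postcode, e.g. W3, ML1, SW16.
    // Group 3: assumed second half, e.g. 7AS.
    private static readonly Regex CityAndCode = new Regex(
        @"\b([A-Z][A-Z-]*(?: [A-Z][A-Z-]*)*) ([A-Z]{1,2}[0-9][A-Z0-9]?) ([0-9][A-Z]{2})\b");

    public static string FindCityAndCode(string input)
    {
        Match m = CityAndCode.Match(input);

        // Matching is case-sensitive by default, so mixed-case words
        // such as "Avenue" or "Close" cannot be part of the city.
        return m.Success ? m.Value : string.Empty;
    }
}
```

For example, `CityCodeSketch.FindCityAndCode("12 Parish Close Dawley TELFORD TF4 3ER Shropshire ")` returns `TELFORD TF4 3ER`. A stray uppercase letter such as the `C` in `Blk C Quadrant` is skipped because it is not followed by something shaped like a postcode.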

Jens Meinecke

Hi @Ubercoder, I've tried your code, but some postcodes are not matched, so I get either no output or incomplete output. I have my own UK postcode RegEx validation code; if you could incorporate it into your code, it might help greatly with extracting the uppercase city and postcode.

// Requires: using System.Text.RegularExpressions;
// Each pattern is anchored with ^ and $, so the whole string must be a postcode.
public static bool IsPostCode(string postcode)
    {
    return (
            Regex.IsMatch(postcode, "(^[A-PR-UWYZa-pr-uwyz][0-9][ ]*[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}$)") ||
            Regex.IsMatch(postcode, "(^[A-PR-UWYZa-pr-uwyz][0-9][0-9][ ]*[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}$)") ||
            Regex.IsMatch(postcode, "(^[A-PR-UWYZa-pr-uwyz][A-HK-Ya-hk-y][0-9][ ]*[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}$)") ||
            Regex.IsMatch(postcode, "(^[A-PR-UWYZa-pr-uwyz][A-HK-Ya-hk-y][0-9][0-9][ ]*[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}$)") ||
            Regex.IsMatch(postcode, "(^[A-PR-UWYZa-pr-uwyz][0-9][A-HJKS-UWa-hjks-uw][ ]*[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}$)") ||
            Regex.IsMatch(postcode, "(^[A-PR-UWYZa-pr-uwyz][A-HK-Ya-hk-y][0-9][A-Za-z][ ]*[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}$)") ||
            // Special case GIR 0AA; the optional-space class must be "[ ]*", not "[]*".
            Regex.IsMatch(postcode, "(^[Gg][Ii][Rr][ ]*0[Aa][Aa]$)")
            );
    }
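
One possible way to combine a validator like this with the extraction step is sketched below; it is only an illustration, not a tested solution. A loose pattern finds candidates made of uppercase city words followed by two uppercase/digit tokens, and each candidate's postcode part is then passed to a validator. The class name `ValidatedExtractor`, the method `Extract` and the candidate pattern are all hypothetical; the validator is taken as a `Func<string, bool>` parameter so that a method such as `IsPostCode` can be passed in directly.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ValidatedExtractor
{
    // Group 1: uppercase city words (hyphens allowed).
    // Group 2: loose postcode candidate, e.g. "SW16 6LL"; the token
    //          lengths (2-4 and 3) are assumptions, checked later.
    private static readonly Regex Candidate = new Regex(
        @"\b([A-Z][A-Z-]*(?: [A-Z][A-Z-]*)*) ([A-Z0-9]{2,4} [A-Z0-9]{3})\b");

    public static string Extract(string address, Func<string, bool> isPostCode)
    {
        // Try every candidate in order and keep the first one
        // whose postcode part passes the supplied validator.
        foreach (Match m in Candidate.Matches(address))
        {
            string code = m.Groups[2].Value;
            if (isPostCode(code))
                return m.Groups[1].Value + " " + code;
        }

        return string.Empty;
    }
}
```

It could be called as `ValidatedExtractor.Extract(line, IsPostCode)`. Because the validation patterns are anchored with `^` and `$`, they need to be given only the postcode part, which is why the candidate pattern captures the city and the postcode in separate groups.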
gdmplt