Let's say I have this string: 1111 Butterfly Ct City, CA 00000

I want to omit 1111 Butterfly Ct (including the whitespace after) and just match the City and onwards.

My reasoning is that ^.+?Ct\s matches everything up to City, so I want the opposite of that: match everything after it. When I tried [^.+?Ct\s], it just showed me matches of the individual characters of the street address, minus the "Ct" part.

dearprudence
  • You can use `\b[A-Za-z]+,.*` if the city has a one-word name, but if it could be `New York`, for example, you have a problem (not just a regex problem) determining where the street ends and the city begins. – Cary Swoveland Mar 16 '20 at 03:45
  • An actual address: "26 Central Street West Springfield, MA 01089". (It's the city of "West Springfield", not the street "Central Street West" in "Springfield".) – Cary Swoveland Mar 16 '20 at 03:58

1 Answer

You want a zero-width assertion, specifically a positive lookbehind, such as this:

(?<=Ct\s).*

See an example at https://regex101.com/r/lDUuiX/1
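As a minimal sketch of the same lookbehind in code: the question does not name a regex engine, so Python's `re` module is an assumption here, and the variable names are only illustrative.

```python
import re

# Sample address from the question.
address = "1111 Butterfly Ct City, CA 00000"

# (?<=Ct\s) asserts that "Ct" plus one whitespace character sits immediately
# before the match start, without making it part of the match.
match = re.search(r"(?<=Ct\s).*", address)

# Fall back to None when the street suffix is not present.
rest = match.group(0) if match else None
print(rest)
```

Because the lookbehind consumes nothing, the match itself begins at "City" and no capture group is needed.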

mankowitz
  • Thank you! Your answer led me to learn more about lookarounds on this post: https://stackoverflow.com/questions/2973436/regex-lookahead-lookbehind-and-atomic-groups – dearprudence Mar 16 '20 at 05:33
  • Prudence, you are only interested in street names ending "Ct"? – Cary Swoveland Mar 16 '20 at 05:36
  • @CarySwoveland I'm going to have a list of these and use the "|" to check "Ct|Dr|" etc etc. – dearprudence Mar 16 '20 at 17:02
  • Prudence, you should state that in your question. I suggest you not write "etc. etc.", but instead give a specific list of three or four endings that you describe as an example, so that readers can use that list in their answers. [These](https://en.wikipedia.org/wiki/Street_suffix) lists of street name suffixes may be of interest. – Cary Swoveland Mar 16 '20 at 17:16
  • @CarySwoveland I appreciate the link. The question just asked how to ignore up until a certain part and kept it on topic without focusing on the suffixes. Lookarounds answered that and I replied with a link for any reader to learn more about Lookarounds. I merely responded to your curiosity. – dearprudence Mar 16 '20 at 19:41
  • Prudence, just a note that some regex engines (Python's `re`, for example, though not PCRE) will permit `(?<=ab|cd)` but not `(?<=ab|cde)` (because not all alternatives in the alternation are the same length), and some engines don't permit lookbehinds at all. I mention this because you did not specify the regex engine. – Cary Swoveland Mar 16 '20 at 20:09
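A rough sketch of the suffix-list idea from the comments, assuming Python's `re` module (the engine was never specified). The suffix list `Ct`, `Dr`, `Ave` and the helper names `after_street` and `after_street_capture` are made up for illustration. Python's `re` rejects a single lookbehind whose alternatives differ in length, so the sketch shows two workarounds: one fixed-width lookbehind per suffix, and a plain capture group that needs no lookbehind at all.

```python
import re

# Illustrative suffix list; not taken from the question.
SUFFIXES = ["Ct", "Dr", "Ave"]

# A single lookbehind with alternatives of different lengths is rejected
# by Python's re module, so record whether it compiles.
try:
    re.compile(r"(?<=Ct\s|Ave\s).*")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# Workaround 1: one lookbehind per suffix, each fixed-width on its own.
LOOKBEHIND = re.compile(
    "(?:" + "|".join(rf"(?<=\b{re.escape(s)}\s)" for s in SUFFIXES) + ").*"
)

# Workaround 2: consume the street part and capture what follows.
CAPTURE = re.compile(
    r"^.+?\b(?:" + "|".join(re.escape(s) for s in SUFFIXES) + r")\s+(.*)"
)


def after_street(address):
    """Return the text after the first street suffix, or None."""
    m = LOOKBEHIND.search(address)
    return m.group(0) if m else None


def after_street_capture(address):
    """Same result via a capture group instead of lookbehinds."""
    m = CAPTURE.match(address)
    return m.group(1) if m else None


print(after_street("1111 Butterfly Ct City, CA 00000"))
print(after_street_capture("22 Oak Ave Town, CA 00000"))
```

The capture-group form also ports to engines without lookbehind support. Neither form addresses the ambiguity raised earlier about multi-word city names such as "West Springfield".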