I need to build a single regex that removes a leading "The", "A", or "An" and the adjacent spaces from a given string.

For example, the given string is:

The quick brown fox jumps over the lazy dog

Using a regex, I want to remove the leading "The" and get back just:

quick brown fox jumps over the lazy dog

I tried (added from a comment)

^*(?<=[The|An|A]\s){1}.*

It works in most cases, but in some scenarios it does not return the expected result. Please see the scenarios below.

Input: The quick brown fox --> Result = quick brown fox

Input: A quick brown fox --> Result = quick brown fox

Input: In A sunny day --> Result = A sunny day (expected: In A sunny day, because the string does not start with A)

Input: American An bank --> Result = An bank (expected: American An bank, because the string does not start with An)
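
The reported results can be reproduced with a short Python sketch. It assumes a simplified form of the attempt, (?<=[The|An|A]\s).*, with the leading ^* and the {1} dropped for simplicity; the helper name attempt is hypothetical. It suggests a likely cause: [The|An|A] is a character class that matches any single one of the characters T, h, e, |, A, n, not the words, so any "n" or "e" followed by a space satisfies the lookbehind.

```python
import re

# Simplified form of the attempted pattern (an assumption, see the note above).
# [The|An|A] is a character class: it matches ONE character out of
# T, h, e, |, A, n. It does not match the words "The", "An" or "A".
ATTEMPT = re.compile(r"(?<=[The|An|A]\s).*")


def attempt(text):
    """Return what the simplified attempt matches, or None if nothing matches."""
    m = ATTEMPT.search(text)
    return m.group(0) if m else None


for s in ["The quick brown fox", "A quick brown fox",
          "In A sunny day", "American An bank"]:
    print(f"{s!r} -> {attempt(s)!r}")
```

In "In A sunny day" the lookbehind is satisfied by the "n " of "In ", and in "American An bank" by the "n " of "American ", which is why the match starts in the middle of the string.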

stema
Kiran

2 Answers

What have you tried yourself? What you want to achieve is not difficult; try, for example, the tutorial on Regular-Expressions.info.

You are making this much too complicated. Try this:

^(The|An|A)\s+

and replace with the empty string.

See it here on Regexr

^ matches the start of the string.

(The|An|A) is an alternation; it matches the first alternative that fits.

\s+ matches one or more whitespace characters that follow.

Changes

The quick brown fox

A quick brown fox

In A sunny day

American An bank

To

quick brown fox

quick brown fox

In A sunny day

American An bank
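
As a minimal sketch of "replace with the empty string", here is how the pattern above could be applied in Python. The language choice and the helper name strip_leading_article are assumptions for illustration; the question does not say which language is in use.

```python
import re

# ^          start of the string
# (The|An|A) one of the three articles
# \s+        one or more whitespace characters after the article
LEADING_ARTICLE = re.compile(r"^(The|An|A)\s+")


def strip_leading_article(text):
    """Remove a leading "The", "An" or "A" plus the whitespace after it."""
    return LEADING_ARTICLE.sub("", text)


for s in ["The quick brown fox", "A quick brown fox",
          "In A sunny day", "American An bank"]:
    print(f"{s!r} -> {strip_leading_article(s)!r}")
```

Because the pattern is anchored with ^ and requires whitespace right after the article, "In A sunny day" and "American An bank" are left unchanged.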

stema
Below is a complete one-liner in Perl:

perl -e 'my $a = "The quick brown fox jumps over the lazy dog"; $a =~ s/^\s*(?:The|An|A)\s+//gi; print $a;'

The part that does the replace is:

$a =~ s/^\s*(?:The|An|A)\s+//gi;

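The regex that matches the leading article and the surrounding whitespace is /^\s*(?:The|An|A)\s+/. The i flag makes the match case-insensitive; the g flag has no effect here because the pattern is anchored with ^.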
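
For comparison, a rough Python counterpart of the Perl substitution might look like the sketch below. This is an assumption about how the same pattern translates, not part of the original answer; the helper name strip_article is hypothetical, and re.IGNORECASE stands in for Perl's i flag.

```python
import re

# Same pattern as the Perl substitution:
#   ^\s*            optional leading whitespace
#   (?:The|An|A)    non-capturing alternation of the articles
#   \s+             one or more whitespace characters after the article
# re.IGNORECASE plays the role of Perl's /i flag.
LEADING_ARTICLE = re.compile(r"^\s*(?:The|An|A)\s+", re.IGNORECASE)


def strip_article(text):
    """Replace the leading article (and surrounding whitespace) with nothing."""
    return LEADING_ARTICLE.sub("", text, count=1)


print(strip_article("The quick brown fox jumps over the lazy dog"))
print(strip_article("  the quick brown fox"))
print(strip_article("American An bank"))
```

Note that the case-insensitive flag also strips lowercase articles such as "the" or "a", which goes beyond what the question asks for; drop the flag if only capitalized articles should be removed.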

Tudor Constantin
  • This regex retrieves those characters from the given string, but I want the string without them. – Kiran Jan 24 '13 at 07:54
  • It is not retrieving them (because of the *non-capturing* `(?:..)` construct). It replaces the matched part with an empty string. Depending on the language you use, the replace syntax may differ. – Tudor Constantin Jan 24 '13 at 08:12
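
The difference between retrieving the match and replacing it can be sketched in Python as follows. This is an illustration under the assumption that Python's re module is acceptable; the variable names are hypothetical.

```python
import re

PATTERN = re.compile(r"^\s*(?:The|An|A)\s+")
text = "The quick brown fox"

# Matching only *finds* the leading article; the string itself is untouched.
m = PATTERN.match(text)
matched = m.group(0)      # the text the pattern matched
groups = m.groups()       # empty: (?:...) does not create a capture group

# Substituting *removes* the matched part and returns the remainder.
remainder = PATTERN.sub("", text)

print(repr(matched), groups, repr(remainder))
```

Here matched holds the article and its trailing space, groups is empty because the group is non-capturing, and remainder is the string without the leading article.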