-1

I have a list from which I want to remove every leading article: ["the house", "the beautiful garten", "the beautiful garten of the house"]

and I want the list to contain only: ["house", "beautiful garten", "beautiful garten of the house"]

If the first word is an article, remove it. Articles that appear later in the sentence should be kept.

Andrej Kesely
  • Is it always just `the ` you want to remove, or is there a list of articles (e.g. `['the', 'an', 'his']`)? – match Nov 02 '22 at 11:10

3 Answers

1

If you're working with Python, try:

new_list = [s[4:] if s.startswith('the ') else s for s in old_list]
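A possible variant, assuming Python 3.9 or newer: `str.removeprefix` returns the string unchanged when the prefix is absent, so the explicit `startswith` check and the hard-coded slice index can be dropped. The sample `old_list` is the list from the question.

```python
# Sketch only: requires Python 3.9+ for str.removeprefix.
old_list = ["the house", "the beautiful garten", "the beautiful garten of the house"]

# removeprefix strips "the " only when the string starts with it;
# later occurrences of "the" are left alone.
new_list = [s.removeprefix("the ") for s in old_list]

print(new_list)
# ['house', 'beautiful garten', 'beautiful garten of the house']
```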
frfritz
1

Another option is using `re` (this example also ignores whitespace around the leading `the` and matches it case-insensitively):

import re

lst = ["the house", "the beautiful garten", "the beautiful garten of the house"]

pat = re.compile(r"^\s*the\s+", flags=re.I)
out = [pat.sub("", w) for w in lst]

print(out)

Prints:

['house', 'beautiful garten', 'beautiful garten of the house']
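If more than one article has to be stripped, the same pattern could be extended with an alternation. This is a rough sketch: the `articles` list and the sample strings are assumptions for illustration, not part of the question.

```python
import re

# Hypothetical article list; adjust to the words you actually need to strip.
articles = ["the", "a", "an"]

# Build ^\s*(?:the|a|an)\s+ ; re.escape guards against special characters
# in the list entries. The required trailing whitespace means "a" cannot
# match the start of a word such as "apple".
pat = re.compile(r"^\s*(?:" + "|".join(map(re.escape, articles)) + r")\s+", flags=re.I)

lst = ["The house", "a beautiful garten", "an apple of the house", "apple pie"]
out = [pat.sub("", w) for w in lst]

print(out)
# ['house', 'beautiful garten', 'apple of the house', 'apple pie']
```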
Andrej Kesely
0

If you have a list of potential articles or words to be trimmed, you can do the following to remove that word plus the following space:

articles = ["the", "a", "an"]

sentences = ["the house", "a beautiful garten", "an amazing garten of the house"]


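out = []
for s in sentences:
    new = s
    for a in articles:
        # Strip the article plus the single space that follows it.
        if s.startswith(f"{a} "):
            new = s[len(a) + 1:]
            break  # at most one leading article per sentence
    out.append(new)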

print(out)

Prints:

['house', 'beautiful garten', 'amazing garten of the house']
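The loop above could also be wrapped in a small helper that looks only at the first word. This is a sketch under a few assumptions: the names `ARTICLES` and `strip_leading_article` are made up for illustration, words are separated by a single space, and matching is case-insensitive.

```python
# Hypothetical set of articles; a set gives constant-time membership checks.
ARTICLES = {"the", "a", "an"}


def strip_leading_article(sentence):
    """Return sentence without its first word if that word is an article."""
    # partition splits at the first space only: (first word, " ", remainder).
    first, sep, rest = sentence.partition(" ")
    # sep is empty for one-word strings, which are returned unchanged.
    if sep and first.lower() in ARTICLES:
        return rest
    return sentence


sentences = ["the house", "A beautiful garten", "an amazing garten of the house", "theatre night"]
out = [strip_leading_article(s) for s in sentences]

print(out)
# ['house', 'beautiful garten', 'amazing garten of the house', 'theatre night']
```

Comparing whole words avoids having to append a space to each article, and a string that consists of a single word such as "the" is left as it is.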
match