
I have these lists in Python:

noDuplicateArr = ['"foo barr', '"foo corp', '"barr corp']
wordsArr = ['"fool barr', '"fool corp"']

What's the best approach to avoid appending "fool barr" and "fool corp" to noDuplicateArr, since "barr" and "corp" are already present in it?

martineau
  • `set(noDuplicateArr + wordsArr)` – jordanm Feb 20 '20 at 16:49
  • This page has all you need https://stackoverflow.com/questions/7961363/removing-duplicates-in-lists – MT756 Feb 20 '20 at 16:49
  • @jordanm `list(set(noDuplicateArr + wordsArr))` to make a list – CC7052 Feb 20 '20 at 16:51
  • This should not have been closed. He isn't looking to remove duplicates in a list. He wants to avoid appending new strings to a list of strings if the encountered words (substrings) already exist in the list. – Ben Feb 20 '20 at 16:58

2 Answers

1

To rephrase: you want to avoid appending a string to a list of strings if it contains a word that already appears in one of the list's entries. Use a set to keep track of the words that have already been added.

noDuplicateArr = ['"foo barr', '"foo corp', '"barr corp']
wordsArr = ['"fool barr', '"fool corp"']

# Collect every word that already appears in noDuplicateArr.
seen_words = set()
for words in noDuplicateArr:
  words = words.strip('"')
  seen_words |= set(words.split())

# Append a phrase only if none of its words has been seen.
for words in wordsArr:
  words = words.strip('"')
  if not any(word in seen_words for word in words.split()):
    noDuplicateArr.append(words)
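
The same idea can be wrapped in a reusable function. This is only a minimal sketch: the name `append_without_shared_words` is hypothetical, and unlike the loop above it also records the words of each newly appended phrase, which is an assumption about the desired behaviour. The last entry of `wordsArr` is an extra illustrative value, not from the question.

```python
def append_without_shared_words(existing, candidates):
    """Return existing plus every candidate that shares no word with it.

    Hypothetical helper: surrounding double quotes are stripped before
    splitting, as in the loop above. The input lists are not modified.
    """
    result = list(existing)
    seen_words = set()
    for phrase in existing:
        seen_words.update(phrase.strip('"').split())

    for phrase in candidates:
        cleaned = phrase.strip('"')
        words = cleaned.split()
        # Keep the phrase only if none of its words has been seen.
        if seen_words.isdisjoint(words):
            result.append(cleaned)
            # Assumption: later candidates are also checked against this phrase.
            seen_words.update(words)
    return result


noDuplicateArr = ['"foo barr', '"foo corp', '"barr corp']
# '"new thing"' is an illustrative addition, not part of the question.
wordsArr = ['"fool barr', '"fool corp"', '"new thing"']
print(append_without_shared_words(noDuplicateArr, wordsArr))
```

Returning a new list rather than mutating `noDuplicateArr` makes it easier to compare the result with the original, but appending in place would work just as well.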
Ben
  • Thanks for the answer. However, I have these use cases: (1) `noDuplicateArr = ['foo', 'poo']`, `wordsArr = ['foo bar']` -> should not be appended; (2) `noDuplicateArr = ['bar foo', 'poo']`, `wordsArr = ['bar']` -> this should replace all words with bar – Benjie Perez Feb 21 '20 at 07:56
0
list(set(noDuplicateArr + wordsArr))

This gives you a list with unique entries only. Note that it removes only exact duplicate strings and that a set does not preserve the original order.
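
If the original order matters, one common alternative is `dict.fromkeys`, which keeps the first occurrence of each entry. The sketch below uses hypothetical sample data and variable names. Like the set version, it removes only exact duplicates.

```python
# Hypothetical sample data, chosen to include one exact duplicate.
no_duplicate_arr = ['foo barr', 'foo corp', 'barr corp']
words_arr = ['fool barr', 'foo corp']

# dict keys are unique and keep insertion order (Python 3.7+),
# so the first occurrence of each string stays in its original position.
unique_in_order = list(dict.fromkeys(no_duplicate_arr + words_arr))
print(unique_in_order)
```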

jcf