I'm trying to build a Python script that recursively reads text files from a directory and saves all the words from all the files to an array (let's call it array-A).

I have another array that holds a list of predefined words (let's call it array-B), e.g.:

['hello', 'cat', 'dog', 'mouse',...]

What I want to do is check, for each word in array-A, whether it's in array-B, and if not, add it.

I wrote that script, but it takes a long time for big arrays (many words), since it's O(n·m): for each of the n words in array-A, it scans the m words of array-B.

Before implementing insertion in lexicographic order (to allow a fast search such as binary search) and the search itself, I'm wondering whether there is already a Python class that does that.

asmd ashdkj

2 Answers

Just use a dict (like {'hello':1, 'cat':1, 'dog':1, 'mouse':1, ...}). Checking whether a word is in it is a hash lookup, O(1) on average per word, rather than a scan through every entry.
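As a minimal sketch of that idea, the snippet below uses a set rather than a dict, since only membership matters here and no values are needed; that swap is my assumption, not part of the answer above. The names merge_words, seen and merged are hypothetical, chosen for illustration. It keeps array-B in its original order and appends each new word from array-A once.

```python
def merge_words(array_a, array_b):
    """Return array_b followed by every word of array_a not already present."""
    seen = set(array_b)       # hash-based: `word in seen` is O(1) on average
    merged = list(array_b)    # copy, so the caller's list is not modified
    for word in array_a:
        if word not in seen:  # hash lookup, not a scan through the list
            seen.add(word)
            merged.append(word)
    return merged


array_b = ['hello', 'cat', 'dog', 'mouse']
array_a = ['cat', 'bird', 'hello', 'fish', 'bird']
result = merge_words(array_a, array_b)
print(result)  # ['hello', 'cat', 'dog', 'mouse', 'bird', 'fish']
```

The whole merge is then roughly O(n + m) instead of O(n·m), at the cost of holding the extra set in memory.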

Kit.
  • And if you want the list at the end, you can generate it with .keys() (wrapped in list() on Python 3). – adm_ Apr 26 '20 at 08:45
  • But I still need to check whether the word is in the dictionary, don't I? If I run "if word in dictionary", isn't that like looping through an array? – asmd ashdkj Apr 26 '20 at 08:48

If you want a final array with only one occurrence of every word from both arrays, try this:

new_arr = list(set(arrA + arrB))  # + concatenates the two lists, set removes duplicates (order is not preserved)
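
Note that a set does not keep the words in any particular order. If the order matters, one possible variant (a sketch, assuming Python 3.7+ where dicts keep insertion order; the sample lists are made up for illustration) is to deduplicate with dict.fromkeys instead:

```python
arrA = ['cat', 'bird', 'hello', 'fish', 'bird']
arrB = ['hello', 'cat', 'dog', 'mouse']

# dict.fromkeys keeps the first occurrence of each key, in insertion order
# (guaranteed for dicts since Python 3.7), so arrB's words come first.
new_arr = list(dict.fromkeys(arrB + arrA))
print(new_arr)  # ['hello', 'cat', 'dog', 'mouse', 'bird', 'fish']
```

Putting arrB first keeps the predefined words in their original positions, with the new words from arrA appended after them.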
Vijay