# pseudo-code; the actual list file can be found in the link below
data_words = [[word_11, word_12, ..., word_1n],
              [word_21, word_22, ..., word_2m],
              ...
             ]
import pickle
import numpy as np

with open('data_words.pk', 'rb') as f:
    data_words = pickle.load(f)
word_list = np.concatenate(data_words)
>> MemoryError on Ubuntu 18.04 LTS 64-bit with 16 GB memory, virtual machine (VMware)
>> No problem on macOS 10.12.6 with 8 GB memory
# The code below works fine on both Mac and Ubuntu
import itertools

word_list_flat = list(itertools.chain.from_iterable(data_words))
Since the data_words list is less than 400 MB, maybe there is something wrong with numpy.concatenate() on Ubuntu 18? I'd really like to know why this behavior happens, even though I get the desired result by using CTT's code.
Here is the data_words list I used to reproduce the error:
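One detail that may be relevant (an assumption on my part, since the actual word lists aren't shown here in full): if the elements are Python strings, np.concatenate produces a fixed-width unicode array, padding every element to the length of the longest word at 4 bytes per character. The resulting array can therefore be far larger than the pickled list, while itertools.chain keeps the original str objects and allocates nothing extra. A minimal sketch of that padding effect:

```python
import itertools
import numpy as np

# Hypothetical tiny stand-in for data_words (the real file is linked above).
data_words = [["a", "bb"], ["cccccccc"]]

# np.concatenate converts each sublist to a unicode array and pads every
# element to the widest word in the whole result.
flat = np.concatenate(data_words)
print(flat.dtype)     # <U8  -- width of the longest word
print(flat.itemsize)  # 32   -- 8 characters * 4 bytes each, per element

# itertools.chain just chains the existing str objects; no fixed-width
# buffer is allocated, so memory use stays close to the original list's.
flat_list = list(itertools.chain.from_iterable(data_words))
print(flat_list)      # ['a', 'bb', 'cccccccc']
```

Whether this fully explains the Ubuntu/macOS difference I can't say, but it does show why the NumPy path can need much more memory than the itertools path for the same data.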