0

I have one .txt file that contains 5 words, one per line, and another that contains 100 keywords (also one per line). For each word, I want to print the whole list of keywords. Here's what I did:

words = open("./sample_5.txt","r", encoding='utf8')
termes = open("./100_keywords.txt", "r", encoding='utf8')
for w in words:
    for t in termes:
        print (w,t)

The trouble is that this does not seem to iterate over w: I only get the first word paired with the 100 keywords, and that's it. I should get a (5, 100) matrix, but I get (1, 100). Any help?

user93804
    I'm having a hard time understanding the problem. Can you give a sample of the actual and expected output for certain inputs? Maybe on smaller files, like if the first file has 2 words and the second file has 3. – Brian McCutchon Jun 20 '20 at 21:12

3 Answers

2

I think this would help.

Here we read each file into a list of lines (using .readlines(), since each item is on its own line), then take the Cartesian product of the two lists (equivalent to writing a nested loop) and print each pair.

Explanation:

When we open a file with open(), Python creates a stream (a TextIOBase object), and each read continues from where the previous one left off. So unless you close and reopen the file before each pass of the inner loop, or seek back to the beginning, you won't get the already-read lines back. In the solution I gave, each file is read only once, at the beginning.

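from itertools import product

# Read each file once into a list so the lines can be iterated repeatedly
with open("./a.txt", "r", encoding='utf8') as f:
    words = f.readlines()
with open("./b.txt", "r", encoding='utf8') as f:
    termes = f.readlines()

# product() yields every (word, term) pair, like a nested loop
for word, term in product(words, termes):
    print(word.strip(), term.strip())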
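
To illustrate the explanation above, here is a minimal sketch of the exhaustion behaviour. It assumes in-memory io.StringIO streams as stand-ins for the two files (the sample words are made up), so that it runs without any files on disk. It shows the naive nested loop stopping after the first word, and the seek(0) alternative mentioned above restoring the full set of pairs.

```python
import io

# In-memory stand-ins for the two files (hypothetical sample data)
words = io.StringIO("apple\nbanana\n")
termes = io.StringIO("red\ngreen\nblue\n")

# Naive nested loop: the inner stream is exhausted after the first word
naive_pairs = [(w.strip(), t.strip()) for w in words for t in termes]

# Rewind both streams, then rewind the inner one before each pass
words.seek(0)
termes.seek(0)
rewound_pairs = []
for w in words:
    termes.seek(0)  # go back to the start of the second stream
    for t in termes:
        rewound_pairs.append((w.strip(), t.strip()))

print(len(naive_pairs))    # 3
print(len(rewound_pairs))  # 6
```

Reading the lines into a list once, as in the solution above, avoids the repeated seek calls.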
Tibebes. M
1

EDITED per @Brian McCutchon's comment

Since you want to iterate through the second file multiple times,
you should load it into a reusable container such as a list.
A file object is exhausted after one pass, so otherwise you can only iterate over it once:

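# Read the keywords into a list once; a list can be iterated any number of times
with open("./100_keywords.txt", "r", encoding='utf8') as f:
    termes = f.read().splitlines()
with open("./sample_5.txt", "r", encoding='utf8') as words:
    for w in words:
        for t in termes:
            print(w.strip(), t)  # strip the trailing newline from each word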
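
As a quick check of the expected shape, here is a small sketch on the kind of reduced input suggested in the comments (2 words and 3 terms). The helper name all_pairs and the sample data are hypothetical, and io.StringIO streams stand in for the real files.

```python
import io

def all_pairs(words_stream, termes_stream):
    """Return one row per word, each row pairing that word with every term."""
    termes = termes_stream.read().splitlines()  # list: reusable across passes
    return [[(w.strip(), t) for t in termes] for w in words_stream]

# Hypothetical sample data: 2 words and 3 terms
matrix = all_pairs(io.StringIO("w1\nw2\n"), io.StringIO("t1\nt2\nt3\n"))
print(len(matrix), len(matrix[0]))  # 2 3
```

With the real files, the same approach should give 5 rows of 100 pairs each.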
ywbaek
  • @BrianMcCutchon In the nested for loop, for every w in `words` the OP iterates through `termes`. So the OP is trying to iterate through the second file object, `termes`, 5 times. – ywbaek Jun 20 '20 at 21:17
  • You are right, I edited the answer. – ywbaek Jun 20 '20 at 21:19
0

Here is what you can do:

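with open("./sample_5.txt","r", encoding='utf8') as words, open("./100_keywords.txt", "r", encoding='utf8') as termes:
    a = termes.readlines()  # read all keywords once so they can be reused
    for w in words:
        for t in a:
            # drop the trailing newline from both the word and the keyword
            print(w.rstrip('\n'), t.rstrip('\n'))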
Red