My code prints every word in a text file that starts with a specific letter, but I want it to skip duplicate words. Here is my code:

with open('text.txt','r') as myFile:
    data=myFile.read().lower()

for s in data.split():
    if s.startswith("r"):
        print(s)

As I said, my code prints the words, but it also shows duplicates. Thank you for helping.

gbenyu

2 Answers

Use a set:

with open('text.txt', 'r') as myFile:
    data = myFile.read().lower()

    seen = set()
    for s in data.split():
        if s not in seen and s.startswith("r"):
            seen.add(s)
            print(s)
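
If a list of the unique words is needed rather than printed output, the same idea can be wrapped in a function. This is only a minimal sketch: the helper name `unique_words_starting_with` is hypothetical, and it swaps the explicit `seen` set for `dict.fromkeys`, which drops duplicates while keeping first-seen order (dicts preserve insertion order in Python 3.7+). It takes a string so it can be tried without a file.

```python
def unique_words_starting_with(text, prefix):
    """Return the distinct words in `text` that start with `prefix`, in first-seen order.

    Hypothetical helper name, not part of the original answer.
    """
    words = text.lower().split()
    matches = (w for w in words if w.startswith(prefix))
    # dict keys are unique and keep insertion order, so this deduplicates
    # without reordering the words.
    return list(dict.fromkeys(matches))


sample = "Red roses are red and Rarely blue; roses rule"
print(unique_words_starting_with(sample, "r"))
# → ['red', 'roses', 'rarely', 'rule']
```

For the file in the question, the same call could be made on `myFile.read()`.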
Dani Mesejo

This is an optimized version that reads the file line by line instead of loading it all into memory:

seen_words = set()

with open('text.txt', 'r') as my_file:
    for line in my_file:
        for word in line.lower().split():
            # Keep only words starting with "r" that have not been printed yet
            if word.startswith("r") and word not in seen_words:
                print(word)
                seen_words.add(word)
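
One thing `split()` does not handle is punctuation: "red," and "red" would count as different words. A hedged sketch of a line-by-line variant that strips surrounding punctuation first is below. The generator name `iter_unique_words` is hypothetical, the punctuation handling is an assumption about what counts as a duplicate, and `io.StringIO` stands in for the opened file so the example runs on its own.

```python
import io
import string


def iter_unique_words(lines, prefix):
    """Yield each distinct word starting with `prefix`, reading one line at a time.

    `lines` is any iterable of strings, such as an open text file.
    Hypothetical helper name, not part of the original answer.
    """
    seen = set()
    for line in lines:
        for raw in line.lower().split():
            # Assumption: leading/trailing punctuation should not make a word distinct.
            word = raw.strip(string.punctuation)
            if word.startswith(prefix) and word not in seen:
                seen.add(word)
                yield word


# io.StringIO stands in for open('text.txt') so the sketch runs without a file.
fake_file = io.StringIO("Red, red roses.\nRoses are rarely blue!\n")
for word in iter_unique_words(fake_file, "r"):
    print(word)
# prints: red, roses, rarely (one per line)
```

With a real file, the `with open('text.txt', 'r') as my_file:` handle could be passed in place of `fake_file`.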
Dalvenjia