I'm trying to make a program that filters IMDb Top 250 TV shows. The idea is that I have a list already filled with dictionaries containing different attributes of each show (title, year, rating, etc.) and with each filter I remove show dictionaries that don't fit the criteria. I'm having trouble with my cast filter. I'm trying to use a for-loop to check each show dictionary in the list in order to decide whether to remove it, but the loop appears to skip a lot of shows, from one or two to chunks at a time.

`recs` starts as an empty list, and I fill it with the shows scraped using BeautifulSoup:

recs=[]
from bs4 import BeautifulSoup
import requests
import re


# DOWNLOADING TOP TV SHOW DATA
url = 'http://www.imdb.com/chart/toptv'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')

shows = soup.select('td.titleColumn')
links = [a.attrs.get('href') for a in soup.select('td.titleColumn a')]
crew = [a.attrs.get('title') for a in soup.select('td.titleColumn a')]

ratings = [b.attrs.get('data-value')
        for b in soup.select('td.posterColumn span[name=ir]')]


#dictionaries for each show attribute
show_year={}
show_cast={}
show_rating={}

# each show is a dictionary of its details
for index in range(0, len(shows)):
    
    
    movie_string = shows[index].get_text()
    movie = (' '.join(movie_string.split()).replace('.', ''))
    movie_title = movie[len(str(index + 1))+1:-7]  # rank shown on the page is 1-based, index is 0-based
    
    show_year[movie_title] = int(re.search(r'\((.*?)\)', movie_string).group(1)) #YEAR (raw string avoids invalid-escape warnings)
    show_cast[movie_title] = crew[index] #CAST
    show_rating[movie_title] = float(ratings[index][:3]) #RATING
    
    recs.append({"title":movie_title,"year":show_year[movie_title],"cast":show_cast[movie_title],"rating":show_rating[movie_title]})

This is just a helper function that prints `recs` in a readable form:

def printList(d):
    print("RECOMMENDATIONS-----------------------------------------------------------")
    for i in d:
        print(i)
    print("--------------------------------------------------------------------------")

This is where I'm running into trouble. I'm not sure what the problem is because I think that the for-loop looks straightforward enough.

searchCast = input("Who was in your show? If you don't know, press Enter ")
searchCast = searchCast.lower() #get rid of case sensitivity

if searchCast:

    for show in recs:
        checkCast = show["cast"].lower() #the cast names from each show already in recs
        print("checking if",searchCast,"is in ",checkCast)

        if searchCast not in checkCast:
            print("removing:",show["title"])
            recs.remove(show)

        else:
            print("confirming that ",searchCast,"is in ",checkCast)

    printList(recs)
  • See the linked post for how to do this properly. The reason, however, is that `remove`ing from a list while iterating over it messes with the iterator. You should instead create a new list that skips the elements you don't want. Does anyone know of a better duplicate post than that? It doesn't address these questions as well, but it gives the correct way. – Carcigenicate Apr 27 '22 at 23:15
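
To make the comment's point concrete, here is a minimal sketch of both the skipping behaviour and the suggested fix. The four sample shows and cast names are made up for illustration (they are not from the IMDb data), and `search_cast`, `buggy`, `visited` and `filtered` are hypothetical names standing in for the question's variables.

```python
# Hypothetical sample data standing in for the scraped `recs` list.
recs = [
    {"title": "Show A", "cast": "Alice Smith, Bob Jones"},
    {"title": "Show B", "cast": "Carol White, Dan Brown"},
    {"title": "Show C", "cast": "Erin Green, Frank Black"},
    {"title": "Show D", "cast": "Alice Smith, Gina Gray"},
]
search_cast = "alice"

# Problem pattern from the question: removing an item while iterating
# shifts the remaining items left, so the item right after each removed
# one is never visited by the loop.
buggy = list(recs)   # work on a copy so `recs` stays intact for the fix below
visited = []         # records which shows the loop actually looked at
for show in buggy:
    visited.append(show["title"])
    if search_cast not in show["cast"].lower():
        buggy.remove(show)

print(visited)                        # "Show C" is missing from this list
print([s["title"] for s in buggy])    # "Show C" was never checked, so it stays

# Suggested fix: build a new list that keeps only the matching shows.
filtered = [show for show in recs if search_cast in show["cast"].lower()]
print([s["title"] for s in filtered])
```

If other code holds a reference to the original list, assigning with `recs[:] = filtered` replaces its contents in place rather than rebinding the name.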
