I'm trying to make a program that filters the IMDb Top 250 TV shows. The idea is that I have a list already filled with dictionaries, each holding the attributes of one show (title, year, rating, etc.), and with each filter I remove the show dictionaries that don't fit the criteria. I'm having trouble with my cast filter. I use a for-loop to check each show dictionary in the list and decide whether to remove it, but the loop appears to skip a lot of shows, anywhere from one or two to whole chunks at a time.
recs starts as an empty list, and I fill it with the shows I scrape with BeautifulSoup:
recs=[]
from bs4 import BeautifulSoup
import requests
import re
# DOWNLOADING TOP TV SHOW DATA
url = 'http://www.imdb.com/chart/toptv'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')
shows = soup.select('td.titleColumn')
links = [a.attrs.get('href') for a in soup.select('td.titleColumn a')]
crew = [a.attrs.get('title') for a in soup.select('td.titleColumn a')]
ratings = [b.attrs.get('data-value')
           for b in soup.select('td.posterColumn span[name=ir]')]
#dictionaries for each show attribute
show_year={}
show_cast={}
show_rating={}
# each show is a dictionary of its details
for index in range(0, len(shows)):
    movie_string = shows[index].get_text()
    movie = (' '.join(movie_string.split()).replace('.', ''))
    movie_title = movie[len(str(index))+1:-7]
    show_year[movie_title] = int(re.search(r'\((.*?)\)', movie_string).group(1)) #YEAR
    show_cast[movie_title] = crew[index] #CAST
    show_rating[movie_title] = float(ratings[index][:3]) #RATING
    recs.append({"title":movie_title,"year":show_year[movie_title],"cast":show_cast[movie_title],"rating":show_rating[movie_title]})
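For reference, here is a minimal sketch of what that parsing step does to a single cell. It assumes the cell text looks like the hard-coded sample below; the sample string and the parse_cell helper are made up for illustration and are not taken from the live page.

```python
import re

def parse_cell(cell_text, index):
    """Hypothetical helper mirroring the loop body for one table cell."""
    # Collapse all whitespace and drop periods: "1 Planet Earth II (2016)"
    flat = ' '.join(cell_text.split()).replace('.', '')
    # Skip the rank prefix plus one space, and the trailing " (YYYY)" (7 characters)
    title = flat[len(str(index)) + 1:-7]
    # The year is whatever sits between the first pair of parentheses
    year = int(re.search(r'\((.*?)\)', cell_text).group(1))
    return title, year

# Assumed shape of the text returned by get_text() for one cell
sample = "\n      1.\n      Planet Earth II\n      (2016)\n"
title, year = parse_cell(sample, 0)
print(title, year)
```

Note that the replace('.', '') call also strips any periods that are part of the title itself.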
This is just a helper function that prints recs in a readable form:
def printList(d):
    print("RECOMMENDATIONS-----------------------------------------------------------")
    for i in d:
        print(i)
    print("--------------------------------------------------------------------------")
This is where I'm running into trouble. I'm not sure what the problem is because I think that the for-loop looks straightforward enough.
searchCast = input("Who was in your show? If you don't know, press Enter ") #might use .lower() #make lower for ease of searching
searchCast = searchCast.lower() #get rid of case sensitivity
if searchCast:
    for show in recs:
        checkCast = show["cast"].lower() #the cast names from each show already in recs
        print("checking if",searchCast,"is in ",checkCast)
        if searchCast not in checkCast:
            print("removing:",show["title"])
            recs.remove(show)
        else:
            print("confirming that ",searchCast,"is in ",checkCast)
printList(recs)
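To rule out the scraping as the cause, here is a minimal sketch that reproduces the skipping with hard-coded data. The four shows and cast names are invented for illustration, and make_recs is a hypothetical helper. It also shows one alternative that seems to avoid the problem: building a new list instead of calling remove() on the list being iterated.

```python
def make_recs():
    """Hypothetical stand-in for the scraped recs list."""
    return [
        {"title": "Show A", "cast": "Alice Smith, Bob Jones"},
        {"title": "Show B", "cast": "Carol White, Dan Brown"},
        {"title": "Show C", "cast": "Erin Black, Frank Green"},
        {"title": "Show D", "cast": "Alice Smith, Gina Gray"},
    ]

search_cast = "alice"

# Same pattern as the cast filter: remove from the list while looping over it.
recs = make_recs()
visited = []
for show in recs:
    visited.append(show["title"])
    if search_cast not in show["cast"].lower():
        recs.remove(show)  # later items shift left, so the next one is never visited
buggy_titles = [show["title"] for show in recs]

# Alternative: leave the original list alone and build a filtered copy.
filtered = [show for show in make_recs() if search_cast in show["cast"].lower()]
filtered_titles = [show["title"] for show in filtered]

print("visited: ", visited)
print("in place:", buggy_titles)
print("new list:", filtered_titles)
```

In this run the loop never visits Show C: removing Show B shifts Show C into the position the loop has already passed, so Show C stays in the list even though it doesn't match. Iterating over a copy (for show in recs[:]) would be another way to keep the in-place remove().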