
Basically, I have converted a tab-delimited txt file into a list containing one list per book (title, author, publisher, etc.), and I have figured out how to search for something using indexes. But how can I make the search return anything that matches, even partially?

import csv
import itertools

list_of_books = list(csv.reader(open('bestsellers.txt','rb'), delimiter='\t'))

search = 'Tom Clancy'
for sublist in list_of_books:
    if sublist[1] == search:
        print sublist

E.g., instead of having to search for 'Tom Clancy', someone could enter 'clancy' and still get all the Tom Clancy novels.

Thanks.

sharkman

2 Answers


I think this achieves what you're looking for:

search = 'Tom Clancy'
for sublist in list_of_books:
    if search in sublist[1]:
        print sublist

UPDATE:

I think you'll want to convert both strings to lower case too, like this:

if search.lower() in sublist[1].lower():
grc

This depends a bit on exactly what you mean by partially.

First definition: the search term should match exactly, but it can match at any point in the string. This is probably almost what you mean. In this case you really want to check whether the author field (sublist[1]) contains the search term. For this, use Python's in operator:

if search in sublist[1]:
    print sublist

Because a substring check has to scan the string rather than just compare it for equality, this will be anywhere from slightly to much slower. I doubt it matters to you.

Second definition: the same as the first, but case doesn't matter. In this case you want to normalize the case, that is, ignore the difference between upper and lower case by converting both strings to the same case with Python's lower (or upper) string methods.

search = 'Tom Clancy'
search_lower = search.lower() # lower-case the search term once, outside the loop
for sublist in list_of_books:
    # since strings are immutable, sublist[1].lower() creates a new lower-cased
    # string to be compared against search_lower. sublist[1] doesn't get modified
    if search_lower in sublist[1].lower():
        print sublist

That's probably what you want.
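
As a rough sketch, the second definition could be wrapped in a small reusable function. This is written in Python 3 syntax (print as a function); the names find_books and column are hypothetical, and the sample rows are made up just to have something to search.

```python
def find_books(books, term, column=1):
    """Return the rows whose `column` field contains `term`, ignoring case."""
    term_lower = term.lower()  # lower-case the search term once
    return [row for row in books
            # skip short rows so a missing field doesn't raise IndexError
            if len(row) > column and term_lower in row[column].lower()]


# Made-up sample rows in (title, author, publisher) order
sample_books = [
    ['The Hunt for Red October', 'Tom Clancy', 'Example Press'],
    ['Patriot Games', 'Tom Clancy', 'Example Press'],
    ['It', 'Stephen King', 'Example Press'],
]

for row in find_books(sample_books, 'clancy'):
    print(row)
```

Passing column=0 would search titles instead of authors, assuming the same column order as above.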

There's a third definition, which is "fuzzy matching". If you accept fuzzy matches, clincy might match Clancy. Heck, if the search is fuzzy enough, tom can match Clancy. That's a whole 'nuther can of worms. Luckily, the answers to this Stack Overflow question list a whole bunch of libraries that can help with it.
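
For a taste of the fuzzy approach without installing anything, here is a minimal sketch using difflib from the standard library, which is not necessarily one of the libraries from that question. It is Python 3 syntax; the function name fuzzy_find_books, the column parameter, the 0.8 cutoff and the sample rows are all assumptions made for illustration.

```python
import difflib


def fuzzy_find_books(books, term, column=1, cutoff=0.8):
    """Return rows where some word in the `column` field is close to `term`."""
    term_lower = term.lower()
    matches = []
    for row in books:
        if len(row) <= column:
            continue  # skip rows that are missing the field
        # compare the term against each word of the field separately
        words = row[column].lower().split()
        # get_close_matches returns the words whose similarity ratio >= cutoff
        if difflib.get_close_matches(term_lower, words, n=1, cutoff=cutoff):
            matches.append(row)
    return matches


# Made-up sample rows in (title, author, publisher) order
sample_books = [
    ['The Hunt for Red October', 'Tom Clancy', 'Example Press'],
    ['Patriot Games', 'Tom Clancy', 'Example Press'],
    ['It', 'Stephen King', 'Example Press'],
]

for row in fuzzy_find_books(sample_books, 'clincy'):
    print(row)
```

Lowering the cutoff makes the search more forgiving but also lets in more false positives, so the right value depends on your data.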

quodlibetor