
I'm trying to locate search boxes on websites using the Mechanize python package to find forms on web pages. Pretty much every website defines these forms in their own way so I need to search for a bunch of different signatures. Because the Mechanize Browser.select_form function throws an exception whenever it fails to find the specified form, looking for a lot of different forms turns into a long list of try and except statements.

The first thing I tried (or rather drifted into) is the following structure. It works, but (1) it doesn't look very good, (2) it scales badly (if I need even more statements this turns into chaos) and (3) overall it just seems like bad code.

from mechanize import Browser
br = Browser()
br.open(url)
try:
    br.select_form(id=lambda x: 'search' in x)
except Exception:
    try:
        br.select_form(class_=lambda x: 'search' in x)
    except Exception:
        try:
            br.select_form(action=lambda x: 'search' in x)
        except Exception:
            try:
                br.select_form(role=lambda x: 'search' in x)
            except Exception:
                print('NOTHING FOUND')
                pass

A possibly slightly better solution would be to direct the except clauses to functions, as in https://stackoverflow.com/a/6095782/11309912. This would solve the sideways expansion but would still involve a lot of repeated code.

To me the ideal solution would be to have a list of statements I could iterate over until one type of form was found. A very crude example would be:

forms = ["id=lambda x: 'search' in x", "class_=lambda x: 'search' in x", ...]
for form in forms:
    try:
        br.select_form(form)
        break
    except Exception:
        pass

Is something similar to this possible?

SB18

3 Answers


The only thing that's variable there is the name of the keyword argument passed to select_form, and you can pass variable keywords like this:

for attr in ('id', 'class_', 'action', 'role'):
    try:
        form = br.select_form(**{attr: lambda x: 'search' in x})
        break
    except:
        pass
else:
    print('NOTHING FOUND')
deceze
  • Is it not bad to handle a bare `Exception` instead of the right exception, something like `except ValueError:` (for ex)? – Austin Jun 05 '19 at 08:25
  • Absolutely, you *should* catch a specific error there. OP is catching `Exception`, which is just as bad, so I omitted it entirely. I'm not sure what exceptions `select_form` will raise exactly, so can't fill in that part. – deceze Jun 05 '19 at 08:26
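
The point raised in these comments can be sketched roughly as follows: swallow only the exception that means "no such form" and let everything else propagate. This is a minimal sketch, not the answerer's code. `select_first_form`, `FakeBrowser` and `FormNotFound` are hypothetical names; `FakeBrowser` only mimics the `select_form` call so the snippet runs without mechanize or a network connection.

```python
class FormNotFound(Exception):
    """Stand-in for the library's 'no matching form' error (hypothetical)."""


class FakeBrowser:
    """Hypothetical stub that mimics only the select_form call used above."""

    def __init__(self, form_attrs):
        self.form_attrs = form_attrs  # attributes of the page's single form
        self.form = None

    def select_form(self, **attrs):
        for name, predicate in attrs.items():
            # 'class_' is looked up as 'class' in this stub
            value = self.form_attrs.get(name.rstrip('_'))
            if value is None or not predicate(value):
                raise FormNotFound(name)
        self.form = self.form_attrs


def select_first_form(br, attr_names, predicate, not_found=FormNotFound):
    """Try each attribute name in turn; return the one that matched, or None."""
    for attr in attr_names:
        try:
            br.select_form(**{attr: predicate})
        except not_found:
            continue  # only the expected failure is swallowed
        return attr
    return None


br = FakeBrowser({'action': '/site/search', 'role': 'form'})
matched = select_first_form(
    br, ('id', 'class_', 'action', 'role'), lambda x: 'search' in x
)
print(matched or 'NOTHING FOUND')
```

With a real `mechanize.Browser`, the exception to pass as `not_found` is probably `mechanize.FormNotFoundError`, but that is an assumption worth checking against the installed version.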

I'm not sure about mechanize, but I know it's possible with Selenium, more or less exactly like the example you gave. I won't use a lambda in this example; it has the same effect, just a bit slower. Assume `driver` is the variable holding my browser instance.

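listOfPossibleFields = ["user", "username", "un", "name", "login"]
field = None
for word in listOfPossibleFields:
    try:
        # Selenium 3 API; Selenium 4 uses driver.find_element(By.NAME, word)
        field = driver.find_element_by_name(word)
        break  # stop at the first name that matches
    except Exception:
        pass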
Arne
MegaEmailman

If you want something more generic, you can create a class for each of the search patterns and then iterate through a list of instances of those classes.

class IdSearchPattern(object):
    def search(self, *args, **kwargs):
        ...

class RoleSearchPattern(object):
    def search(self, *args, **kwargs):
        ...

search_patterns = [IdSearchPattern(), RoleSearchPattern()]
for sp in search_patterns:
    try:
        result = sp.search()
        break
    except Exception:
        pass

Sometimes this is a good solution and sometimes it is a bit over-engineered.

Note: I wrote this answer from my phone, code is not tested.
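
Where one class per pattern feels like too much, the same loop also works over plain callables. The following is a rough sketch of that variant, not part of the original answer: `first_success`, `by_attr` and `form_attrs` are made-up names for illustration, and the dictionary lookup stands in for whatever call actually performs the search (with mechanize, each attempt would wrap a `br.select_form(...)` call).

```python
def first_success(attempts, expected=(Exception,)):
    """Call each zero-argument callable until one returns without raising."""
    for attempt in attempts:
        try:
            return attempt()
        except expected:
            continue  # this pattern did not match, try the next one
    raise LookupError('no attempt succeeded')


form_attrs = {'role': 'search'}  # made-up form attributes for illustration


def by_attr(name):
    """Build an attempt that raises KeyError unless `name` contains 'search'."""
    def attempt():
        value = form_attrs[name]  # KeyError if the attribute is absent
        if 'search' not in value:
            raise KeyError(name)
        return name
    return attempt


attempts = [by_attr('id'), by_attr('class'), by_attr('role')]
found = first_success(attempts, expected=(KeyError,))
print(found)
```

Passing the expected exception types explicitly keeps unrelated errors visible instead of silently skipping them.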

LazyGoose