Python regex - multiple search

Question

Here is what I'm trying to accomplish:

And the extracted code:

m = re.search('<td>(?P<alt>\d+)', response.read())
...
m = re.search('<td>(?P<alt>\w+)', response.read())
print m.group('alt')

I'm getting:

AttributeError: 'NoneType' object has no attribute 'group'

If I comment out the second search, everything is fine. I don't understand this behaviour.

Searching for this error led me to this Stack Overflow question and to this one, but to no avail: neither of them solved my problem.

I don't care about efficiency here, so I don't use re.compile.

What is the unfiltered result of each response.read()? I'm betting the second read isn't returning what you expect. — cmaynard, Feb 07 '11 at 17:38
Could you add some more details about what you are trying to do by calling re.search twice? The current example code makes no sense. — shang, Feb 07 '11 at 17:45
@kramthegram - thanks! You're right. It wasn't a regex issue. @shang - because response.read() changes between these 2 lines - see the second point of my question. — laszchamachla, Feb 07 '11 at 17:48

Reiner Gerecke · Accepted Answer · 2011-02-07T17:50:24.150

Assuming response is a file-like object, calling read a second time might return an empty string, because the first call already consumed the stream. Read it once and search the stored data instead:

import re

data = response.read()  # read once; a second read() may return an empty string
m = re.search(r'<td>(?P<alt>\d\d*)', data)
m = re.search(r'<td>(?P<alt>\d\d*)', data)
print(m.group('alt'))

Why would you call search multiple times?
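
The point about the consumed stream can be reproduced without a network request. The following is a minimal sketch, assuming Python 3 and using io.StringIO as a hypothetical stand-in for the response object from the question; the sample HTML string is made up for illustration.

```python
import io
import re

# Hypothetical stand-in for the HTTP response; io.StringIO is file-like.
response = io.StringIO("<tr><td>1234</td></tr>")

first = response.read()   # consumes the whole stream
second = response.read()  # nothing left to read, so this is an empty string

pattern = r"<td>(?P<alt>\d+)"
m1 = re.search(pattern, first)
m2 = re.search(pattern, second)

print(m1.group("alt"))  # the digits captured from the first read
print(m2)               # no match object, because the input was empty
```

Calling m2.group('alt') here would raise the same AttributeError as in the question, since re.search returns None when nothing matches.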

edited Feb 07 '11 at 17:50

answered Feb 07 '11 at 17:38

Reiner Gerecke

You're right - thanks! So it wasn't a regex issue. My mistake. I would like to call search multiple times, because data might change between these two lines (second point of my question). – laszchamachla Feb 07 '11 at 17:48
@laszchamachla In that case, I don't see how this is any help. If I understand you correctly, you're getting page A, search on its data, in case of no matches, you do a new request and search on that data. There shouldn't be a problem if between two searches, you issue a new request and get a new response. – Reiner Gerecke Feb 07 '11 at 17:55
@Reiner - exactly, it is pretty strange to me too. But, as you advised, assigning response.read() to a variable before every search solves the problem. – laszchamachla Feb 07 '11 at 18:03
Also I'd suggest compiling the regex once: `rx = re.compile('(?P<alt>\d\d*)')` and then re-using it wherever needed: `m = rx.search(data)`. – 9000 Feb 07 '11 at 18:04
@9000 - I wrote: "I don't care about efficiency here so I don't use compile." - it is not the point in this case, but thanks for your suggestion. – laszchamachla Feb 07 '11 at 18:06
@laszchamachla: besides efficiency, there's maintainability: you only need to change the regexp once if you find a bug in it. but you can just use a string constant, of course :) – 9000 Feb 07 '11 at 18:27
@9000 - Thanks, I know that :) It was only an example - in fact I use different regexes for the aforementioned searches. – laszchamachla Feb 07 '11 at 18:33
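
The compile suggestion from the comments above can be combined with a check for a failed match. This is only a sketch, assuming Python 3; the names TD_NUMBER and first_number are hypothetical and not from the original posts, and the pattern is the one used in the answer.

```python
import re

# Compiled once and reused; the pattern is the one from the answer.
TD_NUMBER = re.compile(r"<td>(?P<alt>\d\d*)")


def first_number(data):
    """Return the first <td> number found in data, or None if there is none."""
    m = TD_NUMBER.search(data)
    # search() returns None on no match, so guard before calling group().
    return m.group("alt") if m else None


print(first_number("<td>42</td>"))
print(first_number(""))
```

Checking the match object before calling group avoids the AttributeError from the question when the searched text is empty or contains no match.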
