Having difficulty understanding some code in Python

Question

I've tested parts of my code and can't figure out why the snippet below doesn't behave the way I expect. This is the code:

search_for_term = re.findall(r'<td class="kx o_\d.*data-bookmaker', doc)

This is the output of the search_for_term variable:

['<td class="kx o_1 winner" data-bookmaker',
 '<td class="kx o_0" data-bookmaker',
 '<td class="kx o_2" data-bookmaker']

Now I'm trying to find out whether any of these strings contains the word "winner". The code is shown below.

winner_ids = np.where([re.findall('winner', item) for item in search_for_term])

And this is the code that confuses me:

if not all(winner_ids):
    print("no winner")
else:
    print("winner does exist")

The output I get is "no winner". Can somebody explain this to me? I would be more than grateful.

What exactly do you not understand? Why did you expect to get something else as output? — mkrieger1, Nov 26 '19 at 00:05
Well, should it display winner does exist because winner is located in winner_ids judging by the array ? — newnick988888, Nov 26 '19 at 00:16
What if you print `winner_ids` to find out if your assumption is correct? — mkrieger1, Nov 26 '19 at 00:19
Please include all relevant code and data. See: [mcve]. It would be particularly useful here since some of these design choices seem odd. Also, it looks like you're using RegEx to parse HTML. I'm guessing you haven't seen [this legendary answer](https://stackoverflow.com/a/1732454/11301900) yet. — AMC, Nov 26 '19 at 00:23

Dan D. · Answer 1 · 2019-11-26T00:37:27.080

0

Don't do this; see "RegEx match open tags except XHTML self-contained tags".

Use Beautiful Soup instead.

Then, after `from bs4 import BeautifulSoup`, you can simply use:

winner = BeautifulSoup(doc, 'html.parser').select_one('td.winner[data-bookmaker]')

And the condition becomes:

if winner:
    print("winner exists")
else:
    print("no winner")

Reply to comment: You extract a copy of the DOM and feed it to this.

You already are extracting the DOM as HTML and parsing it via RE. You might as well simply feed it to bs4.
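
If installing bs4 is not an option, roughly the same check can be sketched with the standard library's html.parser instead of a regex. This swaps in a different parser from the one recommended above. It assumes `doc` is the page HTML as a string (for example Selenium's `driver.page_source`). `WinnerCellFinder`, `has_winner`, and the `sample` markup are hypothetical names and data made up for illustration.

```python
from html.parser import HTMLParser


class WinnerCellFinder(HTMLParser):
    """Flags whether any <td> has a data-bookmaker attribute and the class "winner"."""

    def __init__(self):
        super().__init__()
        self.winner_found = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # The class attribute may be absent or valueless, hence the "or ''"
        classes = (attr_map.get("class") or "").split()
        if tag == "td" and "data-bookmaker" in attr_map and "winner" in classes:
            self.winner_found = True


def has_winner(html):
    """Return True if the HTML contains a winner cell (hypothetical helper)."""
    finder = WinnerCellFinder()
    finder.feed(html)
    return finder.winner_found


# Hypothetical sample markup modeled on the strings shown in the question
sample = (
    '<table><tr>'
    '<td class="kx o_1 winner" data-bookmaker="a">1.5</td>'
    '<td class="kx o_0" data-bookmaker="b">2.0</td>'
    '</tr></table>'
)

print("winner exists" if has_winner(sample) else "no winner")
```

Matching on the parsed class list rather than on raw text means a cell whose class merely contains "winner" as a substring of another class name is not counted.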

edited Nov 26 '19 at 00:37

answered Nov 26 '19 at 00:25

Dan D.


I cannot use Beautiful Soup because the website does not allow me to. I can only use Selenium. – newnick988888 Nov 26 '19 at 00:27

score 0 · Answer 2 · answered Nov 26 '19 at 00:25

0

I think you need:

winner_ids = np.where([re.findall('.*winner.*', item) for item in search_for_term])

answered Nov 26 '19 at 00:25

Peter Moore


It didn't help me, but I solved the problem by using `is None` in the `if` condition. – newnick988888 Nov 26 '19 at 01:04
If you post your answer, it may clarify things. Thanks. – Peter Moore Nov 26 '19 at 02:52
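
A likely explanation for the "no winner" output, sketched here in plain Python rather than NumPy: `np.where` appears to return the positions of the matches, and the only match sits at index 0. `all()` then tests the truthiness of that index value, not whether a match was found, and 0 is falsy. The list comprehension below is a stand-in for the `np.where` call, not the original code, and it reuses the strings from the question.

```python
search_for_term = [
    '<td class="kx o_1 winner" data-bookmaker',
    '<td class="kx o_0" data-bookmaker',
    '<td class="kx o_2" data-bookmaker',
]

# Indices of the strings that contain "winner" (plain-Python stand-in for np.where)
winner_ids = [i for i, item in enumerate(search_for_term) if "winner" in item]
print(winner_ids)       # [0]

# Pitfall: all() checks the index values themselves, and index 0 is falsy
print(all(winner_ids))  # False

# Checking whether any index was found at all gives the intended result
if winner_ids:
    print("winner does exist")
else:
    print("no winner")
```

With this check the snippet prints "winner does exist". If the winner had been in the second or third string, the original `all()` test would have happened to pass, which is what makes the bug easy to miss.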
