
I am trying to scrape data from a website, without success. In the terminal I can run a for loop that prints a single list without any problem. The problem comes when I try to use two variables in a for loop. I have tried zip, but it doesn't seem to work. Since I didn't know how to use zip, I checked other questions on Stack Overflow, but nothing seems to apply to my case. This is the code I came up with:

browser = webdriver.Chrome("C:\webdrivers\chromedriver.exe")
browser.get("https://www.worldometers.info/coronavirus/")
countries = browser.find_elements_by_tag_name("mt_a")
cases = browser.find_elements_by_tag_name("sorting_1")
[print(i.text, '-', j.text) for i, j in zip(countries, cases)]

When I ran the program, both from my IDE and from the terminal, nothing happened. Can anyone please help me solve this issue? All help appreciated.

no jif
  • There are a couple of possible issues here. 1: Unless this is being run in a REPL, you need to explicitly `print` out the results of the last line. 2: If either `cases` or `countries` is empty, there won't be anything to iterate over, and the last line won't do anything. Double check your data. – Carcigenicate Apr 29 '20 at 17:52
  • Please do not use a list comprehension for its side effects. Comprehensions are there to build lists, not to replace simple for loops that print something. See [is-it-pythonic-to-use-list-comprehensions-for-just-side-effects](https://stackoverflow.com/questions/5753597/is-it-pythonic-to-use-list-comprehensions-for-just-side-effects) – Patrick Artner Apr 29 '20 at 18:37
  • Debug your code: `print(list(cases))` and `print(list(countries))`. If either is empty, nothing will be done, as zip() only works up to the length of the shortest list. – Patrick Artner Apr 29 '20 at 18:38

2 Answers


You should first test with easier data, like list1 = ['a', 'b'] and list2 = [11, 22].

How about:

list1 = ['a', 'b']
list2 = [11, 22]
for i, j in zip(list1, list2):
    print(i, j)

Then, I'm not sure what you are expecting from the print inside the list comprehension. print returns None, so:

foo = [print(i, j) for i, j in zip(list1, list2)]
print('foo =', foo)

prints the following:

a 11
b 22
foo = [None, None]
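
The same kind of test can also reproduce the "nothing happened" symptom from the question. This is a minimal sketch that assumes one of the two lookups matched no elements and therefore gave back an empty list; the sample values are placeholders, not real scraped data.

```python
# Assumption: the first lookup matched nothing, so it is an empty list.
countries = []
cases = ["1,049,431", "236,899"]  # placeholder values for illustration

# zip stops as soon as the shortest input is exhausted,
# so an empty input means there are no pairs at all.
pairs = list(zip(countries, cases))
print("number of pairs:", len(pairs))

for country, case in zip(countries, cases):
    # This body never runs, which is why nothing is printed.
    print(country, "-", case)
```

Printing the length of each list before zipping them shows quickly whether the lookups are the real problem.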
Keldorn

You probably need to use the find_elements_by_class_name method instead of find_elements_by_tag_name, because mt_a and sorting_1 are class names, not tag names.

Here are some parts of the HTML from https://www.worldometers.info/coronavirus/ page

<td style="font-weight: bold; font-size:15px; text-align:left;">
  <a class="mt_a" href="country/us/">USA</a>
</td>
...
<td style="font-weight: bold; text-align:right" class="sorting_1">1,049,431</td>

Tags are: td, a

Classes are: mt_a, sorting_1
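
The difference between a tag and a class can be checked without a browser. The sketch below does not use Selenium; it swaps in Python's standard html.parser and runs it on the fragment quoted above. ClassTextCollector and texts_by_class are hypothetical helper names made up for this illustration.

```python
from html.parser import HTMLParser

# Fragment copied from this answer (the "..." between the rows is left out).
HTML = """
<td style="font-weight: bold; font-size:15px; text-align:left;">
  <a class="mt_a" href="country/us/">USA</a>
</td>
<td style="font-weight: bold; text-align:right" class="sorting_1">1,049,431</td>
"""


class ClassTextCollector(HTMLParser):
    """Hypothetical helper: collects the text of elements with a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0           # greater than 0 while inside a matching element
        self.texts = []          # one entry per matching element
        self.tags_seen = set()   # every tag name found in the fragment

    def handle_starttag(self, tag, attrs):
        self.tags_seen.add(tag)
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.wanted_class in classes:
            self.depth += 1
            if self.depth == 1:
                self.texts.append("")

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.texts[-1] += data


def texts_by_class(html, class_name):
    """Return (texts of elements with class_name, set of tag names seen)."""
    parser = ClassTextCollector(class_name)
    parser.feed(html)
    return [text.strip() for text in parser.texts], parser.tags_seen


countries, tags = texts_by_class(HTML, "mt_a")
cases, _ = texts_by_class(HTML, "sorting_1")

print("tags in fragment:", sorted(tags))  # no tag is named mt_a or sorting_1
for country, case in zip(countries, cases):
    print(country, "-", case)
```

With Selenium itself, the class-based lookup would be find_elements_by_class_name("mt_a") in the Selenium 3 style used in the question, or find_elements(By.CLASS_NAME, "mt_a") in newer releases.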

Serhii Kostel