The pages are basically JPEGs that open when I click on the window. So far, I have been able to parse the website and collect all the page links in one list. Now I want to open the JPGs and download them, but I am not sure how to deal with pop-up windows.

import requests
from bs4 import BeautifulSoup

r = requests.get('http://www.assamtribune.com/scripts/at.asp?id=mar0217/Page6')
c = r.content
soup = BeautifulSoup(c, 'lxml')
maximusdooku
BeautifulSoup is for parsing HTML; it doesn't display images. A simple way to display images is with the 3rd-party Pillow library, which is the modern fork of the old PIL library. See http://pillow.readthedocs.io/en/3.1.x/index.html and http://pillow.readthedocs.io/en/3.1.x/reference/Image.html#PIL.Image.Image.show – PM 2Ring Mar 06 '17 at 10:17

2 Answers

You can't open popups with BeautifulSoup. BeautifulSoup is for parsing pages, not for emulating clicks or opening windows.

What you can do is follow the responses until you reach the image that you want.

Note this:

1) You request the url

2) There is an iframe which triggers another request. Check the iframe src: if you put that link in your browser's address bar, it opens that page directly.

3) The page requested in the frame is an HTML file. That's not what you want. You want the image. Check the source and you'll see that the right-hand part of the direct link to the image is similar to the frame src link.

4) Use requests to request the page and download the file.

Check this example code (I've started at point 2 in the above list).

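from bs4 import BeautifulSoup
import requests
import os

# Request the page that the iframe loads (point 2 in the list above)
r = requests.get('http://www.assamtribune.com/scripts/PageAT.asp?id=2017/mar0217/Page6')
soup = BeautifulSoup(r.content, 'lxml')

# Drop the leading "../" from the relative image path
image = soup.find("img")["src"][3:]

# The full-size image uses "BigPage" in place of "Page" in its path
r = requests.get("http://www.assamtribune.com/%s" % image.replace("Page", "BigPage"))
if r.status_code == 200:
    # Save into the current directory under the image's file name
    with open(os.path.join(os.getcwd(), image.split("/")[-1]), 'wb') as f:
        f.write(r.content)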
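
The `[3:]` slice assumes the `src` always starts with `../`. A slightly more defensive sketch resolves the relative path with `urllib.parse.urljoin` instead. The `img_src` value below is a made-up example of what the page might contain, and `big_image_url` is a hypothetical helper name, not something from the site or from the code above. It also assumes the large image differs only by a `Big` prefix on the file name, whereas the code above replaces every occurrence of `Page` in the path.

```python
import posixpath
from urllib.parse import urljoin, urlsplit, urlunsplit


def big_image_url(page_url, img_src):
    """Resolve a relative img src against the page URL and
    point its file name at the assumed BigPage variant."""
    absolute = urljoin(page_url, img_src)
    parts = urlsplit(absolute)
    directory, filename = posixpath.split(parts.path)
    # Rename only the file itself, e.g. Page6.jpg -> BigPage6.jpg
    if filename.startswith("Page"):
        filename = "Big" + filename
    return urlunsplit(parts._replace(path=posixpath.join(directory, filename)))


page_url = "http://www.assamtribune.com/scripts/PageAT.asp?id=2017/mar0217/Page6"
img_src = "../2017/mar0217/Page6.jpg"  # assumed example value, not taken from the site
print(big_image_url(page_url, img_src))
```

The resulting URL can be passed to `requests.get` exactly as in the code above.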

I'll let you find the frame src and connect that into the code I provided. Have fun coding!
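
For reference, one possible way to do that missing step is sketched below. It uses only the standard library's `html.parser` so it runs without extra installs; with BeautifulSoup, `soup.find("iframe")["src"]` would give the same raw value. The `sample_html` string is a made-up stand-in for the outer page, and `IframeSrcFinder` and `iframe_urls` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeSrcFinder(HTMLParser):
    """Collect the src attribute of every <iframe> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def iframe_urls(html, page_url):
    """Return absolute URLs for all iframes found in html."""
    finder = IframeSrcFinder()
    finder.feed(html)
    return [urljoin(page_url, src) for src in finder.sources]


# Made-up stand-in for the HTML of the outer page
sample_html = (
    '<html><body>'
    '<iframe src="PageAT.asp?id=2017/mar0217/Page6"></iframe>'
    '</body></html>'
)
outer_url = "http://www.assamtribune.com/scripts/at.asp?id=mar0217/Page6"
print(iframe_urls(sample_html, outer_url))
```

In real use, `sample_html` would be `requests.get(outer_url).text`, and each returned URL would be fed into the download code above.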

Zroq

I believe BeautifulSoup won't help you here, but you could try the selenium module. Try

driver.switch_to.window("windowName")

(Older Selenium releases spelled this driver.switch_to_window.) But there are caveats with navigating pop-ups. See this stack post.

Selenium is documented here.
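
When the pop-up's window name is not known, a common pattern is to compare the window handles before and after the click. The sketch below assumes `driver` is a Selenium WebDriver, relying only on its `window_handles` attribute and `switch_to.window()` method; `switch_to_new_window` is a hypothetical helper name.

```python
def switch_to_new_window(driver, known_handles):
    """Switch to the first window handle that is not in known_handles.

    Returns the handle switched to, or None if no new window exists.
    """
    for handle in driver.window_handles:
        if handle not in known_handles:
            driver.switch_to.window(handle)
            return handle
    return None


# Intended usage with a real WebDriver (not executed here):
#   before = set(driver.window_handles)
#   ...click the element that opens the pop-up...
#   switch_to_new_window(driver, before)
```

The pop-up may take a moment to appear, so in practice the call might need to be retried or wrapped in a wait.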

Jean Zombie