
I am playing around with web scraping and Tor.

I managed to make it work with both requests and Selenium + PhantomJS. However, the Tor Browser has to be open for the script to work.

This is why I am now trying to automate the whole process: open the Tor Browser automatically, run a script, and close the browser automatically at the end. But I am struggling with it.

import os

#open Tor browser
os.system('open /Applications/TorBrowser.app')

#code to scrape

#close Tor browser
???

Open

Some other options I found for opening the browser did not work:

import subprocess
subprocess.Popen('/Applications/TorBrowser.app') #permission denied

or

os.system('start /Applications/TorBrowser.app') #sh: start: command not found

However, the following line worked:

os.system('open /Applications/TorBrowser.app')

Close

The main problem is closing the browser afterwards, as none of the commands I found in other posts worked.

Those include:

os.system("taskkill /im /Applications/TorBrowser.app /f") #sh: taskkill: command not found

or

os.system("kill /Applications/TorBrowser.app") #sh: line 0: kill: /Applications/TorBrowser.app: arguments must be process or job IDs

or

os.close('/Applications/TorBrowser.app') #TypeError: an integer is required (got type str)

  • Any suggestions on how to close it?

  • And is there a better way to open it?

Edit: I'm on Mac with Python 3.

J0ANMM

2 Answers


This worked for me:

from selenium import webdriver
import os
import subprocess

#start Tor
sproc = subprocess.Popen('"C:\\Users\\My name\\Desktop\\Tor Browser\\Browser\\firefox.exe"')

#start PhantomJS, routed through the SOCKS5 proxy on localhost:9150
service_args = ['--proxy=localhost:9150', '--proxy-type=socks5']
driver = webdriver.PhantomJS(service_args=service_args)
#get page
driver.get("https://stackoverflow.com/questions/40161921/how-to-open-and-close-tor-browser-automatically-with-python")
print(driver.page_source)
driver.close()
#kill process
sproc.kill()

I think you should add some pauses between the commands, for example to give the browser time to start:

import time
time.sleep(20)  # wait 20 seconds
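Instead of a fixed pause, one option is to poll the proxy port until it accepts connections. This is only a sketch: wait_for_port is a hypothetical helper name, and port 9150 is taken from the service_args above. An open port only shows that the SOCKS listener is up, not that Tor has finished building circuits, so a short extra pause may still be needed.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

After starting the browser you could call wait_for_port("localhost", 9150) and only continue with the scraping code if it returns True.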

Another way to open Tor:

os.system('"C:\\Users\\My Name\\Desktop\\Tor Browser\\Browser\\firefox.exe"' )

This time, however, the call blocks until the launched process exits on its own (for example, when the user closes it). According to your question, that is not what you want. To keep control of the process, let it run in the background and keep a handle to it in a variable (like sproc above), so you can kill it whenever you want.
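The difference can be sketched in a platform-neutral way. start_and_stop is a hypothetical helper, not part of any library: subprocess.Popen returns immediately with a handle, the work runs while the process is alive, and the handle is then used to stop it.

```python
import subprocess
import sys


def start_and_stop(command, work):
    """Start command in the background, run work(proc), then stop the process."""
    proc = subprocess.Popen(command)   # returns immediately, unlike os.system
    try:
        return work(proc)
    finally:
        proc.terminate()               # on Windows this behaves like kill()
        try:
            proc.wait(timeout=10)      # give the process time to exit
        except subprocess.TimeoutExpired:
            proc.kill()                # force it if it ignored the request
            proc.wait()
```

For a quick try-out without Tor, command can be a harmless child such as [sys.executable, "-c", "import time; time.sleep(60)"], and work can be any function that takes the process handle.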

Also pay attention to the path string: double quotes inside single quotes. There are other ways to pass strings containing spaces to system commands; see, for example, "running an outside program (executable) in python?".
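One of those other ways is to pass the command as a list, so that each element reaches the program as a single argument and no nested quoting is needed. A minimal sketch, where run_capture is a hypothetical helper and the child is just the Python interpreter echoing its first argument:

```python
import subprocess
import sys


def run_capture(args):
    """Run args (a list, one element per argument) and return its stdout."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


# The argument contains a space but needs no extra quotes.
echo = [sys.executable, "-c", "import sys; print(sys.argv[1])", "Tor Browser/Browser"]
output = run_capture(echo).strip()
```

The same applies to the executable itself: a path with spaces can be the first list element passed to subprocess.Popen, without the inner double quotes.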

Alexander Borochkin
  • Thanks for your answer @Alexander. The firefox.exe is confusing me. Is this supposed to work on Mac, or just on Windows? – J0ANMM Oct 20 '16 at 19:49
  • OK, thanks. I missed that part in question. Now it is edited. Any idea of how to do it in Mac? – J0ANMM Oct 21 '16 at 06:41
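For the Mac case asked about in the comments, a possible approach (untested against Tor Browser itself, so treat it as a sketch) is to start the executable inside the .app bundle rather than the bundle directory, which is probably why Popen on the .app path gave "permission denied". The standard bundle layout keeps the binary under Contents/MacOS, with its name stored in Info.plist under CFBundleExecutable. bundle_executable and run_with_app are hypothetical helper names, and the 20-second delay is an assumption carried over from the answer above.

```python
import plistlib
import subprocess
import time
from pathlib import Path


def bundle_executable(app_path):
    """Return the main executable of a macOS .app bundle, read from Info.plist."""
    app = Path(app_path)
    with open(app / "Contents" / "Info.plist", "rb") as fh:
        info = plistlib.load(fh)
    # CFBundleExecutable names the binary stored under Contents/MacOS
    return app / "Contents" / "MacOS" / info["CFBundleExecutable"]


def run_with_app(app_path, job, startup_delay=20):
    """Hypothetical helper: start the app, run job(), then stop the app."""
    proc = subprocess.Popen([str(bundle_executable(app_path))])
    try:
        time.sleep(startup_delay)      # crude pause while the app starts
        return job()
    finally:
        proc.terminate()               # sends SIGTERM on macOS
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()                # force quit if SIGTERM was ignored
            proc.wait()
```

Usage would look like run_with_app('/Applications/TorBrowser.app', scrape), where scrape is your own function containing the scraping code. Whether every helper process the browser starts also exits on terminate() is worth checking in Activity Monitor.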

Try this in Jupyter:

import os
import time
import webbrowser

url = 'https://YOUR WEBSITE ADDRESS HERE'
mozilla_path = "C:\\Users\\T14s\\Desktop\\Tor Browser\\Browser\\firefox.exe"
# register the Tor Browser executable under the name 'firefox' and open the page
webbrowser.register('firefox', None, webbrowser.BackgroundBrowser(mozilla_path))
webbrowser.get('firefox').open_new_tab(url)
# wait 10 seconds, then force-close every firefox.exe process (Windows only)
time.sleep(10)
os.system("taskkill /im firefox.exe /f")

Tor Browser is based on Firefox, which is why firefox comes up so often here.

Pierre Bonaparte