I have just started with Python and web crawling. I used Selenium and BeautifulSoup to crawl and parse pages. My question is how to turn this into an application, or deploy it on a server such as IIS (or in some other way I am not aware of). I am not even sure it works that way, since I am describing it from a mobile-application point of view :)

Apart from my installed packages, I need two external things: ChromeDriver and the Chrome binary. ChromeDriver is available as a download, but the Chrome binary is something I only get by installing Chrome. From the few things I found by googling, it seems I need to use Docker to ship it.

So must Chrome be installed on the server for this crawler to work? How do I ship that binary with the application? Also, I am using Linux, so does the server this is deployed on have to run Linux as well? Can someone help me with an approach for turning this crawler into an application and giving it to someone else? Thank you :)
You can use Docker: create a YAML file that builds a machine with the requirements you have, e.g. a specific Chrome version, ChromeDriver version, Python, etc. All of these are available in the Docker repo. – Amit Jain Dec 11 '19 at 07:10
-
@AmitJain Thank you. Can you please provide some links as a way forward? I have tried googling it a lot, but maybe I am not using the right keywords, as I don't know much about it :) – Pritish Dec 11 '19 at 07:14
-
https://docs.docker.com/compose/gettingstarted/ – Amit Jain Dec 11 '19 at 07:25
1 Answer
Okay, so I tried this myself and it worked. Servers are nothing but VMs, right?
So you can either convert the script to an executable or keep it unpacked. Either way, keep the ChromeDriver binary in the same directory as the script and then run it with python.
If you are using a fresh VM, install Chrome there as well. If the VM runs Ubuntu, the following commands install Chrome (if `dpkg` reports missing dependencies, `sudo apt-get install -f` should resolve them):
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb
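Once Chrome is installed, the script itself might start the browser roughly as sketched below. This is only an illustration under a few assumptions: Selenium 4 is installed, a `chromedriver` file sits next to the script, and the package above has put `google-chrome` on the `PATH`. The helper names (`chromedriver_path`, `headless_arguments`, `make_driver`) are made up for this example.

```python
import shutil
from pathlib import Path


def chromedriver_path(script_dir: Path) -> Path:
    """Return the expected location of chromedriver next to the script."""
    return script_dir / "chromedriver"


def headless_arguments() -> list:
    """Chrome flags commonly needed on a server or container without a display."""
    return ["--headless", "--no-sandbox", "--disable-dev-shm-usage"]


def make_driver(script_dir: Path):
    """Start Chrome through Selenium (assumes Selenium 4 is installed)."""
    # Imported here so the helpers above work even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for argument in headless_arguments():
        options.add_argument(argument)

    # Assumption: the installed package exposes `google-chrome` on PATH.
    chrome_binary = shutil.which("google-chrome")
    if chrome_binary:
        options.binary_location = chrome_binary

    service = Service(executable_path=str(chromedriver_path(script_dir)))
    return webdriver.Chrome(service=service, options=options)
```

In the crawler you would call something like `driver = make_driver(Path(__file__).resolve().parent)` and finish with `driver.quit()`. The ChromeDriver version has to match the installed Chrome version, which is one reason people pin both inside a Docker image.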
You can comment for any further clarification.

Debdut Goswami
-
Thank you, I will try it and let you know in a couple of days. By the way, did you face any issue logging in through Windows authentication (the login pop-up) with Selenium? On another site, I am unable to log in with any of the options I found on Google. – Pritish Dec 12 '19 at 05:24
-
Do you use Chrome or Firefox? And which method did you use for the pop-up (basic auth) login? – Pritish Dec 12 '19 at 05:27
-
https://stackoverflow.com/questions/24304752/how-to-handle-authentication-popup-with-selenium-webdriver-using-java – Pritish Dec 12 '19 at 05:28
-
https://stackoverflow.com/questions/29516740/how-to-access-popup-login-form-with-selenium-in-python – Pritish Dec 12 '19 at 05:29
-
Oh, I get what you are saying. None of the websites I have scraped or automated had pop-up authentication, so I have no experience with it. This seems very interesting, though. I'll check it out, and if I manage to overcome it efficiently I'll let you know. – Debdut Goswami Dec 12 '19 at 05:34
-
Thank you. Actually, I can do it very easily using requests, but it fails with Selenium. – Pritish Dec 12 '19 at 05:34
-
I checked the answers. Are you using `http://username:password@pentesteracademylab.appspot.com/lab/webapp/digest/1`? – Debdut Goswami Dec 12 '19 at 05:35
-
Yes, I attempted that and it failed. Please have a look at this answer: https://stackoverflow.com/questions/7022116/how-to-submit-http-authentication-with-selenium-python-binding-webdriver Can we do something like this? – Pritish Dec 12 '19 at 05:36
-
Okay, after a lot of research, what I found is that currently no browser supports automating pop-up auth, and Chrome never did. So the workaround is to downgrade Firefox. The following code works fine with any Firefox version from 55.0 up to, but not including, 67.0. – Debdut Goswami Dec 12 '19 at 06:45
-
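```
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Firefox(options=options)  # `options` and `url` are defined elsewhere
driver.get(url)
time.sleep(5)
wait = WebDriverWait(driver, 10)
alert = wait.until(EC.alert_is_present())  # the basic-auth prompt
alert.send_keys(f'username{Keys.TAB}password')  # f-string so a real TAB key is sent
alert.accept()
```
– Debdut Goswami Dec 12 '19 at 06:45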
-
Or can you help me with reusing the session? I am able to add cookies from requests to the Selenium driver, but it gives me an incorrect domain error! – Pritish Dec 12 '19 at 06:47
-
https://developer.mozilla.org/en-US/docs/Web/WebDriver/Errors/InvalidCookieDomain – Debdut Goswami Dec 12 '19 at 06:49
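As the linked page explains, WebDriver rejects a cookie whose domain does not match the page that is currently open, so one common way around the error is to load a page on the target site before calling `add_cookie`. Below is a rough sketch of that idea, not a tested fix for this particular site. The function names are hypothetical, `cookies` is assumed to be something like `session.cookies` from a `requests.Session`, and leaving the domain out of the cookie dict (so the browser uses the current page's domain) is a choice made for this example.

```python
def to_selenium_cookie(cookie) -> dict:
    """Convert a cookie object (name, value, path, secure) to a Selenium dict.

    The domain is deliberately left out so that the browser applies the
    cookie to the page that is currently loaded.
    """
    return {
        "name": cookie.name,
        "value": cookie.value,
        "path": cookie.path or "/",
        "secure": bool(cookie.secure),
    }


def copy_cookies(cookies, driver, base_url: str) -> int:
    """Open base_url first, then add each cookie; return how many were added."""
    # WebDriver only accepts cookies for the domain of the current page,
    # so a page on the target site has to be loaded before add_cookie().
    driver.get(base_url)
    added = 0
    for cookie in cookies:
        driver.add_cookie(to_selenium_cookie(cookie))
        added += 1
    return added
```

With a logged-in `requests.Session` called `session`, this would be used as `copy_cookies(session.cookies, driver, "https://example.com/")`, followed by another `driver.get(...)` so the page is requested again with the cookies attached.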
-
Hi, I solved it. It was an encoding issue: I was using \ in the user name. Please check this: https://stackoverflow.com/questions/56251199/how-to-by-pass-ntlm-authentication-pop-up-while-performing-automation-testing-us I had tried everything but this :) – Pritish Dec 16 '19 at 10:14
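For anyone hitting the same problem, here is a small sketch of what percent-encoding the credentials could look like when they are embedded in the URL. The function name, host, user name and password are placeholders, and whether a given browser still honours credentials in the URL is something to verify for your own setup.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def url_with_credentials(url: str, username: str, password: str) -> str:
    """Embed percent-encoded credentials in the URL's network location."""
    parts = urlsplit(url)
    # safe="" also encodes '/', '@' and ':' that would otherwise break the URL.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    netloc = f"{user}:{secret}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(url_with_credentials("http://example.com/lab", "DOMAIN\\user", "p@ss"))
# http://DOMAIN%5Cuser:p%40ss@example.com/lab
```

The backslash in a `DOMAIN\user` style name becomes `%5C`, and the resulting string can then be passed to `driver.get(...)`.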
-
Hi @Pritish, if this or any answer has solved your question, please consider accepting it by clicking the check mark. This indicates to the wider community that you've found a solution and gives some reputation to both the answerer and yourself. There is no obligation to do this. – Debdut Goswami Jun 26 '20 at 12:12
-
Hi, that whole thing is on hold. We are deploying directly to the machine for now, so I didn't work on it. – Pritish Jul 01 '20 at 07:03