
I am trying to download 24 months of data from www1.nseindia.com, and it fails with both the Chrome and Firefox drivers. After filling in all the required values, the script freezes and the click never goes through. The webpage does not respond...

Below is the code that I am trying to execute:

import time
from selenium import webdriver
from selenium.webdriver.support.ui import Select

id_list = ['ACC', 'ADANIENT']

# Chrome
def EOD_data_Chrome():
    # Raw string so the backslashes in the Windows path are not treated as escapes
    driver = webdriver.Chrome(executable_path=r"C:\Py388\Test\chromedriver.exe")
    driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
    s1= Select(driver.find_element_by_id('dataType'))
    s1.select_by_value('priceVolume')
    s2= Select(driver.find_element_by_id('series'))
    s2.select_by_value('EQ')
    s3= Select(driver.find_element_by_id('dateRange'))
    s3.select_by_value('24month')
    driver.find_element_by_name("symbol").send_keys("ACC")
    driver.find_element_by_id("get").click()
    time.sleep(9)
    # The download link is an ordinary element, not a <select>, so click it directly
    driver.find_element_by_class_name("download-data-link").click()

# FireFox(Gecko)
def EOD_data_Gecko():
    # Raw string so the backslashes in the Windows path are not treated as escapes
    driver = webdriver.Firefox(executable_path=r"C:\Py388\Test\geckodriver.exe")
    driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
    s1= Select(driver.find_element_by_id('dataType'))
    s1.select_by_value('priceVolume')
    s2= Select(driver.find_element_by_id('series'))
    s2.select_by_value('EQ')
    s3= Select(driver.find_element_by_id('dateRange'))
    s3.select_by_value('24month')
    driver.find_element_by_name("symbol").send_keys("ACC")
    driver.find_element_by_id("get").click()
    time.sleep(9)
    # The download link is an ordinary element, not a <select>, so click it directly
    driver.find_element_by_class_name("download-data-link").click()


EOD_data_Gecko()

# Changing the final line above to "EOD_data_Chrome()" makes no difference: it still gets stuck...

Kindly help with what is missing in that code to download the 24-month data... When I perform the same in a normal browser, with manual clicks, it is successful...

When you are manually doing it in a browser, you can change the values as below:

Set first drop down to : Security wise price volume data
"Enter Symbol"  :  ACC
"Select Series"   :  EQ
"Period" (radio button: "For Past") : 24 Months

Then click the "Get Data" button. The data loads in about 3-5 seconds, and clicking "Download file in CSV format" then saves the CSV file to your downloads folder.

I need help using any Python scraping library you know: Selenium, BeautifulSoup, Requests, Scrapy, etc. It doesn't really matter which one, as long as it is Python.

Edit: @Patrick Bormann, please find the screenshot. The "Get Data" button works. (Screenshot: "Get_Data_Button works")

Lokkii9

2 Answers


When you say that it works manually, have you tried simulating the click with action chains instead of the built-in click function?

from selenium.webdriver.common.action_chains import ActionChains
# ActionChains needs the WebElement itself, not a Select wrapper
easy_apply = driver.find_element_by_id('dateRange')
actions = ActionChains(driver)
actions.move_to_element(easy_apply)
actions.click(easy_apply)
actions.perform()

and then you simulate a mouse movement to the specific value?

In addition, I tried it myself and did not get any data when pressing the "Get Data" button. It has an id of "get", as you mentioned, but the button does not work. As you can see, there is a second button called "Full Download"; perhaps you could try that one? The "Get Data" button did not work in Firefox or Chrome when I tested it.

Did you already try to catch it through the link? (Screenshot omitted.)

Update

As the OP asks for help in this urgent matter I delivered a working solution.

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import time
from selenium.webdriver.support.ui import Select


chrome_driver_path = "../chromedriver.exe"


driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
# 'zoom 25%' is not a valid CSS value and is ignored; '25%' applies the intended zoom
driver.execute_script("document.body.style.zoom='25%'")
time.sleep(2)
price_volume = driver.find_element_by_xpath('//*[@id="dataType"]/option[2]').click()
time.sleep(2)
date_range = driver.find_element_by_xpath('//*[@id="dateRange"]/option[8]').click()
time.sleep(2)
series = driver.find_element_by_name('series')
time.sleep(2)
drop = Select(series)
drop.select_by_value("EQ")
time.sleep(2)
driver.find_element_by_name("symbol").send_keys("ACC")
ez_download = driver.find_element_by_xpath('//*[@id="wrapper_btm"]/div[1]/div[3]/a')
actions = ActionChains(driver)
actions.move_to_element(ez_download)
actions.click(ez_download)
actions.perform()

Here you go. Sorry it took a while, I had to put my son to bed. This solution produces this output: (screenshot: nse_output). I hope it is correct. To select other drop-down entries, change the string passed to the Select call (I used a string there because there were too many indexes to handle) or the number in the XPath, which is the index of the option. The time.sleep calls are normally needed only for elements that take time to render on the page, but in my experience changing values too quickly sometimes causes errors. Feel free to change the delays and see whether it still works.
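The fixed delays discussed above can also be replaced by polling, so the script continues as soon as an element is ready and only fails after a timeout. The following is a minimal, library-independent sketch of that idea; the name wait_until and its parameters are made up for illustration and are not part of Selenium. Selenium also ships its own WebDriverWait class for the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have
    passed. Exceptions raised by condition() are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception as exc:
            # For example, the element is not in the DOM yet; try again.
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"condition not met within {timeout} seconds"
            ) from last_error
        time.sleep(poll_interval)


# Hypothetical usage with the driver from the script above:
# symbol_box = wait_until(lambda: driver.find_element_by_name("symbol"), timeout=15)
# symbol_box.send_keys("ACC")
```

Swallowing every exception inside the loop keeps the sketch short; in a real script it would be safer to catch only the "element not found" error raised by the driver.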

I hope you can now get back to earning your living in India. All the best, Patrick

Do not hesitate to ask if you have any questions.

Update 2

After one long night and another day, we figured out that the freezing originates from the website itself, because the website uses:

Boomerang | Akamai Developer (developer.akamai.com/tools/…): Boomerang is a JavaScript library for Real User Monitoring (commonly called RUM). Boomerang measures the performance characteristics of real-world page loads and interactions. The documentation on this page is for mPulse’s Boomerang. General API documentation for Boomerang can be found at docs.soasta.com/boomerang-api/. I discovered this from the HTML header.

This looks like bot-detection JavaScript. With the help of this SO post: Can a website detect when you are using Selenium with chromedriver?

And the second paragraph from that post: https://piprogramming.org/articles/How-to-make-Selenium-undetectable-and-stealth--7-Ways-to-hide-your-Bot-Automation-from-Detection-0000000017.html

I finally solved the issue: we changed the value of var key in chromedriver to something else, for example:

var key = '$dsjfgsdhfdshfsdiojisdjfdsb_';

In addition I changed the code to:

import time
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()

chrome_driver_path = "../chromedriver.exe"
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument('--disable-blink-features=AutomationControlled')

driver = webdriver.Chrome(executable_path=chrome_driver_path, options=options)
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")


driver.get('http://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
# 'zoom 25%' is not a valid CSS value and is ignored; '25%' applies the intended zoom
driver.execute_script("document.body.style.zoom='25%'")
time.sleep(5)
price_volume = driver.find_element_by_xpath('//*[@id="dataType"]/option[2]').click()
time.sleep(3)
date_range = driver.find_element_by_xpath('//*[@id="dateRange"]/option[8]').click()
time.sleep(5)
series = driver.find_element_by_name('series')
time.sleep(3)
drop = Select(series)
drop.select_by_value("EQ")
time.sleep(4)
driver.find_element_by_name("symbol").send_keys("ACC")
actions = ActionChains(driver)
ez_download = driver.find_element_by_xpath('/html/body/div[2]/div[3]/div[2]/div[1]/div[3]/div/div[1]/form/div[2]/div[3]/p/img')
actions.move_to_element(ez_download)
actions.click(ez_download)
actions.perform()
# essential because the download link has to be loaded first
time.sleep(5)
driver.find_element_by_class_name('download-data-link').click()

The code finally worked and the OP is happy.
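A script like the one above ends right after clicking the download link, so it can be useful to wait until the CSV has actually been written before closing the browser or reading the file. Here is a minimal sketch, assuming the browser saves into a known folder; the function name wait_for_new_csv and its parameters are invented for illustration. Chrome keeps an unfinished download under a temporary .crdownload name, so a file ending in .csv is taken here as a finished one.

```python
import time
from pathlib import Path


def wait_for_new_csv(download_dir, known_files, timeout=30.0, poll_interval=0.5):
    """Return the newest .csv in download_dir that is not listed in known_files.

    known_files is a snapshot of the CSV files that were already present
    before the download was started. Raises TimeoutError if nothing new
    appears within `timeout` seconds.
    """
    folder = Path(download_dir)
    known = {Path(p).name for p in known_files}
    deadline = time.monotonic() + timeout
    while True:
        new_files = [p for p in folder.glob("*.csv") if p.name not in known]
        if new_files:
            # If several new files appeared, pick the most recently modified one.
            return max(new_files, key=lambda p: p.stat().st_mtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no new CSV appeared in {folder} within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical usage around the final click (the folder path is an assumption):
# downloads = Path.home() / "Downloads"
# before = list(downloads.glob("*.csv"))
# driver.find_element_by_class_name("download-data-link").click()
# csv_path = wait_for_new_csv(downloads, before)
```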

Patrick Bormann
  • Probably what you are doing is trying that link in the same driver related browser, which is sure not to work... Or maybe the NSE server doesn't work outside of the country that NSE operates in... Either way, if "full download" is working for you, then you are 'not' having any restrictions based on your location, for sure... – Lokkii9 Mar 10 '21 at 00:29
  • One more thing I suspect is that you are not entering the scrip name (stock name) for which you would like to download the 24 months of data... like in the US you have Apple or Microsoft... Here, I mean in NSE, you can try the following stocks: ACC , RELIANCE , BRITANNIA , CASTROLIND , PFIZER , BOSCHLTD – Lokkii9 Mar 10 '21 at 00:39
  • BTW, please check the screenshot which I have added to the question – Lokkii9 Mar 10 '21 at 01:05
  • I'll try to work on it late this evening. I'll come back to you. – Patrick Bormann Mar 10 '21 at 09:32
  • I'll start now and come back when I have something to work with, sit tight – Patrick Bormann Mar 10 '21 at 17:59
  • I'm done and my solution works. Please look at it. If everything is OK I would be happy if you would mark my answer as CORRECT and gave it an upvote. – Patrick Bormann Mar 10 '21 at 18:46
  • Thank you soooo much for taking time on this Patrick.. I am deeply indebted to you.. I am travelling at the moment, so I will check the code as soon as I return... : ) – Lokkii9 Mar 10 '21 at 19:44
  • If there is still an error do not hesitate to come back to me! I'm happy I could help! – Patrick Bormann Mar 10 '21 at 21:18
  • Hi Patrick, I just tried the code and it is not working... after you put the "ACC" code, you are not directing the code to "Get Data" button, but you are doing the "Full Download" which is not useful at all... "Full Download" gives data of only one day, that is yesterday, for all the stocks in the Market... but when you hit the "Get Data" button, it retrieves data for only one stock for 24months, which is important for Technical Analysis for predicting the price of it in the near future.. – Lokkii9 Mar 13 '21 at 18:54
  • Is it possible to pls modify the code to emulate the mouse actions to the "Get Data" button, and to successfully retrieve the 24months data for one stock of "ACC" – Lokkii9 Mar 13 '21 at 18:56
  • As you have seen in the screenshot, which is attached at the end of my question, I need the 24-month data for only one stock, and then there is the button "Download File in CSV Format" with an Excel symbol, just above the data... Initially in the question, you can see that I mentioned that the Chrome/Gecko driver freezes when I make it click the "Get Data" button, probably because it is detecting scraping activity... And it is jamming the driver and there is no response... Can there be any code in Python with the "Requests" library, or is there any other way to scrape with "BeautifulSoup"? – Lokkii9 Mar 13 '21 at 19:30
  • The initial code in the question already achieves everything you have done here, and also emulates hitting the "Get Data" button and downloading the CSV file after the data appears, but the driver freezes at hitting the "Get Data" button – Lokkii9 Mar 13 '21 at 19:31
  • Someone did some research in the past and had mentioned that there has to be a problem with the headers an automated browser sends, and the headers/send-request which a normal browser sends... The server is able to recognize that it is an automated activity, and hence it is probably not responding... There has to be a cleverer way to handle it, or there has to be a different library which can achieve it.. I am totally unsure of either – Lokkii9 Mar 13 '21 at 19:35
  • Did it work before? Have you tried an older Chrome version and the corresponding chromedriver? I read that after changes in Chrome, some Selenium applications that worked before now crash – Patrick Bormann Mar 13 '21 at 19:38
  • The code worked 2 years ago, when the server was not clever enough to recognize automated activity, probably scraping activity with drivers and such... and hence when I tried the code in the last 6 months, it has blocked me, or the driver freezes – Lokkii9 Mar 13 '21 at 19:39
  • If I open a normal browser, and do the same with mouse clicks, it successfully allows me to get the data.. only from a normal browser... without any code... but if I open the python/driver-browser., and do it with mouse(without code), it still freezes... – Lokkii9 Mar 13 '21 at 19:44
  • So I suspected the way a selenium browser interacts with server is different from a normal browser, and I tried to download the latest version of selenium+drivers, and tried recent python version.. but nothing works with my previous code, which is perfectly emulating the entire process...that code which I wrote would be ok if the selenium browser wouldn't freeze... – Lokkii9 Mar 13 '21 at 19:49
  • I am comfortable with a solution which may not use selenium at all... any library of your comfort would be great – Lokkii9 Mar 13 '21 at 19:50
  • I'll try it with the old Chrome version and driver 71 from 2019. If this doesn't work, perhaps we have to switch to Scrapy – Patrick Bormann Mar 13 '21 at 19:58
  • I would be ever grateful for your attempts Patrick.. Pls.. Kindly help me with any other library if possible, because the person who researched this problem earlier for me, while I had a job, figured out its with server detecting selenium driver, and hence, only some other library would work , but not selenium.. – Lokkii9 Mar 13 '21 at 20:03
  • Unfortunately Scrapy cannot simulate the click, but what I found in my tests is that it did not crash when hitting the button. It crashes when inserting ACC directly. When I comment out that part, the site shows the hint that you have to fill something in, and the website does not crash. So it is not due to the button, because that reminder only appears when the button is hit. I will try to simulate a mouseover on the ACC field and see if it works. – Patrick Bormann Mar 13 '21 at 20:15
  • I sincerely hope you are right about the 'hint' aspect of the page, because of all the scraping libraries (like BeautifulSoup, Scrapy...) the only one I know is Selenium... Is there a way to disable features of the Selenium driver/browser, like disabling hints/comments from showing up... Disabling certain features in the browser may help... some settings I used with Gecko are below – Lokkii9 Mar 13 '21 at 20:27
  • fp = webdriver.FirefoxProfile(); fp.set_preference("browser.preferences.instantApply", True); fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/plain, application/octet-stream, application/binary, text/csv, application/csv, application/excel, text/comma-separated-values, text/xml, application/xml"); fp.set_preference("browser.helperApps.alwaysAsk.force", False); fp.set_preference("browser.download.manager.showWhenStarting", False); fp.set_preference("browser.download.folderList", 0) – Lokkii9 Mar 13 '21 at 20:27
  • but it doesn't have any line of code that disables features, which may be very advanced... – Lokkii9 Mar 13 '21 at 20:27
  • pls let me know if you would like to convert this into a chat conversation...that way we can communicate better or faster I guess.. – Lokkii9 Mar 13 '21 at 20:32
  • Yeah, let's do this. I have not found a solution yet, although I have come up with 6 different ideas now ..^^ – Patrick Bormann Mar 13 '21 at 20:40
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/229871/discussion-between-lokkii9-and-patrick-bormann). – Lokkii9 Mar 13 '21 at 20:42

I edited chromedriver.exe with a hex editor, replaced cdc_ with dog_, and saved it. Then I ran the code below using that Chrome driver.

from selenium import webdriver
from selenium.webdriver.support.select import Select

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("--disable-blink-features")
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options)
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'})
print(driver.execute_script("return navigator.userAgent;"))

# Open the website
driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
symbol_box = driver.find_element_by_id('symbol')
symbol_box.send_keys('20MICRONS')
# implicitly_wait() sets the element-lookup timeout; it does not pause the script
driver.implicitly_wait(10)
#rd_period=driver.find_element_by_id('rdPeriod')
#rd_period.click()

list_daterange = Select(driver.find_element_by_id('dateRange'))
list_daterange.select_by_value('24month')
driver.implicitly_wait(10)
btn_getdata = driver.find_element_by_xpath('//*[@id="get"]')
btn_getdata.click()
driver.implicitly_wait(100)
print("Clicked button")
lnk_downloadData = driver.find_element_by_xpath('/html/body/div[2]/div[3]/div[2]/div[1]/div[3]/div/div[3]/div[1]/span[2]/a')
lnk_downloadData.click()

This code is working fine as of now, but it is not a permanent solution. NSE keeps updating its logic to detect bot execution more reliably, so, like NSE, we will also have to update our code. Please let me know if this code stops working and I will figure out some other solution.
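The hex-editor step described at the start of this answer could also be scripted, which helps if the driver binary has to be patched again after an update. The following is only a sketch of a same-length byte replacement: the function name patch_marker is made up, and the cdc_ and dog_ values are simply the ones mentioned above. The replacement must have the same length as the original so that nothing else in the binary shifts.

```python
from pathlib import Path


def patch_marker(binary_path, old=b"cdc_", new=b"dog_"):
    """Replace every occurrence of `old` with `new` in a binary file.

    A backup copy (<name>.bak) is written before the file is changed.
    Returns the number of occurrences that were replaced.
    """
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    path = Path(binary_path)
    data = path.read_bytes()
    count = data.count(old)
    if count:
        # Keep the untouched binary so the change can be undone.
        path.with_name(path.name + ".bak").write_bytes(data)
        path.write_bytes(data.replace(old, new))
    return count


# Hypothetical usage (the path is an assumption; close any running driver first):
# print(patch_marker("chromedriver.exe"), "occurrence(s) replaced")
```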

Pratik
  • Thanks Pratik.. I have found another solution which totally eliminates the use of Selenium, but I am sure that your code works fine.. I appreciate it... – Lokkii9 Feb 26 '22 at 07:03