
I am a research analyst trying to collate data and perform analysis. I need data from this page, for every category from Abrasives to Vanspati Oils (you'll find them on the left side). I run into problems like this regularly, and I figured out that Selenium should be able to handle them, but I am stuck on how to download this data into Excel. I need one Excel sheet for each category.

My exact technical question is: how do I download the table data? I did a little background research and understood from here that the data can be extracted if the table has a class name. I see that the table has class="tbldata14 bdrtpg", so I used it in my code and got this error:

InvalidSelectorException: Message: The given selector tbldata14 bdrtpg is either invalid or does not result in a WebElement.

How can I download this table data? Point me to any references that I can read and solve this problem. My code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()

driver.get("http://www.moneycontrol.com/stocks/marketinfo/netprofit/bse/index.html")
elem=driver.find_element_by_class_name("tbldata14 bdrtpg")

Thanks in advance. Also, please suggest a simpler way if there is one [I tried copy-paste; it is too tedious!]

raki
  • Can't you find the same kind of data at yahoo/google/bloomberg which have direct downloads to excel? – findwindow Apr 18 '16 at 22:05
  • I would like to have same business classification as that in moneycontrol. So it would be easy to directly download this data from moneycontrol page. – raki Apr 18 '16 at 22:08
  • You should refer to the Terms of Use for the moneycontrol web site to determine if this is even allowed by them. – Breaks Software Apr 18 '16 at 23:57
  • Error is because you are selecting an entire table. I suggest using BeautifulSoup instead of selenium: https://www.crummy.com/software/BeautifulSoup/bs4/doc/. Although it's possible to adjust xpath to select elements (something like `driver.find_elements_by_xpath("//table[@class='tbldata14 bdrtpg']//a/b")` - if I didn't mess it up, it should give you list of companies) – timbre timbre Apr 19 '16 at 03:07

1 Answer


Fetching the data you're interested in can be done as follows:

from selenium import webdriver
from selenium.webdriver.common.by import By

url = "http://www.moneycontrol.com/stocks/marketinfo/netprofit/bse/index.html"

# Get table cells that contain an anchor or a text node
xpath = "//table[@class='tbldata14 bdrtpg']//tr//td[child::a|text()]"

driver = webdriver.Firefox()
driver.get(url)
data = driver.find_elements(By.XPATH, xpath)

# Group the flat list of cells into rows of 5 cells each
rows = [data[x:x + 5] for x in range(0, len(data), 5)]
for r in rows:
    print("Company {}, Last Price {}, Change {}, % Change {}, Net Profit {}"
          .format(r[0].text, r[1].text, r[2].text, r[3].text, r[4].text))

# Close the browser once the data has been read
driver.quit()
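
As for the error in the question: the class-name locator expects a single class name, so a value containing a space such as `tbldata14 bdrtpg` is treated as a compound class and rejected. The XPath above sidesteps this by matching the whole `class` attribute. Another option, sketched below, is a CSS selector that chains both classes. Note that `css_for_classes` is a hypothetical helper written for this illustration, not part of Selenium.

```python
def css_for_classes(tag, class_attr):
    """Build a CSS selector matching `tag` elements that carry every class in `class_attr`."""
    classes = class_attr.split()  # "tbldata14 bdrtpg" -> ["tbldata14", "bdrtpg"]
    return tag + "".join("." + name for name in classes)


# The class attribute value is taken from the question
selector = css_for_classes("table", "tbldata14 bdrtpg")
print(selector)
```

With a live driver, the resulting string could then be passed as `driver.find_element(By.CSS_SELECTOR, selector)`, with `By` imported from `selenium.webdriver.common.by`.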

Writing the data to an Excel file is explained here.
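
If a CSV file that Excel can open is good enough, the standard library alone may do. The following is a minimal sketch under that assumption: `chunk` and `write_category_csv` are hypothetical helpers, the column names are taken from the print statement above, and the sample values are made-up placeholders standing in for `[cell.text for cell in data]`. A real `.xlsx` workbook with one sheet per category would need a third-party library, which is not shown here.

```python
import csv
import os
import tempfile

HEADER = ["Company", "Last Price", "Change", "% Change", "Net Profit"]


def chunk(cells, size=5):
    """Split a flat list of cell texts into rows of `size` cells each."""
    return [cells[i:i + size] for i in range(0, len(cells), size)]


def write_category_csv(category, cells, out_dir):
    """Write one CSV file named after the category and return its path."""
    path = os.path.join(out_dir, category + ".csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(chunk(cells))
    return path


# Placeholder values only; in practice these would come from the scraped cells
sample_cells = ["Company A", "10.00", "0.50", "5.26", "100.00",
                "Company B", "20.00", "-1.00", "-4.76", "200.00"]

out_dir = tempfile.mkdtemp()
csv_path = write_category_csv("Abrasives", sample_cells, out_dir)
print(csv_path)
```

Calling `write_category_csv` once per category page would give one file per category, which is close to the one-sheet-per-category layout asked for in the question.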

sowa