I am reading an Excel file using xlrd. In one column I have a company name which is formatted as a hyperlink (meaning there is a URL behind it). When I get the cell value I only get the company name. How can I also get the URL behind it?

Below is the code I use to read the Excel file with the xlrd module (assume the required modules are imported).

mainData_book = xlrd.open_workbook("IEsummary.xls", formatting_info=True)
mainData_sheet = mainData_book.sheet_by_index(0)  # first sheet (index 0)
start = 1
end = 101
for counter in range(start, end):
    rowValues = mainData_sheet.row_values(counter, start_colx=0, end_colx=8)
    company_name = rowValues[0]  # How can I also get the link here?
Peter Mortensen
Aamir Rind
  • Please post some code so we have a foundation to answer from! And try to fix the title to include relevant keywords. – DGM Aug 14 '11 at 12:59
  • @Aamir Adnan Added a link to an example file. Did I capture the structure correctly? Feel free to replace it with a link to an example file of yours. – phihag Aug 14 '11 at 13:10
  • @phihag: Thanks, the question makes more sense now :) (I don't know why I got a negative vote on this question. It is a genuine question, guys, please help generously.) – Aamir Rind Aug 14 '11 at 13:13
  • @Aamir Adnan I'd wager the downvotes stem from the original unspecific question title and the lack of reproducibility. Also, try to observe proper punctuation. The occasional mistaek is ignored, but it can be hard to read text without commas and full stops. – phihag Aug 14 '11 at 13:34

1 Answer

In xlrd 0.7.2 or newer, you can use the sheet's hyperlink_map, a dictionary keyed by (row, column):

import xlrd
mainData_book = xlrd.open_workbook("IEsummary.xls", formatting_info=True)
mainData_sheet = mainData_book.sheet_by_index(0)
for row in range(1, 101):
    rowValues = mainData_sheet.row_values(row, start_colx=0, end_colx=8)
    company_name = rowValues[0]

    link = mainData_sheet.hyperlink_map.get((row, 0))
    url = '(No URL)' if link is None else link.url_or_path
    print(company_name.ljust(20) + ': ' + url)
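
The same lookup could also be wrapped in a small helper that returns the name and URL pairs instead of printing them. This is only a sketch: the function name `company_links` and its parameters are made up for illustration and are not part of xlrd. It assumes the sheet object exposes `nrows`, `cell_value()` and `hyperlink_map` the way an xlrd sheet does.

```python
def company_links(sheet, name_col=0, first_row=1, last_row=None):
    """Return (name, url) pairs for rows first_row .. last_row - 1.

    `sheet` is assumed to behave like an xlrd sheet: it has `nrows`,
    `cell_value(row, col)` and a `hyperlink_map` dict keyed by (row, col).
    Cells without a hyperlink get None as their URL.
    """
    if last_row is None:
        last_row = sheet.nrows  # default: read to the end of the sheet
    pairs = []
    for row in range(first_row, last_row):
        name = sheet.cell_value(row, name_col)
        # hyperlink_map is sparse, so .get() returns None for plain cells
        link = sheet.hyperlink_map.get((row, name_col))
        pairs.append((name, None if link is None else link.url_or_path))
    return pairs
```

With the workbook from the question, this would be called as `company_links(mainData_sheet)`. Reading up to `sheet.nrows` by default avoids the hard-coded upper bound of 101, which fails if the sheet has fewer rows.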
Saul
phihag
  • Where can I download xlrd 0.7.2? – Aamir Rind Aug 14 '11 at 13:51
  • @Aamir Adnan I checked out the development version with `svn co https://secure.simplistix.co.uk/svn/xlrd/trunk/`. Looks like 0.7.2 isn't released yet. – phihag Aug 14 '11 at 13:58
  • Note that this doesn't work with `.xlsx` files yet (as of version 1.1.0), [see here](https://stackoverflow.com/a/13914953/2098939). – Gerrit-K Jun 24 '18 at 09:21