It depends on how the hyperlinks are written in the original Excel files and on the Excel writer you use. read_excel
will return the display text, e.g. if you have a hyperlink to https://www.google.com
and the diplay text is just google
, then there's no way to retain the link with pandas as you'll have just google
in your dataframe.
If no separate display name is given (or the display name is identical with the hyperlink) and you use xlsxwriter
(engine='xlsxwriter'
), then the output of to_excel
is automatically converted to hyperlinks (because it starts with 'http://' or any other scheme) (as of xlsxwriter version 1.1.5).
If you know that all your hyperlinks are 'http://' links with no authority and the display name (if different from the link) is just the url path, then you can prepend the 'http://' suffix and you'll get hyperlinks in the Excel file:
combined.iloc[combined[~combined.iloc[:,0].str.startswith('http')].index,0] = 'http://' + combined.iloc[combined[~combined.iloc[:,0].str.startswith('http')].index,0]
combined.to_excel("c.xlsx", header=False, index=False, engine='xlsxwriter')
A universal solution without pandas using
openpyxl
is shown in
this answer to the same SO question where you took the pandas solution from. In order to copy hyperlinks too, you'll just have to add the following lines to the
copySheet
function:
if cell.hyperlink is not None:
newCell.hyperlink = copy(cell.hyperlink)