4

I got this warning WARNING: C:\Users\diodi\AppData\Local\Programs\Python\Python37-32\lib\site-packages\xlsxwriter\worksheet.py:923: UserWarning: Ignoring URL 'https://www.google.com/search?q=hello+world' since it exceeds Excel's limit of 65,530 URLS per worksheet. "65,530 URLS per worksheet." % force_unicode(url))

I use to write th output of scraped data

#spider.found_items is a list of dictionary
df = pd.DataFrame(spider.found_items)[ordered_list] #ordered_list is the order of dictiornary outputed
df.to_excel("{}.xlsx".format(file_name),sheet_name='All Products')

I checked this Number of URLS is over Excel's limit of 65,530 URLS per worksheet. but this formats the links as strings (not clickable). is there a way to keep URL (as a link) if I can write to multiple sheets or any other suggestion?

  • Did you checked [this](https://stackoverflow.com/a/31821781/8353711) – shaik moeed May 25 '19 at 08:11
  • Possible duplicate of [add hyperlink to excel sheet created by pandas dataframe to\_excel method](https://stackoverflow.com/questions/31820069/add-hyperlink-to-excel-sheet-created-by-pandas-dataframe-to-excel-method) – shaik moeed May 25 '19 at 08:11
  • @shaikmoeed I don't believe this would solve it, –  May 25 '19 at 08:20
  • @shaikmoeed this basically converts strings to urls, while excel doesn't allow more than "65,530" link per sheet –  May 25 '19 at 08:24

2 Answers2

2

Fundamentally this is an Excel limitation. It only allows 65,530 unique* urls per worksheet. There isn't any workaround for that.

(*) Excel does allow more than 65,530 non-unique urls if they are grouped contiguously. For example the same url in cells A1 to A100 would only be counted as one url against the 65,530 limit. However, grouping of urls to take advantage of this isn't supported by XlsxWriter (and even if it was it doesn't help if all the urls are unique or non-contiguous).

jmcnamara
  • 38,196
  • 6
  • 90
  • 108
1

If possible, use the EXCEL HYPERLINK function instead. You should not encounter this limitation with the HYPERLINK function.