1

I am trying to read excel sheet(xlsx), which is using background color to differentiate values.

I tried following libraries:

  1. pandas, did not find any option to read background color based cells.
  2. xlrd.

    import xlrd
    xlrd.open_workbook("filename.xlsx", formatting_info=True)
    

It gives error as: NotImplementedError: formatting_info=True not yet implemented.

  1. StyleFrame (As Suggested by DeepSpace in: Subsetting a dataframe based on cell color and text color in excel sheet )

    from StyleFrame import StyleFrame, utils
    sf = StyleFrame.read_excel('filename.xlsx', read_style=True, use_openpyxl_styles=False)
    

It gives error as:

Traceback (most recent call last):
File "proj_path/read_excel.py", line 22, in <module>
sf = StyleFrame.read_excel('filename.xlsx', read_style=True, use_openpyxl_styles=False)
File "C:\Anaconda\lib\site-packages\StyleFrame\deprecations.py", line 22, in inner
return func(*args, **kwargs)
File "C:\Anaconda\lib\site-packages\StyleFrame\style_frame.py", line 220, in read_excel
_read_style()
File "C:\Anaconda\lib\site-packages\StyleFrame\style_frame.py", line 209, in _read_style
read_comments and current_cell.comment)
File "C:\Anaconda\lib\site-packages\StyleFrame\styler.py", line 127, in from_openpyxl_style
font_color = theme_colors[openpyxl_style.font.color.theme]
TypeError: list indices must be integers or slices, not Integer

Any suggestion to help me move to correct direction is highly appreciated.

Manvi
  • 1,136
  • 2
  • 18
  • 41
  • Do you want to only import the colored cells? What error do you get when importing with pandas? can you edit it into your post? – Juan C Sep 05 '18 at 18:48
  • Yes, I do not find any option to read colored cell in Pandas. – Manvi Sep 05 '18 at 18:49
  • Try converting the file to `filename.xls` instead of `xlsx` for XLRD – Nikhil Fadnis Sep 05 '18 at 18:53
  • Possible duplicate of [Subsetting a dataframe based on cell color and text color in excel sheet](https://stackoverflow.com/questions/52162444/subsetting-a-dataframe-based-on-cell-color-and-text-color-in-excel-sheet) Specifically, see the 3rd example in my answer. – DeepSpace Sep 05 '18 at 18:58
  • Thanks for suggestion @DeepSpace. It is giving error as updated in ques. – Manvi Sep 05 '18 at 19:09
  • @Manvi That's interesting. Can you please provide the entire stacktrace? – DeepSpace Sep 05 '18 at 19:13
  • @DeepSpace updated stacktrace in question above. – Manvi Sep 05 '18 at 19:20
  • @Manvi I see. There is something with the styling/formatting of the specific file you are trying to open. Let's handle this as a github issue: https://github.com/DeepSpace2/StyleFrame/issues/new Please make sure to paste the same stacktrace and attach a file with a similiar formatting/styling (not with your actual data if it is sensitive) – DeepSpace Sep 05 '18 at 19:35
  • Also if I remove "use_openpyxl_styles" it gives error in cell.style.bg_color as "AttributeError: ("'Style' object has no attribute 'bg_color'", 'occurred at index sheet1') – Manvi Sep 05 '18 at 19:36
  • @DeepSpace I created a issue as "Get data from excel cells which have green color as background #44". – Manvi Sep 06 '18 at 15:54
  • @DeepSpace Can I change StyleFrame object to pandas's dataframe: df = pd.DataFrame(data=StyleFrame.read_excel('filename.xlsx', read_style=True, use_openpyxl_styles=False)) is giving error as :ValueError: DataFrame constructor not properly called! – Manvi Sep 07 '18 at 16:08
  • @Manvi A StyleFrame objects stores the underlying dataframe object as `data_df` attribute, `df = sf.data_df` – DeepSpace Sep 08 '18 at 09:04

1 Answers1

0

Even though it might be possible through Python, the easiest way should be filtering by Color in Excel, copying that Table and pasting it elsewhere, and then importing it as you would with any Excel file with Pandas.

As commented by DeepSpace, it has been donde before through Python but it's quite troublesome

Juan C
  • 5,846
  • 2
  • 17
  • 51
  • That's kinda beats the point of using Python, as it is possible, just not with the libraries OP has tried – DeepSpace Sep 05 '18 at 18:54
  • This is dynamin excel sheet coming in mail from data vendor. I am trying way as suggested by @DeepSpace – Manvi Sep 05 '18 at 18:58