
I'm fairly new to programming and am trying to highlight specific cells within a DataFrame, but I'm getting the error AttributeError: 'list' object has no attribute 'rstrip'. I'm not sure how to solve this.

data = {'country': ['US', 'US', 'China', 'India', 'US', 'India'], 
        'car_number': ['X123-00001C', 'X123-00002C', 'X123-00003C', 'X123-00004C', 'X123-00004', '']}  
  
df = pd.DataFrame(data)  

def color_in(val):
    highlight = 'background-color: orange;'
    default = ''
    if  str(~df['car_number'].str.match(r'^X\d\d\d[-]\d\d\d\d\d[C]$')) in val:
        return [highlight, default]
    else:
        return [default, default]

df.style.applymap(color_in, subset=['car_number'])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/opt/conda/lib/python3.8/site-packages/IPython/core/formatters.py in __call__(self, obj)
    343             method = get_real_method(obj, self.print_method)
    344             if method is not None:
--> 345                 return method()
    346             return None
    347         else:

/opt/conda/lib/python3.8/site-packages/pandas/io/formats/style.py in _repr_html_(self)
    203         Hooks into Jupyter notebook rich display system.
    204         """
--> 205         return self.render()
    206 
    207     @doc(

/opt/conda/lib/python3.8/site-packages/pandas/io/formats/style.py in render(self, **kwargs)
    617         * table_attributes
    618         """
--> 619         self._compute()
    620         # TODO: namespace all the pandas keys
    621         d = self._translate()

/opt/conda/lib/python3.8/site-packages/pandas/io/formats/style.py in _compute(self)
    703         r = self
    704         for func, args, kwargs in self._todo:
--> 705             r = func(self)(*args, **kwargs)
    706         return r
    707 

/opt/conda/lib/python3.8/site-packages/pandas/io/formats/style.py in _applymap(self, func, subset, **kwargs)
    808         subset = non_reducing_slice(subset)
    809         result = self.data.loc[subset].applymap(func)
--> 810         self._update_ctx(result)
    811         return self
    812 

/opt/conda/lib/python3.8/site-packages/pandas/io/formats/style.py in _update_ctx(self, attrs)
    649                 if not c:
    650                     continue
--> 651                 c = c.rstrip(";")
    652                 if not c:
    653                     continue

AttributeError: 'list' object has no attribute 'rstrip'

Here is the error I am getting. I would like to highlight everything in the column that doesn't match the expression. Maybe there is a better way?

shuynh84

1 Answer


You can use

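import re

import pandas as pd

data = {'country': ['US', 'US', 'China', 'India', 'US', 'India'],
        'car_number': ['X123-00001C', 'X123-00002C', 'X123-00003C', 'X123-00004C', 'X123-00004', '']}

df = pd.DataFrame(data)

def _color_orange(val):
    # Defaults: leave the cell's colors as they are
    bgcolor = 'auto'
    color = 'auto'
    # Highlight values that match the expected car-number format
    if re.match(r'^X\d{3}-\d{5}C$', val):
        bgcolor = 'orange'
        color = 'black'
    # applymap expects one CSS string per cell, not a list
    return f'background-color: {bgcolor}; color: {color}'

# Style only the car_number column; df itself stays a plain DataFrame
styled = df.style.applymap(_color_orange, subset=["car_number"])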


Here, the bgcolor and color variables both default to "auto".

The condition if re.match(r'^X\d{3}-\d{5}C$', val) checks whether the whole string starts with an X, followed by three digits, a hyphen, five digits, and a C at the end. If it does, the bgcolor and color variables are set to "orange" and "black".
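
To see what the pattern accepts and rejects, here is a small sketch that runs it against a few strings. The first three samples come from the question's data; 'X123-00001CC' is a made-up value added only to show that the $ anchor rejects trailing characters.

```python
import re

# Pattern from the answer: X, three digits, a hyphen, five digits, then C.
pattern = r'^X\d{3}-\d{5}C$'

# 'X123-00001CC' is a hypothetical value, not part of the original data.
samples = ['X123-00001C', 'X123-00004', '', 'X123-00001CC']

# re.match returns a match object or None; bool() turns that into True/False.
results = {s: bool(re.match(pattern, s)) for s in samples}

for value, ok in results.items():
    print(repr(value), ok)
```

Only the first sample is a full match; the others are missing the final C, are empty, or have an extra character after it.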

The properties being set are background-color (the cell background) and color (the font color). These are CSS property names, which pandas writes into the HTML that the Jupyter notebook renders.

The df.style.applymap(_color_orange, subset=["car_number"]) applies the coloring to the "car_number" column only.
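
Since the question asks to highlight the values that do not match, a minimal sketch of the inverted check follows. The names highlight_invalid and CAR_NUMBER are hypothetical, not from the original answer. The function returns a single CSS string per cell; returning a list, as the question's color_in does, is what leads to the 'list' object has no attribute 'rstrip' error.

```python
import re

# Same pattern as in the answer, compiled once for reuse.
CAR_NUMBER = re.compile(r'^X\d{3}-\d{5}C$')

def highlight_invalid(val):
    """Return the CSS for one cell: orange when val is NOT a valid car number."""
    if not CAR_NUMBER.match(str(val)):
        return 'background-color: orange; color: black'
    return ''  # an empty string leaves the cell unstyled

# Values of the car_number column from the question.
car_numbers = ['X123-00001C', 'X123-00002C', 'X123-00003C',
               'X123-00004C', 'X123-00004', '']
styles = [highlight_invalid(v) for v in car_numbers]
```

With the DataFrame from the question, this would be applied as df.style.applymap(highlight_invalid, subset=["car_number"]). In pandas 2.1 and later the same method is available as Styler.map, and applymap is deprecated there.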

Wiktor Stribiżew
  • how would I go about highlighting the background instead? – shuynh84 Mar 17 '22 at 20:37
  • @shuynh84 I added some more code and explanation. – Wiktor Stribiżew Mar 17 '22 at 20:44
  • Is there a command opposite of match? I would like to highlight everything that doesn't match that format – shuynh84 Mar 17 '22 at 20:45
  • @shuynh84 It is really simple, use `not`: `if not re.match(r'^X\d{3}-\d{5}C$', val):`. Yes, you may also use a regex: ``if re.search(r'^(?!X\d{3}-\d{5}C$)', val):`` (note I switched to `re.search` here) – Wiktor Stribiżew Mar 17 '22 at 20:47
  • oh...didn't know there was an if not lol. Thanks @Wiktor Stribiżew – shuynh84 Mar 17 '22 at 20:52
  • is there a way for me to specify only the column car_number? – shuynh84 Mar 17 '22 at 20:53
  • @shuynh84 I added more code to account for that and fixed the default color scheme with `auto` values. – Wiktor Stribiżew Mar 17 '22 at 21:00
  • @shuynh84 I guess [this question](https://stackoverflow.com/questions/71498543) can now be removed. – Wiktor Stribiżew Mar 17 '22 at 21:16
  • what does the f in f'background-color: {bgcolor}; color: {color}' mean? – shuynh84 Mar 17 '22 at 21:16
  • @shuynh84 An interpolated f-string literal, see [Python 3's f-Strings: An Improved String Formatting Syntax (Guide)](https://realpython.com/python-f-strings/), [f-strings in Python](https://www.geeksforgeeks.org/formatted-string-literals-f-strings-python/) and the SO [String formatting: % vs. .format vs. f-string literal](https://stackoverflow.com/q/5082452/3832970) thread. In f-strings, you can use `{...}` to insert code or just variables. – Wiktor Stribiżew Mar 17 '22 at 21:17
  • yessir. The other question can be removed. – shuynh84 Mar 17 '22 at 21:18
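
The two negation forms mentioned in the comments, and the f-string used in the answer, can be tried out in isolation. This is only an illustrative sketch: the sample list is partly made up ('Y123-00001C' is not in the original data), and the variable names are hypothetical.

```python
import re

# 'Y123-00001C' is a hypothetical value added to show a wrong prefix.
values = ['X123-00001C', 'X123-00004', '', 'Y123-00001C']

# Form 1: negate the result of re.match with `not`.
via_not = [not re.match(r'^X\d{3}-\d{5}C$', v) for v in values]

# Form 2: a negative lookahead with re.search; it succeeds only when
# the string as a whole does NOT have the car-number format.
via_lookahead = [bool(re.search(r'^(?!X\d{3}-\d{5}C$)', v)) for v in values]

# f-string: expressions inside {...} are evaluated and inserted into the text.
bgcolor = 'orange'
color = 'black'
css = f'background-color: {bgcolor}; color: {color}'

print(via_not == via_lookahead)
print(css)
```

Both forms flag the same values here, so the plain not version is usually the easier one to read.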