4

Is there a way to remove columns or rows after applying style in python pandas? And re-sort them?

styled = df.style.apply(colorize, axis=None)
#remove _x columns
yonly = list(sorted(set(styled.columns) - set(df.filter(regex='_x$').columns)))
###Remove columns that end with "_x" here
styled.to_excel('styled.xlsx', engine='openpyxl', freeze_panes=(1,1))

Most things I tried were unavailable, i.e. styled.reindex(columns=yonly) returned AttributeError: 'Styler' object has no attribute 'reindex'

styled.columns = yonly returned AttributeError: 'list' object has no attribute 'get_indexer'

styled = styled[yonly] returns TypeError: 'Styler' object is not subscriptable

Follow-up from Colour specific cells from two columns that don't match, using python pandas style.where (or otherwise) and export to excel

Savvas Radevic
  • 543
  • 1
  • 8
  • 23
  • 1
    I am not sure if it is possible after applying style, why is not possible apply it before? – jezrael Jul 05 '18 at 09:35
  • It continues from https://stackoverflow.com/questions/51175491/colour-specific-cells-from-two-columns-that-dont-match-using-python-pandas-sty/51175719 -- I can't remove before comparing and colouring :) Or at least in my head it doesn't make sense – Savvas Radevic Jul 05 '18 at 09:38
  • I get you. Need remove all non coloring rows and columns, right? – jezrael Jul 05 '18 at 09:42
  • No just the columns that end in "_x" :) – Savvas Radevic Jul 05 '18 at 10:15

3 Answers3

2

After @jezrael's comment to remove columns before styling and colouring, I got my answer :)

The solution was to pass an extra argument, making the original dataframe df available. And coloured the dataframe df_tmp with the "_y" only. :)

df = pd.DataFrame({
    'config_dummy1': ["dummytext"] * 10,
    'a_y': ["a"] * 10,
    'config_size_x': ["textstring"] * 10,
    'config_size_y': ["textstring"] * 10,
    'config_dummy2': ["dummytext"] * 10,
    'a_x': ["a"] * 10
})
df.at[5, 'config_size_x'] = "xandydontmatch"
df.at[9, 'config_size_y'] = "xandydontmatch"
df.at[0, 'a_x'] = "xandydontmatch"
df.at[3, 'a_y'] = "xandydontmatch"
print(df)

def color(x, extra):
    c1 = 'color: #ffffff; background-color: #ba3018'
    df1 = pd.DataFrame('', index=x.index, columns=x.columns)

    #select only columns ends with _x and _y and sorting
    cols = sorted(extra.filter(regex='_x$|_y$').columns)
    #loop by pairs and assign style by mask
    for colx, coly in zip(cols[::2],cols[1::2]):
        #pairs columns 
        m = extra[colx] != extra[coly]
        df1.loc[m, [coly]] = c1
    return df1

yonly = list(sorted(set(df.columns) - set(df.filter(regex='_x$').columns)))
df_tmp = df[yonly]
df_tmp.style.apply(color, axis=None, extra=df).to_excel('styled.xlsx', engine='openpyxl')

Thank you wonderful people of SO! :D

Savvas Radevic
  • 543
  • 1
  • 8
  • 23
1

I'm not sure about re-sorting, but if you only want to remove (hide) some columns use Styler's
hide_columns method.

For example to hide columns 'A' and 'B':

hide_columns(['A', 'B'])
GMR
  • 11
  • 2
1

I had a similar scenario wherein I had to color background of a dataframe based on another dataframe. I created a function for coloring based on the ranges of the other dataframe as follows:

def colval(val, z1):
    color= 'None'
    df1= pd.DataFrame('', index= val.index, columns= val.columns) # dataframe for coloring
    colm= z1.shape
    
    for x in list(range(colm[0])):
        for y in list(range(1, colm[1])):
# check the range in the dependent dataframe
# and color the other one
             if(z1.iloc[x, y]>= 4.5): 
                df1.iloc[x, y]= 'background-color: red'
             elif(z1.iloc[x, y]<= -4.5):
                df1.iloc[x, y]= 'background-color: yellow'      
    return df1
df_tocol.style.apply(colval, axis= None, z1= diff_df)

Hope this helps!

Soorma
  • 11
  • 1