Currently I'm trying to visualize some data I am working on with seaborn. I need to use a comma as decimal separator, so I was thinking about simply changing the locale. I found this answer to a similar question, which sets the locale and uses matplotlib to plot some data.

This also works for me, but when using seaborn instead of matplotlib directly, it doesn't use the locale anymore. Unfortunately, I can't find any setting to change in seaborn or any other workaround. Is there a way?

Here is some example data. Note that I had to use 'german' instead of "de_DE". The x-axis labels all still use the standard point as decimal separator.

import locale
# Set to German locale to get comma decimal separator
locale.setlocale(locale.LC_NUMERIC, 'german')

import pandas as pd
import seaborn as sns

import matplotlib.pyplot as plt
# Tell matplotlib to use the locale we set above
plt.rcParams['axes.formatter.use_locale'] = True

df = pd.DataFrame([[1,2,3],[4,5,6]]).T
df.columns = [0.3,0.7]

sns.boxplot(data=df)

Exemplary boxplot with seaborn

KorbenDose
  • Tried something similar (see [this](https://stackoverflow.com/questions/49445935/change-decimal-point-to-comma-in-matplotlib-plot?rq=1) answer). Unfortunately that didn't work either. – KorbenDose Oct 01 '18 at 12:09
  • This does not seem to be a seaborn issue to me. While `plt.plot` happily formats its labels according to the chosen locale, calling the matplotlib function directly via `plt.boxplot(df.T, positions=list(df.columns))` ignores the locale. So it appears to me that this is due to the way the `positions` keyword is handled in matplotlib. – jdamp Oct 01 '18 at 12:12
  • Without really solving the underlying problem, a stupid way to get the numbers with commas on the x-axis is to set `df.columns` to appropriate strings: `df.columns = ["0,3", "0,7"]` – jdamp Oct 01 '18 at 12:17
  • @jdamp That's unfortunate. Really seems like the fastest and easiest way is to convert the numbers to the appropriate string. Thanks! – KorbenDose Oct 01 '18 at 12:18
  • @jdamp seaborn rather calls something like `plt.boxplot(df.T, positions=range(len(df.columns)), labels=df.columns)`, so it's not the positions but the labels that are relevant here. Apart from that, the analysis is pretty much correct. – ImportanceOfBeingErnest Oct 01 '18 at 12:50

1 Answer

The "numbers" shown on the x axis for such boxplots are determined via a matplotlib.ticker.FixedFormatter (find out via print(ax.xaxis.get_major_formatter())). This fixed formatter just puts labels on ticks one by one from a list of labels. This makes sense because your boxes are positioned at 0 and 1, yet you want them to be labeled as 0.3 and 0.7. I suppose this concept becomes clearer when thinking about what should happen for a dataframe with df.columns=["apple","banana"].

So the FixedFormatter ignores the locale, because it just takes the labels as they are. The solution I would propose here (although some of those in the comments are equally valid) would be to format the labels yourself.

ax.set_xticklabels(["{:n}".format(l) for l in df.columns]) 

The n format here is the same as the usual g, but takes the locale into account. (See the Python format specification mini-language.) Of course, any other format of choice is equally possible. Also note that setting the labels here via ax.set_xticklabels only works because of the fixed locations used by boxplot. For other types of plots with continuous axes, this would not be recommended, and instead the concepts from the linked answers should be used.
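As a minimal sketch of the label formatting on its own, without any plotting: the helper names below (try_set_numeric_locale, format_label, format_label_comma) and the list of candidate locale names are my own assumptions for illustration and are not part of matplotlib or seaborn. Which locale names are installed differs between systems, as the question already hints at with 'german' versus "de_DE", so the sketch tries a few and also provides a locale-independent fallback.

```python
import locale


def try_set_numeric_locale(candidates=("de_DE.UTF-8", "de_DE", "german")):
    """Try each candidate locale name for LC_NUMERIC.

    Returns the first name that could be set, or None if none is available.
    The candidate names are assumptions; adjust them for your system.
    """
    for name in candidates:
        try:
            locale.setlocale(locale.LC_NUMERIC, name)
            return name
        except locale.Error:
            continue
    return None


def format_label(value):
    """Format a number with the 'n' type, which follows the active LC_NUMERIC locale."""
    return "{:n}".format(value)


def format_label_comma(value):
    """Locale-independent fallback: 'g' formatting with the point replaced by a comma."""
    return "{:g}".format(value).replace(".", ",")


if __name__ == "__main__":
    used = try_set_numeric_locale()
    print("locale in use:", used)
    # Output of format_label depends on whether a German locale could be set.
    print([format_label(v) for v in (0.3, 0.7)])
    # The fallback always uses a comma as decimal separator.
    print([format_label_comma(v) for v in (0.3, 0.7)])
```

Either list of strings could then be passed to ax.set_xticklabels in place of the list comprehension above. The fallback is only meant for the case where no suitable locale is installed; it does not handle thousands separators.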

Complete code:

import locale
# Set to German locale to get comma decimal separator
locale.setlocale(locale.LC_NUMERIC, 'german')

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame([[1,2,3],[4,5,6]]).T
df.columns = [0.3,0.7]

ax = sns.boxplot(data=df)
ax.set_xticklabels(["{:n}".format(l) for l in df.columns])

plt.show()

ImportanceOfBeingErnest