graphs overlapping and redundant code to clear it out

Question

I've been using RMarkdown to create graphs. Then I take the graphs and copy and paste them into Powerpoint presentations. That's been my workflow.

Here is the dataframe that I am using.

{'Unnamed: 0': {0: 'Mazda RX4', 1: 'Mazda RX4 Wag', 2: 'Datsun 710', 3: 'Hornet 4 Drive', 4: 'Hornet Sportabout', 5: 'Valiant', 6: 'Duster 360', 7: 'Merc 240D', 8: 'Merc 230', 9: 'Merc 280', 10: 'Merc 280C', 11: 'Merc 450SE', 12: 'Merc 450SL', 13: 'Merc 450SLC', 14: 'Cadillac Fleetwood', 15: 'Lincoln Continental', 16: 'Chrysler Imperial', 17: 'Fiat 128', 18: 'Honda Civic', 19: 'Toyota Corolla', 20: 'Toyota Corona', 21: 'Dodge Challenger', 22: 'AMC Javelin', 23: 'Camaro Z28', 24: 'Pontiac Firebird', 25: 'Fiat X1-9', 26: 'Porsche 914-2', 27: 'Lotus Europa', 28: 'Ford Pantera L', 29: 'Ferrari Dino', 30: 'Maserati Bora', 31: 'Volvo 142E'}, 'mpg': {0: 21.0, 1: 21.0, 2: 22.8, 3: 21.4, 4: 18.7, 5: 18.1, 6: 14.3, 7: 24.4, 8: 22.8, 9: 19.2, 10: 17.8, 11: 16.4, 12: 17.3, 13: 15.2, 14: 10.4, 15: 10.4, 16: 14.7, 17: 32.4, 18: 30.4, 19: 33.9, 20: 21.5, 21: 15.5, 22: 15.2, 23: 13.3, 24: 19.2, 25: 27.3, 26: 26.0, 27: 30.4, 28: 15.8, 29: 19.7, 30: 15.0, 31: 21.4}, 'cyl': {0: 6, 1: 6, 2: 4, 3: 6, 4: 8, 5: 6, 6: 8, 7: 4, 8: 4, 9: 6, 10: 6, 11: 8, 12: 8, 13: 8, 14: 8, 15: 8, 16: 8, 17: 4, 18: 4, 19: 4, 20: 4, 21: 8, 22: 8, 23: 8, 24: 8, 25: 4, 26: 4, 27: 4, 28: 8, 29: 6, 30: 8, 31: 4}, 'disp': {0: 160.0, 1: 160.0, 2: 108.0, 3: 258.0, 4: 360.0, 5: 225.0, 6: 360.0, 7: 146.7, 8: 140.8, 9: 167.6, 10: 167.6, 11: 275.8, 12: 275.8, 13: 275.8, 14: 472.0, 15: 460.0, 16: 440.0, 17: 78.7, 18: 75.7, 19: 71.1, 20: 120.1, 21: 318.0, 22: 304.0, 23: 350.0, 24: 400.0, 25: 79.0, 26: 120.3, 27: 95.1, 28: 351.0, 29: 145.0, 30: 301.0, 31: 121.0}, 'hp': {0: 110, 1: 110, 2: 93, 3: 110, 4: 175, 5: 105, 6: 245, 7: 62, 8: 95, 9: 123, 10: 123, 11: 180, 12: 180, 13: 180, 14: 205, 15: 215, 16: 230, 17: 66, 18: 52, 19: 65, 20: 97, 21: 150, 22: 150, 23: 245, 24: 175, 25: 66, 26: 91, 27: 113, 28: 264, 29: 175, 30: 335, 31: 109}, 'drat': {0: 3.9, 1: 3.9, 2: 3.85, 3: 3.08, 4: 3.15, 5: 2.76, 6: 3.21, 7: 3.69, 8: 3.92, 9: 3.92, 10: 3.92, 11: 3.07, 12: 3.07, 13: 3.07, 14: 2.93, 15: 3.0, 16: 3.23, 17: 4.08, 18: 4.93, 19: 
4.22, 20: 3.7, 21: 2.76, 22: 3.15, 23: 3.73, 24: 3.08, 25: 4.08, 26: 4.43, 27: 3.77, 28: 4.22, 29: 3.62, 30: 3.54, 31: 4.11}, 'wt': {0: 2.62, 1: 2.875, 2: 2.32, 3: 3.215, 4: 3.44, 5: 3.46, 6: 3.57, 7: 3.19, 8: 3.15, 9: 3.44, 10: 3.44, 11: 4.07, 12: 3.73, 13: 3.78, 14: 5.25, 15: 5.424, 16: 5.345, 17: 2.2, 18: 1.615, 19: 1.835, 20: 2.465, 21: 3.52, 22: 3.435, 23: 3.84, 24: 3.845, 25: 1.935, 26: 2.14, 27: 1.513, 28: 3.17, 29: 2.77, 30: 3.57, 31: 2.78}, 'qsec': {0: 16.46, 1: 17.02, 2: 18.61, 3: 19.44, 4: 17.02, 5: 20.22, 6: 15.84, 7: 20.0, 8: 22.9, 9: 18.3, 10: 18.9, 11: 17.4, 12: 17.6, 13: 18.0, 14: 17.98, 15: 17.82, 16: 17.42, 17: 19.47, 18: 18.52, 19: 19.9, 20: 20.01, 21: 16.87, 22: 17.3, 23: 15.41, 24: 17.05, 25: 18.9, 26: 16.7, 27: 16.9, 28: 14.5, 29: 15.5, 30: 14.6, 31: 18.6}, 'vs': {0: 0, 1: 0, 2: 1, 3: 1, 4: 0, 5: 1, 6: 0, 7: 1, 8: 1, 9: 1, 10: 1, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0, 16: 0, 17: 1, 18: 1, 19: 1, 20: 1, 21: 0, 22: 0, 23: 0, 24: 0, 25: 1, 26: 0, 27: 1, 28: 0, 29: 0, 30: 0, 31: 1}, 'am': {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0, 16: 0, 17: 1, 18: 1, 19: 1, 20: 0, 21: 0, 22: 0, 23: 0, 24: 0, 25: 1, 26: 1, 27: 1, 28: 1, 29: 1, 30: 1, 31: 1}, 'gear': {0: 4, 1: 4, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3, 7: 4, 8: 4, 9: 4, 10: 4, 11: 3, 12: 3, 13: 3, 14: 3, 15: 3, 16: 3, 17: 4, 18: 4, 19: 4, 20: 3, 21: 3, 22: 3, 23: 3, 24: 3, 25: 4, 26: 5, 27: 5, 28: 5, 29: 5, 30: 5, 31: 4}, 'carb': {0: 4, 1: 4, 2: 1, 3: 1, 4: 2, 5: 1, 6: 4, 7: 2, 8: 2, 9: 4, 10: 4, 11: 3, 12: 3, 13: 3, 14: 4, 15: 4, 16: 4, 17: 1, 18: 2, 19: 1, 20: 1, 21: 2, 22: 2, 23: 4, 24: 2, 25: 1, 26: 2, 27: 2, 28: 4, 29: 6, 30: 8, 31: 2}}

The code looks like this.

```{r, warning = FALSE, message = FALSE}

library(ggplot2)  # needed for the unqualified aes(), geom_histogram(), labs(), theme()

ggplot2::ggplot(data = mtcars, aes(x = wt, y = after_stat(count))) + 
  geom_histogram(bins = 32, color = 'black', fill = '#ffe6b7') +
  labs(title = "Mtcars", subtitle = "Histogram") +
  theme(plot.title = element_text(face = "bold")) 

ggplot2::ggplot(data = mtcars, aes(x = mpg, y = after_stat(count))) + 
  geom_histogram(bins = 32, color = 'black', fill = '#ffe6b7') +
  labs(title = "Mtcars", subtitle = "Histogram") +
  theme(plot.title = element_text(face = "bold")) 

ggplot2::ggplot(data = mtcars, aes(x = disp, y = after_stat(count))) + 
  geom_histogram(bins = 32, color = 'black', fill = '#ffe6b7') +
  labs(title = "Mtcars", subtitle = "Histogram") +
  theme(plot.title = element_text(face = "bold")) 

```

And here is a screenshot of the output.

Now I'm trying to do the same with Python graphs, but I can't do exactly the same thing because the graphs start overlapping.

```{python}

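import matplotlib.pyplot as plt
import seaborn

# mtcars is assumed to be the data frame shown above, already loaded as a pandas DataFrame
seaborn.histplot(data=mtcars, x="wt", bins = 30)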
plt.title("wt histogram", loc = 'left')
plt.show()

seaborn.histplot(data=mtcars, x="mpg", bins = 30)
plt.title("mpg histogram", loc = 'left')
plt.show()

seaborn.histplot(data=mtcars, x="disp", bins = 30)
plt.title("disp histogram", loc = 'left')
plt.show()


```

So now I'm clearing out the plotting state before I create every single graph. The output now looks fine: I get a distinct histogram for each variable I'm calling.

```{python}

plt.figure().clear()
plt.close()
plt.cla()
plt.clf()

seaborn.histplot(data=mtcars, x="wt", bins = 30)
plt.title("wt histogram", loc = 'left')
plt.show()

plt.figure().clear()
plt.close()
plt.cla()
plt.clf()

seaborn.histplot(data=mtcars, x="mpg", bins = 30)
plt.title("mpg histogram", loc = 'left')
plt.show()

plt.figure().clear()
plt.close()
plt.cla()
plt.clf()

seaborn.histplot(data=mtcars, x="disp", bins = 30)
plt.title("disp histogram", loc = 'left')
plt.show()


```

The output is definitely better.

But isn't this method really redundant? What do people who use Python more regularly do to keep their graphs from interfering with each other? Do you all clear out the space every time in this way?

I'm voting to close this because the issue is not reproducible using seaborn 0.11.2 and matplotlib 3.5.1 [plot and code](https://i.stack.imgur.com/2cY31.png) — Trenton McKinney, Apr 29 '22 at 15:02
Hi @TrentonMcKinney I just added the data frame that I am using to create all these graphs! I believe this post is reproducible now. Thanks for pointing out that I did not include the table — hachiko, Apr 29 '22 at 15:10
The issue you are having is not reproducible even with your data. [Plot and code](https://i.stack.imgur.com/jR2z7.png). The plots are generated without issues. — Trenton McKinney, Apr 29 '22 at 15:14
Anyway, I would probably use this [code](https://i.stack.imgur.com/1unRn.png) instead. Add `ec='k'` to the plot call to include edge color on the bars. — Trenton McKinney, Apr 29 '22 at 15:19
Also note that `plt.figure()` creates a fully new, empty figure. So, calling `plt.figure().clear()` doesn't make much sense. `plt.close()` just closes the figure, a new one will be created automatically when a new plot command is encountered. Preferably, you just call `plt.figure()` and leave out the 4 other calls. Depending on the backend, `plt.show()` will also clear the plot (after showing it). — JohanC, Apr 29 '22 at 15:28
@TrentonMcKinney It doesn't look to me like plt.show(ec='k') did anything? — hachiko, Apr 29 '22 at 15:34
`mtcars[col].plot(kind='hist', bins=30, figsize=(6, 4), ec='k')` — Trenton McKinney, Apr 29 '22 at 15:34
@TrentonMcKinney unfortunately that for loop just puts the histograms next to each other in a crazy way. I'm sure I'm doing something wrong — hachiko, Apr 29 '22 at 15:36
[Code and plots](https://i.stack.imgur.com/MJx5x.png) in Jupyter — Trenton McKinney, Apr 29 '22 at 15:39
@JohanC thank you for clarifying the plt.figure and other items. I was looking at this SO post to understand when to use the different versions, but I'm wondering whether these functions still apply even though I'm using Seaborn and not Matplotlib, because the title of the post says Matplotlib https://stackoverflow.com/questions/8213522/when-to-use-cla-clf-or-close-for-clearing-a-plot-in-matplotlib — hachiko, Apr 29 '22 at 15:41
Seaborn is an api for matplotlib. If you're just copying these to powerpoint it would probably be easier to save the images. with `plt.savefig(f'hist_{col}.png')` before `plt.show()`. — Trenton McKinney, Apr 29 '22 at 15:43
My experience is really in R and I've used it offline with csv files saved locally, so comments like "depending on the backend" are really confusing to me — hachiko, Apr 29 '22 at 15:51
The "backend" is a library that is just to show the plot. It depends on how you are running matplotlib, and whether you are running stand-alone code or use it interactively. E.g. in a Jupyter notebook you can choose whether you just want plain plots inside the notebook, with or without interaction (no zooming, etc). Most things work the same; the moment something is displayed can be different. `plt.show()` usually shows the plot in an interactive window, and clears everything internally when that window is closed. — JohanC, Apr 29 '22 at 15:59
Ah ok, that makes sense. I'm actually using python in RStudio with library(reticulate) because I like the RStudio IDE and I think it's the same, code-wise, more or less — hachiko, Apr 29 '22 at 16:01
That's probably why you're seeing the weird issues. If you're going to use python I recommend using jupyter lab, which is already installed if you have conda (anaconda). — Trenton McKinney, Apr 29 '22 at 16:14
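
To make the suggestions in the comments concrete, here is a minimal sketch of the "one fresh figure per plot" approach, with the optional `savefig` step for the PowerPoint workflow. It is not taken from anyone's linked screenshot. The helper name `plot_histograms`, the `save` flag, the `Agg` backend choice, and the five-row stand-in for `mtcars` are all assumptions made for illustration. It uses plain matplotlib so it runs without seaborn; `seaborn.histplot(data=..., x=col, ax=ax)` should slot into the same place, since seaborn draws onto matplotlib axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Stand-in data: the first five rows of three columns from the question's table.
mtcars = {
    "wt": [2.62, 2.875, 2.32, 3.215, 3.44],
    "mpg": [21.0, 21.0, 22.8, 21.4, 18.7],
    "disp": [160.0, 160.0, 108.0, 258.0, 360.0],
}


def plot_histograms(data, columns, bins=30, save=False):
    """Draw one histogram per column, each on its own new figure (hypothetical helper)."""
    figures = []
    for col in columns:
        # plt.subplots() always creates a new figure, so nothing is drawn
        # on top of a previous plot and no clearing calls are needed.
        fig, ax = plt.subplots(figsize=(6, 4))
        ax.hist(data[col], bins=bins, edgecolor="k")
        ax.set_title(f"{col} histogram", loc="left")
        if save:
            # Writes hist_<col>.png to the working directory.
            fig.savefig(f"hist_{col}.png")
        figures.append(fig)
    return figures


figs = plot_histograms(mtcars, ["wt", "mpg", "disp"])

# Record what was drawn, then release the figures explicitly.
n_axes = [len(fig.axes) for fig in figs]
titles = [fig.axes[0].get_title(loc="left") for fig in figs]
for fig in figs:
    plt.close(fig)
```

In an interactive session or a notebook chunk you would likely call `plt.show()` after each figure instead of closing it yourself. Because every plot gets its own `fig` and `ax`, the repeated `plt.figure().clear()`, `plt.cla()`, and `plt.clf()` calls from the question should not be necessary.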
