
I'm trying to plot education against vocabulary using the data in Vocabulary.txt.

This is the code I used:

plot(jitter(education)~jitter(vocabulary),pch=23,xlim=c(0,30),ylim=c(0,30))

My plot looks like this: [image]

It doesn't look right. Could someone explain what I did wrong, and also what exactly the jitter command does?

YakovL
J.Goodcat

1 Answer


I think the two plots below are close to publication-ready.

The first uses base R and jitter, which adds a small amount of random noise to the data so that points with identical coordinates are drawn at slightly different positions. That is a good approach in this case, provided you mention the jittering, since the plotted values are slightly modified. If you have many points, you can combine it with some transparency.
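To see what jitter does on its own, here is a minimal sketch on a made-up vector (the values are illustrative, not taken from Vocabulary.txt). With the default arguments, the noise is uniform and bounded by one fifth of the smallest gap between distinct values, so for integer-valued data each point moves by at most 0.2:

```r
set.seed(1)                       # make the random noise reproducible

# made-up integer scores with ties; plotted as-is, tied points would overlap
x <- c(10, 10, 10, 11, 11, 12)

xj <- jitter(x)                   # default: uniform noise within +/- (smallest gap)/5
print(xj)

# largest displacement from the original values (at most 0.2 here)
print(max(abs(xj - x)))
```

The ties are broken, but every value stays close to its original, which is why the overall pattern of the plot is preserved.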

First of all, we make the example reproducible:

df <- read.table("http://socserv.socsci.mcmaster.ca/jfox/Books/Applied-Regression-3E/datasets/Vocabulary.txt", header=TRUE)
plot(jitter(education)~jitter(vocabulary), df, pch=20, col="#00000011",
     xlim=range(vocabulary), ylim=range(education),
     xlab="vocabulary", ylab="education")


But fundamentally, you may be trying to plot a contingency table, so the second uses ggplot2:

library(ggplot2)
# creates a contingency table (one row per education/vocabulary pair, with its count in Freq)
tab.df <- as.data.frame(with(df, table(education, vocabulary)))
ggplot(tab.df) + aes(x=vocabulary, y=education, fill=Freq, label=Freq) +
  # colored tiles and labels (0s are omitted)
  geom_tile() + geom_text(data=subset(tab.df, subset = Freq != 0), size=2) +
  # cosmetics
  scale_fill_gradient(low="white", high="red") + theme_linedraw()


Plotting percentages (for both tiles and labels) may be a better choice, but your question was vague about your objectives. And if you want the first plot, but à la ggplot2, you can still get it:

ggplot(df) + aes(x=vocabulary, y=education) + geom_jitter(alpha=0.05)

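As for the percentage variant of the tile plot mentioned above, a rough sketch could look like the following. It assumes "percentage" means the share of all observations (row or column percentages would need a different denominator), and it uses a small made-up data frame, toy, as a stand-in for the real data so that it runs without downloading anything. The Pct column name is likewise just an illustrative choice.

```r
library(ggplot2)

# made-up stand-in for the Vocabulary data (NOT the real values)
toy <- data.frame(education  = c(12, 12, 12, 16, 16, 10, 10, 12),
                  vocabulary = c(5, 5, 6, 8, 8, 4, 5, 6))

# contingency table, then each cell as a percentage of all observations
tab.df <- as.data.frame(with(toy, table(education, vocabulary)))
tab.df$Pct <- 100 * tab.df$Freq / sum(tab.df$Freq)

p <- ggplot(tab.df) +
  aes(x = vocabulary, y = education, fill = Pct, label = sprintf("%.1f", Pct)) +
  # tiles colored by percentage; labels only for non-empty cells
  geom_tile() +
  geom_text(data = subset(tab.df, Freq != 0), size = 2) +
  scale_fill_gradient(low = "white", high = "red") +
  theme_linedraw()
print(p)
```

With the real data, replacing toy by df should be enough.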

Vincent Bonhomme