It's important to note that boxplot and plot are generic functions that behave differently depending on what is passed to them. In this case, because you specify a factor as the x variable in the plot call, it really comes down to comparing
boxplot(rad, crim, log='y')
boxplot(crim ~ as.factor(rad),log='y')
So you are passing either two separate parameters in the first case or a formula in the second case. These behave very differently. If you don't use a formula, you just get a box plot for each variable you pass in. You can see what happens if you add other column names
boxplot(rad, crim, zn, dis, log='y')
There you can see that you just get a separate box plot for each of the variables you pass in. The "1" is the distribution of the rad variable for all observations, the "2" is crim, and so on.
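As a quick check of the non-formula behaviour, here is a minimal sketch. It assumes the Boston data set comes from the MASS package, and it uses plot = FALSE so that boxplot returns the computed statistics instead of drawing anything.

```r
# Assumption: Boston is the data set shipped with the MASS package.
data(Boston, package = "MASS")

# plot = FALSE returns the box plot statistics without opening a graphics device.
b <- with(Boston, boxplot(rad, crim, zn, dis, plot = FALSE))

ncol(b$stats)  # one column of box statistics per vector passed in
b$names        # the automatic labels used on the x axis
```

Each column of b$stats describes one whole variable, not a subgroup of the data.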
When you call
boxplot(crim ~ as.factor(rad),log='y')
you are getting a box plot of crim for each unique value of rad. It's not really possible to add other variables when using the formula syntax.
See the ?boxplot help page for more details.
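To see the grouping done by the formula form, the following sketch (again assuming Boston is loaded from MASS, and using plot = FALSE to skip the drawing) compares the number of boxes with the number of distinct rad values.

```r
# Assumption: Boston is the data set shipped with the MASS package.
data(Boston, package = "MASS")

# Formula interface: crim is split into groups by the levels of rad.
g <- boxplot(crim ~ as.factor(rad), data = Boston, plot = FALSE)

ncol(g$stats)               # number of boxes
length(unique(Boston$rad))  # number of distinct rad values; should match
g$names                     # the rad levels used as box labels
```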
Also, I should mention that it's usually a bad idea to use attach(). It would be better to use the data= parameter for functions that support it and with() for functions that do not. For example
with(Boston, boxplot(crim, rad, log="y"))
boxplot(crim~rad, log="y", data=Boston)
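As a rough illustration that nothing needs to be attached, this sketch (assuming Boston comes from the MASS package) computes the same grouped statistics once through data= and once through with(), then compares them.

```r
# Assumption: Boston is the data set shipped with the MASS package.
data(Boston, package = "MASS")

# Same grouped box plot statistics, obtained two ways, with no attach().
via_data <- boxplot(crim ~ rad, data = Boston, plot = FALSE)
via_with <- with(Boston, boxplot(crim ~ rad, plot = FALSE))

identical(via_data$stats, via_with$stats)  # both routes see the same columns
```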