I've made a plot of variables lsiete and lcinco with the following code:

qplot(lsiete, lcinco, data=enc, color=LENGTHE)

However, I also want to colour the scatter plot by each of the following factor variables, so that I can see all of the resulting plots at the same time:

> names(enc[,ind])
 [1] "SEX"      "RACE"     "MSTATUS"  "EDUC"     "POSITION" "SATSCHED" "TYPESCH"  "FLEX"     "URBRUR"   "HOURS"   
[11] "SCHOOL"   "ANJOB"    "TYPERES"  "LENGTHE"  "HOWLONG"  "REASONQ"  "REASONW"  "WHY" 

So I want a panel containing one scatter plot for each of these factor variables.

How can I write the code to do that?

EDIT: To be clearer, lsiete and lcinco don't change; only the variable mapped to colour changes.

EDIT 2: To give a reproducible example, I created the following data frame with random data:

sn <- data.frame(
  a = rnorm(100),
  b = rnorm(100),
  cat1 = sample(c('male', 'female'), 100, replace = TRUE),
  cat2 = sample(c('U', 'AL'), 100, replace = TRUE),
  cat3 = sample(c('AR', 'ML'), 100, replace = TRUE),
  cat4 = sample(c('LM', 'KR'), 100, replace = TRUE)
)

I can create a qplot with a and b, and give colour according to cat2:

qplot(a,b,data=sn,color=cat2)

But I want to keep a and b fixed and colour by each of the remaining categorical variables in turn, to get a panel of all the possible scatter plots.


CreamStat
  • We need to prepare the data for ggplot by converting it from wide to long. See [this post for more info](http://stackoverflow.com/questions/1181060), then plot. Also, it would be nice to give us some [toy example data](http://stackoverflow.com/questions/5963269) for testing. – zx8754 Sep 05 '16 at 06:10

1 Answer

The easiest way is to reshape your data and use facetting. This will create a single plot with four panels and a shared legend.

sn2 <- tidyr::gather(sn, 'cat', 'col', cat1:cat4)

ggplot(sn2, aes(a, b, col = col)) + geom_point() + facet_wrap(~cat)
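
On current tidyr releases `gather()` is superseded by `pivot_longer()`, so the same reshape can also be written as below. This is only a sketch: it rebuilds the `sn` example from the question, and `set.seed(1)` and the name `sn_long` are my own additions for illustration.

```r
library(ggplot2)

# Example data from the question; the seed is an addition for reproducibility
set.seed(1)
sn <- data.frame(
  a = rnorm(100),
  b = rnorm(100),
  cat1 = sample(c("male", "female"), 100, replace = TRUE),
  cat2 = sample(c("U", "AL"), 100, replace = TRUE),
  cat3 = sample(c("AR", "ML"), 100, replace = TRUE),
  cat4 = sample(c("LM", "KR"), 100, replace = TRUE)
)

# Wide to long: one row per (observation, categorical variable) pair
sn_long <- tidyr::pivot_longer(
  sn,
  cols = cat1:cat4,
  names_to = "cat",   # which categorical variable the row belongs to
  values_to = "col"   # the level used for colouring
)

# One panel per categorical variable, with a single shared legend
ggplot(sn_long, aes(a, b, colour = col)) +
  geom_point() +
  facet_wrap(~cat)
```

Because all levels end up in one `col` column, the shared legend lists the levels of every variable together, which may get crowded with many factors.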

Alternatively, if you would prefer separate legends you'll need to create four plots and stitch them together, like so:

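# Scatter plot of a vs b, coloured by the column passed as a one-sided formula
plot_fun <- function(cat) {
  ggplot(sn, aes_(~a, ~b, col = cat)) + geom_point()
}

# One plot per categorical column, arranged in a grid (each keeps its own legend)
plot_list <- lapply(c(~cat1, ~cat2, ~cat3, ~cat4), plot_fun)
cowplot::plot_grid(plotlist = plot_list, align = 'hv')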

Axeman
  • Your second code block is nice, but I'm having trouble with it because my real data has a lot of categorical variables and I don't know how to pass the strings I created as the parameter. For example, I created these names: [1] "~SEX" "~RACE" "~MSTATUS" "~EDUC" "~POSITION" "~SATSCHED" "~TYPESCH" "~FLEX" "~URBRUR" [10] "~HOURS" "~SCHOOL" "~ANJOB" "~TYPERES" "~LENGTHE" "~HOWLONG" "~REASONQ" "~REASONW" "~WHY", but these character strings are not recognized as categorical variables by ggplot. How can I fix that? – CreamStat Sep 05 '16 at 08:35
  • I can't really tell, since my code works on your example data. Are those variables factors or character vectors in your data.frame? Or are they numeric? What output do you get? – Axeman Sep 05 '16 at 08:37
  • They are factor variables. The problem is in lapply(c(~cat1, ~cat2, ~cat3, ~cat4), plot_fun): I can't manually write out more than 100 factor variables like that. – CreamStat Sep 05 '16 at 08:41
  • Replace `aes_(~a, ~b, col = cat)` with `aes_string('a', 'b', col = cat)` and use `names(data)` to get your names. – Axeman Sep 05 '16 at 08:47
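
Putting the last comment together, a minimal sketch of looping over column names instead of hand-written formulas might look like this. Note the swap: `aes_()` and `aes_string()` are deprecated as of ggplot2 3.0.0, so the sketch uses the `.data[[v]]` pronoun inside `aes()` rather than the `aes_string()` call suggested above. The helper name `cat_vars`, the `set.seed(1)` call, and the rule "every factor or character column is a colour variable" are assumptions for illustration, not part of the original answer.

```r
library(ggplot2)

# Example data from the question; the seed is an addition for reproducibility
set.seed(1)
sn <- data.frame(
  a = rnorm(100),
  b = rnorm(100),
  cat1 = sample(c("male", "female"), 100, replace = TRUE),
  cat2 = sample(c("U", "AL"), 100, replace = TRUE),
  cat3 = sample(c("AR", "ML"), 100, replace = TRUE),
  cat4 = sample(c("LM", "KR"), 100, replace = TRUE)
)

# Assumption: every factor or character column should be used for colouring
is_categorical <- sapply(sn, function(x) is.factor(x) || is.character(x))
cat_vars <- names(sn)[is_categorical]

# Scatter plot of a vs b, coloured by the column whose name is in `v`
plot_fun <- function(v) {
  ggplot(sn, aes(a, b, colour = .data[[v]])) +
    geom_point() +
    labs(colour = v)  # label the legend with the column name
}

# One plot per categorical column, each with its own legend
plot_list <- lapply(cat_vars, plot_fun)
names(plot_list) <- cat_vars

cowplot::plot_grid(plotlist = plot_list, align = "hv")
```

For the real data, the same idea should carry over by replacing `sn` with `enc`, `a` and `b` with `lsiete` and `lcinco`, and `cat_vars` with `names(enc[, ind])`; with more than 100 variables the grid will probably need to be split across several pages.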