
I'm taking this data vis class in which the professor has us basically copying and pasting code instead of teaching us anything. I'm trying to figure out how to create a scatter plot which illustrates the strike rate and civilian casualties of drone warfare.
The problem I'm having is how to use a variable from the data to set the color of each data point. At a minimum, I want to use "status" (dead = 2, alive = 1) to color the points. Ideally I'd also like to color the points by the drone target's nationality, since I have data for that. This is what I have so far. It draws the points, but not the colors. How do I get the colors?

symbols(killVStarget$name, killVStarget$strikes, 
        circles=sqrt(killVStarget$casualties), 
        col=ifelse(killVStarget$status==2, "red", "black"), cex=0.15)

I imported the data from a .csv file. Here are the first 10 entries copied from Excel:

name                   nationality  status  strikes  casualties
baitullah mehsud       pakistani    2       7        164
qari hussain           pakistani    2       6        128
abu ubaidah al masri   pakistani    2       3        120
mullah sangeen zadran  pakistani    2       3        108
ayman al-zawahiri      pakistani    1       2        105
sirajudin haqqani      pakistani    1       5        82
hakimullah mehsud      pakistani    2       5        68
sadiq noor             pakistani    2       4        57
said al-shihri         yemeni       2       4        57
resueman
Smeaux

1 Answer

# Rebuild the sample rows from the question, with nationality in its own column
df <- data.frame(
  name = c("baitullah mehsud", "qari hussain", "abu ubaidah al masri",
           "mullah sangeen zadran", "ayman al-zawahiri", "sirajudin haqqani",
           "hakimullah mehsud", "sadiq noor", "said al-shihri"),
  nationality = c(rep("pakistani", 8), "yemeni"),
  status = c(2, 2, 2, 2, 1, 1, 2, 2, 2),  # 2 = dead, 1 = alive
  strikes = c(7, 6, 3, 3, 2, 5, 5, 4, 4),
  casualties = c(164, 128, 120, 108, 105, 82, 68, 57, 57)
)

library(ggplot2)

# Names on the x-axis: status sets the color, casualties the point size
ggplot(df, aes(x = name, y = strikes, size = casualties, color = factor(status))) + geom_point()

# Same mapping with the axes swapped, so the names no longer overlap
ggplot(df, aes(x = strikes, y = name, size = casualties, color = factor(status))) + geom_point()

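The question also asks for red/black status colors and, ideally, coloring by nationality. Here is a rough sketch of both, assuming the nine sample rows from the question with `nationality` as its own column. The plot names `p_status` and `p_nationality` and the "alive"/"dead" labels are mine, not from the original post.

```r
library(ggplot2)

# Sample rows from the question (assumed layout: one column per field)
df <- data.frame(
  name = c("baitullah mehsud", "qari hussain", "abu ubaidah al masri",
           "mullah sangeen zadran", "ayman al-zawahiri", "sirajudin haqqani",
           "hakimullah mehsud", "sadiq noor", "said al-shihri"),
  nationality = c(rep("pakistani", 8), "yemeni"),
  status = c(2, 2, 2, 2, 1, 1, 2, 2, 2),
  strikes = c(7, 6, 3, 3, 2, 5, 5, 4, 4),
  casualties = c(164, 128, 120, 108, 105, 82, 68, 57, 57)
)

# Label the status codes so the legend reads alive/dead instead of 1/2
df$status <- factor(df$status, levels = c(1, 2), labels = c("alive", "dead"))

# Status picks the color; the manual scale pins it to black/red
p_status <- ggplot(df, aes(x = strikes, y = name, size = casualties, color = status)) +
  geom_point() +
  scale_color_manual(values = c(alive = "black", dead = "red"))

# Nationality picks the color; ggplot2 chooses one color per nationality
p_nationality <- ggplot(df, aes(x = strikes, y = name, size = casualties, color = nationality)) +
  geom_point()

print(p_status)
print(p_nationality)
```

With the real data, swapping `df` for `killVStarget` should work as long as the column names match the CSV header.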

Mateusz1981
  • I would swap `name` and `strikes` on the x- and y-axes. This prevents overlapping labels on the x-axis without having to rotate them (which makes them hard to read). – Paul Hiemstra Oct 12 '15 at 13:49
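For anyone who wants to stay with base graphics: `symbols()` takes its colors from `fg` (outline) and `bg` (fill) rather than `col`, and its circle size is scaled by `inches` rather than `cex`, which is probably why the code in the question drew uncolored points. Below is a minimal sketch under those assumptions. It uses the sample rows from the question and puts the row position on the x-axis, since `symbols()` needs numeric coordinates; `point_col` is just a helper name of mine.

```r
# Sample rows from the question (only the columns needed here)
df <- data.frame(
  status = c(2, 2, 2, 2, 1, 1, 2, 2, 2),  # 2 = dead, 1 = alive
  strikes = c(7, 6, 3, 3, 2, 5, 5, 4, 4),
  casualties = c(164, 128, 120, 108, 105, 82, 68, 57, 57)
)

# One color per row, derived from the status code
point_col <- ifelse(df$status == 2, "red", "black")

# fg colors the circle outlines; inches caps the largest circle's radius
symbols(seq_len(nrow(df)), df$strikes,
        circles = sqrt(df$casualties),
        inches = 0.15,
        fg = point_col,
        xlab = "target (row order)", ylab = "strikes")

legend("topright", legend = c("dead (2)", "alive (1)"),
       col = c("red", "black"), pch = 1)
```

Passing the same vector as `bg = point_col` would fill the circles instead of only outlining them.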