This is a peek into a large dataset named P, in which 10 concessionaries (CS) each have different shops (SHP) with several numeric values. The dataset is ordered by week (WK), from week 2 through 52, so the file is large. Here are just the first 6 rows:

WK,MND,CS,SHP,RevCY,RevLY,TCY,TLY,ACY,ALY
=========================================
2,JAN,AAA,AAA Shop 1,16834,16686,1837,1983,2853,3002
2,JAN,AAA,AAA Shop 2,95919,114696,9742,11813,20521,24673
2,JAN,BBB, BBB shop 1,93428,92212,7647,7857,18436,17984
2,JAN,BBB, BBB Shop 2,30600,35831,2748,3063,5579,6408
2,JAN,CCC, CCC Shop 1, 65229,78761,6074,7172,13852,16706
2,JAN,CCC, CCC Shop 2,465,754,73,118,92,162

I am having difficulty plotting just the values for one concessionary, for instance CS == "AAA".

p <- ggplot(P, aes(WK, RevCY)) + geom_bar(stat="identity")

This plots all shops and all CS. So the underlying question is how to plot only the shops (SHP) from CS == "AAA", with the weeks (WK) on the x-axis and RevCY on the y-axis, using the ggplot() + geom_bar(stat="identity") code.

Is this the right direction?

p <- ggplot(P[P$CS=="AAA"], aes(WK, RevCY)) + geom_bar(stat="identity")

Ideally I would do this directly inside the ggplot() call, without first creating all kinds of subsets. I hope my question is clear.

ad_s

2 Answers

Does this help you?

ggplot(t, aes(WK, RevCY)) + geom_bar(data=subset(t,CS=="AAA"),stat="identity")
Florin

To extract certain rows from your data frame t, you have to use

t[t$CS == "AAA", ]

instead of t[t$CS == "AAA"]. The latter syntax is used to select columns.
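As a minimal sketch of that difference, assume a small toy data frame `df` (a hypothetical stand-in; the first three rows borrow values from the question and the week-3 row is made up):

```r
# Hypothetical toy data frame, loosely modelled on the rows in the question
df <- data.frame(
  WK    = c(2, 2, 2, 3),
  CS    = c("AAA", "AAA", "BBB", "AAA"),
  RevCY = c(16834, 95919, 93428, 17500)  # the last value is invented
)

# With a comma, the logical vector selects rows
rows_aaa <- df[df$CS == "AAA", ]
nrow(rows_aaa)   # 3

# Without a comma, a logical vector is matched against the columns instead
cols <- df[c(TRUE, FALSE, TRUE)]
names(cols)      # "WK" "RevCY"
```

Note that a row condition used without the comma will usually not even have the right length for the columns, which is why it fails or returns something unexpected.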

The plot command:

p <- ggplot(t[t$CS == "AAA", ], aes(WK, RevCY)) + geom_bar(stat = "identity")

I suppose you want to add some arguments to produce multiple bars per WK instead of a single stacked bar:

p <- ggplot(t[t$CS == "AAA", ], aes(as.factor(WK), RevCY)) + 
       geom_bar(stat = "identity", aes(group = RevCY), position = "dodge")
Sven Hohenstein
  • Thanks. But what if I want to select 2 or more CS? The following won't work: >ggplot(t[t$CS == "AAA"& "BBB", ], aes(WK, RevCY)) + geom_bar(stat = "identity", aes(group = RevCY), position = "dodge") – ad_s Jan 23 '14 at 14:59
  • @user3214508 If you want to compare a variable with multiple values, you have to use `%in%`. For your example: `t$CS %in% c("AAA", "BBB")`. – Sven Hohenstein Jan 23 '14 at 15:29
  • If I type: >ggplot(t[t$CS %in% C("AAA","BBB"), ], aes(WK, RevCY)) + geom_bar(stat="identity", aes(group=RevCY), position="dodge") I get an error: "Error in C("AAA", "BBB") : object not interpretable as a factor". – ad_s Jan 26 '14 at 11:09
  • @user3214508 Why did you use a capital `C` when my code includes a lowercase `c`? – Sven Hohenstein Jan 26 '14 at 14:21
  • I meant c. Typing >ggplot(t[t$CS %in% c("AAA","BBB"), ], aes(WK, RevCY)) + geom_bar(stat="identity", aes(group=RevCY), position="dodge") gives error: Error: unexpected '>' in ">" – ad_s Jan 26 '14 at 14:42
  • @user3214508 Try `ggplot(t[t$CS %in% c("AAA","BBB"), ], aes(WK, RevCY)) + geom_bar(stat="identity", aes(group=RevCY), position="dodge")`. – Sven Hohenstein Jan 26 '14 at 15:07
  • @sven, your solution is using base-r, right? Do you think it's faster than subset() when used within a ggplot? – Dan Dec 15 '17 at 03:26
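
To illustrate the `%in%` suggestion from the comments, here is a minimal sketch using a made-up character vector `cs` (an assumption for illustration, not the asker's data). It shows that `%in%` yields one TRUE/FALSE per element, which is the shape needed for row indexing:

```r
# Hypothetical vector standing in for the CS column
cs <- c("AAA", "BBB", "CCC", "AAA")

# %in% tests each element of cs against the set of allowed values
keep <- cs %in% c("AAA", "BBB")
keep                       # TRUE TRUE FALSE TRUE

# Equivalent but longer: two complete comparisons joined with |
keep_or <- cs == "AAA" | cs == "BBB"
identical(keep, keep_or)   # TRUE
```

The attempt `t$CS == "AAA" & "BBB"` does not work because `&` expects logical operands on both sides, and `"BBB"` on its own is a character string rather than a comparison.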