This is an excerpt of one of my data tables:

YR    POPSTAT    Freq
2001       0       34
2002       0       45
2003       0       32
2015       0       16
2001       1        7
2002       1       11
2003       1        8
2014       1        7
2015       1        3

I want to plot a histogram of frequencies per year (YR) for all rows where POPSTAT > 0. Obviously, neither of these does the job:

for (POPSTAT>0) barplot(table.popstat$Freq)
plot(table.popstat$Freq~table.popstat$YR)

I also need the years to be on the x-axis. Any help would be appreciated.

Dag
  • Is hist(table.popstat$Freq[table.popstat$POPSTAT > 0]) what you want? – Richard Telford Apr 16 '16 at 10:56
  • Error in hist.default(table.popstat$Freq[table.popstat$POPSTAT > 0]) : invalid number of 'breaks' In addition: Warning message: In Ops.factor(table.popstat$POPSTAT, 0) : ‘>’ not meaningful for factors – Dag Apr 16 '16 at 11:11
  • So POPSTAT is factor. Try hist(table.popstat$Freq[table.popstat$POPSTAT != 0]) – Richard Telford Apr 16 '16 at 11:14
  • I converted the POPSTAT value to Factor, and I got a histogram, but I didn't get YEAR (YR) as the x-axis... – Dag Apr 16 '16 at 11:17

3 Answers

If you want to split the data by year, you will need to use either a loop or ggplot2's faceting:

library(ggplot2)

ggplot(subset(table.popstat, POPSTAT != 0), aes(x = Freq)) + geom_histogram() + facet_wrap(~YR)

You might also consider using a boxplot, with YR converted to a factor so that each year gets its own box:

ggplot(subset(table.popstat, POPSTAT != 0), aes(x = factor(YR), y = Freq)) + geom_boxplot()
Richard Telford
There are two ways to do this, but first you will need to get the subset of the data that meets your criterion (POPSTAT > 0).

Getting the data in R:

# Recreate the data frame from its dput() output
plotdata <- structure(list(YR = c(2001L, 2002L, 2003L, 2015L, 2001L, 2002L, 
2003L, 2014L, 2015L), POPSTAT = c(0L, 0L, 0L, 0L, 1L, 1L, 1L, 
1L, 1L), Freq = c(34L, 45L, 32L, 16L, 7L, 11L, 8L, 7L, 3L)), .Names = c("YR", 
"POPSTAT", "Freq"), class = "data.frame", row.names = c(NA, -9L
))

Getting the subset:

plt_df <- subset(plotdata, POPSTAT > 0, select = c(1, 3))  # you only want the YR and Freq columns

Plotting the graph:

Base R

bplot <- barplot(plt_df$Freq, ylim = c(0, 12), axes = FALSE)  # returns the bar midpoints
axis(1, at = bplot, labels = plt_df$YR)                       # label each bar with its year
axis(2, at = seq(0, 12, 3))                                   # y-axis ticks every 3 units
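
One thing to keep in mind with barplot() is that it treats the years as plain labels, so 2003 and 2014 end up next to each other. If the gap between years matters, a possible alternative (a sketch only, not part of the approach above) is plot() with type = "h", which keeps YR on a true numeric axis. The names yr_df and pos are made up for this illustration:

```r
# Hypothetical copy of the question's data (yr_df is an illustrative name)
yr_df <- data.frame(
  YR      = c(2001L, 2002L, 2003L, 2015L, 2001L, 2002L, 2003L, 2014L, 2015L),
  POPSTAT = c(0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L),
  Freq    = c(34L, 45L, 32L, 16L, 7L, 11L, 8L, 7L, 3L)
)

# Keep only the rows with POPSTAT > 0
pos <- subset(yr_df, POPSTAT > 0)

# type = "h" draws one vertical line per observation; a thick line with a
# flat end looks like a bar, and the x-axis stays numeric (years to scale)
with(pos, plot(YR, Freq, type = "h", lwd = 8, lend = "butt",
               ylim = c(0, max(Freq)), xlab = "YR", ylab = "Freq"))
```

Here the missing years 2004 to 2013 show up as empty space on the x-axis instead of being silently skipped.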

ggplot package

# install.packages('ggplot2')  # run once if ggplot2 is not installed yet
library(ggplot2)
ggplot(plt_df, aes(x=YR,y=Freq)) + geom_bar(stat='identity')

Hope it helps.

Abdou
This takes two steps, filtering and plotting, which can be combined in a single line:

with(subset(df, POPSTAT > 0), barplot(Freq, names.arg=YR))

If you prefer ggplot2:

library(ggplot2)
ggplot(subset(df, POPSTAT > 0)) + aes(x=YR, y=Freq) + geom_bar(stat='identity')

Here is a dput of your data, so that your example is reproducible.

df <- structure(list(YR = c(2001L, 2002L, 2003L, 2015L, 2001L, 2002L, 
      2003L, 2014L, 2015L), POPSTAT = c(0L, 0L, 0L, 0L, 1L, 1L, 1L, 
      1L, 1L), Freq = c(34L, 45L, 32L, 16L, 7L, 11L, 8L, 7L, 3L)), 
     .Names = c("YR", "POPSTAT", "Freq"),
     class = "data.frame", row.names = c(NA, -9L))
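
The comparison POPSTAT > 0 assumes a numeric column. The error quoted in the comments on the question suggests that POPSTAT is stored as a factor in the original table.popstat, in which case `>` is not meaningful. A minimal sketch of one way around this, assuming the factor labels are the strings "0" and "1" (popstat_tbl and pos are hypothetical names used only here):

```r
# Hypothetical stand-in for table.popstat, with POPSTAT stored as a factor
popstat_tbl <- data.frame(
  YR      = c(2001L, 2002L, 2003L, 2015L, 2001L, 2002L, 2003L, 2014L, 2015L),
  POPSTAT = factor(c(0, 0, 0, 0, 1, 1, 1, 1, 1)),
  Freq    = c(34L, 45L, 32L, 16L, 7L, 11L, 8L, 7L, 3L)
)

# Go through character first: as.numeric() on a factor directly would
# return the internal level codes (1, 2) rather than the labels (0, 1)
popstat_tbl$POPSTAT <- as.numeric(as.character(popstat_tbl$POPSTAT))

# The numeric comparison now works as in the one-liner above
pos <- subset(popstat_tbl, POPSTAT > 0)
barplot(pos$Freq, names.arg = pos$YR, xlab = "YR", ylab = "Freq")
```

Comparing with `!=` or `==` against the label, as suggested in the comments, also works on a factor without any conversion.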
Vincent Bonhomme
  • Thank you all. Ended up using: sub1 = subset(table.popstat, POPSTAT == 1, select = c(1, 3)); ggplot(sub1, aes(x = YR, y = Freq)) + geom_bar(stat = 'identity') – Dag Apr 16 '16 at 18:46