I'm completely new to R and I've been trying to cut a dataset down to the fields that interest me and then draw a barplot() of the result.

The thing is, I've run into errors and I cannot continue.

This is my code:

data = infert;                      # Get a local copy of infert so we can edit stuff.

data <- data[data$case == 0, ];     # Split the data to those that interest us,
                                    # in this case, rows with column 'case' == 0.

data.freq = table(data);            # Plot the graph.
barplot(data.freq); 

The error I get when I source+run the script:

Error in barplot.default(data.freq) : 
  'height' must be a vector or a matrix

Which I guess is because the data matrix comes out X*1 instead of X*N? Or it misses a dimension somewhere else, due to me doing data[data$case == 0, ]?

In any case, how can I get around that and plot a frequency graph of the infert data where infert$case == 0?

Additionally, is there a simple way to plot the relative frequency graph?

Filburt
Dimitris Sfounis
  • Help us help you. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example on how to give us a minimal reproducible example. This is not as much for us, but for you to start thinking about the problem in a different way. I solve most of my problems by making a small example. – Roman Luštrik Jan 12 '14 at 13:44
  • @RomanLuštrik infert is a dataset included in R, it's not something I built. Just type `infert` and you'll see it's a 248x8 data frame. I believe my example is elaborate enough. – Dimitris Sfounis Jan 12 '14 at 13:55
  • Noted, thank you. The problem is that you're trying to plot 8-dimensional data (`data.freq`). – Roman Luštrik Jan 12 '14 at 15:46
  • Vre Dimitri, what, exactly, are you trying to visualize? I'm a bit confused seeing your `data.freq`. Also, seeing your comment in the accepted answer, note that `data$education` is a 'factor' (i.e. categorical variable); you might want to use `...as.numeric(data$education)..` (either in `barplot` or `hist`). – alexis_laz Jan 12 '14 at 20:28

1 Answer

The problem is that table(data) is called on a data.frame with eight columns, which yields an eight-dimensional table, while barplot expects a vector or a matrix.
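To see why the dimensionality matters, here is a small sketch using the built-in infert data. It assumes nothing beyond base R; the names `sub`, `multi`, and `single` are only illustrative. `table` returns one dimension per column it is given, so only the single-column case produces something `barplot` can draw.

```r
# infert ships with base R (package datasets)
sub <- infert[infert$case == 0, ]

# Three columns in, three dimensions out: barplot() cannot draw this
multi <- table(sub[, c("education", "parity", "induced")])
length(dim(multi))

# One column in, one dimension out: this is what barplot() expects
single <- table(sub$education)
length(dim(single))
barplot(single)
```

Passing `multi` to `barplot` should raise the same "'height' must be a vector or a matrix" error as in the question.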

For a numeric column, the usual function is hist; with freq=FALSE it plots densities, so the bar areas sum to one, for instance,

hist(subset(infert, case==0)$age, freq=FALSE)

for the relative frequencies of the age column. For categorical data, you were on the right track with barplot and table:

dat <- infert[infert$case==0, "education"]
barplot(table(dat)/length(dat))
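As a possible variation, and only a sketch, the absolute and relative frequencies of education for the rows with case == 0 can be drawn side by side with base R's `prop.table`, which divides a table by its total. The names `edu`, `counts`, and `rel` are made up for this example.

```r
# Controls only (case == 0), education column; infert is built into R
edu <- infert$education[infert$case == 0]

counts <- table(edu)          # absolute frequencies per education level
rel    <- prop.table(counts)  # the same counts divided by their sum

barplot(counts, ylab = "Frequency")
barplot(rel, ylab = "Relative frequency")
```

Since education has no missing values in infert, `prop.table(counts)` gives the same numbers as `table(dat)/length(dat)` above.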
Karsten W.
  • Thank you for the hint on relative frequency. It also looks like I went about my problem incorrectly. What do you reckon would be the correct way to plot a frequency graph of `infert$education`, but only for the rows where `case==0`? – Dimitris Sfounis Jan 12 '14 at 19:47