I have a data frame called stats with two columns Gender and Transportation.used as shown below:

Gender    Transportation.used
Male      Bus
Male      Car
Female    Car
Male      Car
Male      Motorcycle
Female    Bus

and the list goes on (or view it here: https://i.stack.imgur.com/GROIi.jpg)

data_stats <- read.table(text="Gender   Transportation.used
Male    Bus
Male    Car
Female  Car
Male    Car
Male    Motorcycle
Female  Bus
Female  Bus
Female  Bus
Female  Bus
Male    Car
Female  Car
",header=T)

What I want to do is calculate the frequency of each gender for one selected transport. I will need the data later on to plot a percentage bar graph. The desired output is below:

          Male    Female
   Bus    1        4

So how do I calculate this to get the data? I'm still a beginner in R, so any help is appreciated. Thanks in advance!

yfyang
Annie Tan

2 Answers

Try, for frequencies,

table(stats)

or, for relative frequencies,

prop.table(table(stats))

or, even better, a formula-based cross-tabulation (e.g.),

xtabs(~ Gender + Transportation.used, data = stats)
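As a minimal sketch of the xtabs() approach applied to the data from the question (the data frame is rebuilt here under the name `stats`, and `counts` and `bus` are illustrative names), the one-sided formula lists both columns, and a single transport is then picked by row name:

```r
# Rebuild the question's data; column names are taken from the question.
stats <- read.table(text = "Gender Transportation.used
Male Bus
Male Car
Female Car
Male Car
Male Motorcycle
Female Bus
Female Bus
Female Bus
Female Bus
Male Car
Female Car", header = TRUE)

# Cross-tabulate transport (rows) by gender (columns).
counts <- xtabs(~ Transportation.used + Gender, data = stats)
counts

# Keep only the row for one selected transport; drop = FALSE keeps it 2-D.
bus <- counts["Bus", , drop = FALSE]
bus
```

For this data the "Bus" row holds 1 male and 4 females, which matches the desired output in the question apart from the column order.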

Here is a small example:

dt  <- data.frame(gender = rep(c("Male", "Female"), c(4, 2) ), trans = rep(c("Car", "Bus", "Bike"), c(3, 2, 1) ))

table(dt)
        trans
gender   Bike Bus Car
  Female    1   1   0
  Male      0   1   3

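In any case, with the data I composed from your question, the columns are factors (the default for data.frame() before R 4.0; newer versions keep strings as character unless you pass stringsAsFactors = TRUE). If you want more options with tables, you may need a few class conversions.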
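One such class conversion, shown here only as a sketch, is turning the table into a long data frame with as.data.frame(), after which ordinary data-frame subsetting applies. The names `freq` and `car` are illustrative and not part of the original answer:

```r
dt <- data.frame(gender = rep(c("Male", "Female"), c(4, 2)),
                 trans  = rep(c("Car", "Bus", "Bike"), c(3, 2, 1)))

# Convert the contingency table to a long data frame:
# one row per gender/transport combination, with the count in `Freq`.
freq <- as.data.frame(table(dt))

# Plain data-frame subsetting then selects a single transport.
car <- freq[freq$trans == "Car", ]
car
```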

EDIT:

Here is the answer to the problem you posted in the comments. By changing the column and the value used in the dt$colname condition, you get finer control over the final output.

table(dt$gender[dt$trans=="Car"])

Female   Male 
     0      3 
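Since the question mentions a percentage bar graph, the same subset can be turned into percentages with prop.table(). The sketch below assumes gender is stored as a factor, so that a level with a zero count (Female here) still appears in the result; `car_counts` and `car_pct` are illustrative names:

```r
dt <- data.frame(gender = rep(c("Male", "Female"), c(4, 2)),
                 trans  = rep(c("Car", "Bus", "Bike"), c(3, 2, 1)))

# Store gender as a factor so zero-count levels are still listed
# (since R 4.0, data.frame() keeps strings as character by default).
dt$gender <- factor(dt$gender, levels = c("Male", "Female"))

# Counts of each gender among the rows where the transport is "Car".
car_counts <- table(dt$gender[dt$trans == "Car"])

# Convert the counts to percentages of the selected transport.
car_pct <- 100 * prop.table(car_counts)
car_pct
```

Setting the levels explicitly also fixes the column order, so Male is listed before Female.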
Worice
  • I'm sorry I already fixed my question. What I mean is how do I calculate the frequency for gender based on **only one selected transport** in a table? I was able to find the frequency table for all the transports but not the selected one. – Annie Tan Apr 15 '16 at 17:33
  • @Annie `table(df)[, "Bus"]`? – Frank Apr 15 '16 at 17:34
  • I edited the answer with the precise solution you need. By changing the `dt$varnames` you can finely control the variables you need. – Worice Apr 15 '16 at 17:38
  • @AnnieTan I made a mistake copying the edited code. Now it is correct. – Worice Apr 15 '16 at 20:24

You can use table.

First we recreate your data.frame. Note that it's better to provide a reproducible example in the question:

df <- read.table(text="
Gender    Transportation.used
Male      Bus
Male      Car
Female    Car
Male      Car
Male      Motorcycle
Female    Bus", header=TRUE, stringsAsFactors=TRUE) # factors are needed for levels() below (not the default since R 4.0)

Then you can use table:

table(df$Transportation.used, df$Gender) # here we type `df` twice
with(df, table(Transportation.used, Gender)) # `with` avoids that

In this special case where you have only two columns, table(df) also works and produces the desired output (transposed though).

If you really want Male as the first column of your table, you can change the order of the levels of the factor Gender (arranged alphabetically by default):

levels(df$Gender) # Female comes (alphabetically) before Male
df$Gender <- factor(df$Gender, levels=rev(levels(df$Gender))) # we rearrange Gender levels order

Now with(df, table(Transportation.used, Gender)) is your exact desired output.

                   Gender
Transportation.used Male Female
         Bus           1      1
         Car           2      1
         Motorcycle    1      0

The most basic graph (but see ?barplot) you can get from this is:

tab <- with(df, table(Transportation.used, Gender))
barplot(tab)
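For the percentage bar graph mentioned in the question, one possible sketch is to convert each row of the table to percentages with prop.table() before plotting. The name `pct` and the axis labels are illustrative choices, not part of the original answer:

```r
df <- read.table(text = "Gender Transportation.used
Male Bus
Male Car
Female Car
Male Car
Male Motorcycle
Female Bus", header = TRUE)

tab <- with(df, table(Transportation.used, Gender))

# Row-wise percentages: within each transport, the genders sum to 100.
pct <- 100 * prop.table(tab, margin = 1)

# Transpose so each transport forms one group of bars, one bar per gender.
barplot(t(pct), beside = TRUE, legend.text = TRUE,
        xlab = "Transportation used", ylab = "Percentage")
```

Using margin = 1 normalises within each transport. Use margin = 2 instead if the percentages should be taken within each gender.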

(edit)

Then, if you want a table with a single transport mode, you can subset the table by row name:

with(df, table(Transportation.used, Gender))["Bus",, drop=FALSE ]
                    Gender
Transportation.used Female Male
                Bus      1    1
Vincent Bonhomme
  • I'm sorry I already fixed my question. What I mean is to calculate the frequency for gender based on **only one selected transport** in a table. I was able to find the frequency table for all the transports but not the selected one. – Annie Tan Apr 15 '16 at 17:32
  • completed the answer – Vincent Bonhomme Apr 15 '16 at 17:36