I have a dataset about a university's student body with 10 columns representing attributes such as student ID, gender, ethnicity, etc.
For now I'm only interested in the term each student was admitted and their ethnicity, because I want to see how the number of students from different ethnic backgrounds has changed over time. So I created a new two-column data frame called ethnicitydf:
> head(ethnicitydf)
  admit_term                  ethn_desc
1 2011-10-01            White/Caucasian
2 2011-10-01 Filipino/Filipino-American
3 2011-10-01            White/Caucasian
4 2011-10-01       Latino/Other Spanish
5 2011-10-01      East Indian/Pakistani
6 2011-10-01            White/Caucasian
I'm not sure how to create a plot with admit_term (time) on the x-axis and the number of students of each ethnicity in that admit_term on the y-axis. There are 12 unique ethnicities in the second column, and I want the frequencies of all 12 ethnicities for each admit_term (6 terms in total) in one graph, with each ethnicity in a different color.
My first thought was to count each ethnicity for each term using, for example, length(which(ethnicitydf$admit_term == "2011-10-01" & ethnicitydf$ethn_desc == "White/Caucasian")) and to record the results in a new data frame, but I feel like there should be a faster and more efficient way of doing this. Maybe with a package? Could anybody help me out? Thank you!
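For illustration, here is a minimal sketch of how the per-term counting described above might be done in one step with base R's table(), followed by a rough line plot using matplot(). The toy data frame is made up to stand in for the real ethnicitydf, and the names counts, counts_long, count_mat, terms, and cols are only placeholders. A plotting package could likely do the same more neatly, so treat this as one possible approach under those assumptions rather than the recommended one.

```r
# Toy stand-in for ethnicitydf (values are invented for illustration only)
ethnicitydf <- data.frame(
  admit_term = as.Date(c("2011-10-01", "2011-10-01", "2011-10-01",
                         "2012-10-01", "2012-10-01", "2012-10-01")),
  ethn_desc  = c("White/Caucasian", "Filipino/Filipino-American",
                 "White/Caucasian", "White/Caucasian",
                 "Latino/Other Spanish", "Latino/Other Spanish"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per admit_term, one column per ethnicity.
# This replaces a separate length(which(...)) call for every combination.
counts <- table(ethnicitydf$admit_term, ethnicitydf$ethn_desc)

# Optional long format: one row per (term, ethnicity) pair with its count
counts_long <- as.data.frame(counts, stringsAsFactors = FALSE)
names(counts_long) <- c("admit_term", "ethn_desc", "n")

# Rough plot: one line per ethnicity, terms on the x-axis
count_mat <- unclass(counts)              # plain integer matrix
terms     <- as.Date(rownames(count_mat)) # row names are the term dates
cols      <- rainbow(ncol(count_mat))     # one color per ethnicity

matplot(as.numeric(terms), count_mat, type = "b", pch = 1, lty = 1,
        col = cols, xaxt = "n",
        xlab = "admit_term", ylab = "Number of students")
axis(1, at = as.numeric(terms), labels = format(terms))
legend("topright", legend = colnames(count_mat),
       col = cols, lty = 1, pch = 1, cex = 0.7)
```

With the real data, counts would have 6 rows and 12 columns, and counts["2011-10-01", "White/Caucasian"] should give the same number as the length(which(...)) expression above.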