I have a data frame with more than 1,000 rows (n > 1000). Each row has a Year column, which is a numeric year, and a Gender column, which is either "Male" or "Female". I want to run pairwise t-tests between Years on the proportion of rows where Gender == "Male". I have managed to plot this proportion by year (code and plot below), but I am unable to extend this to the prop_test() function. I can't share my data, but code for a sample dataset is included.
library(tidyverse)  # provides as_tibble(), %>%, the dplyr verbs, and ggplot2

# Sample dataset standing in for the real data
sample_data <- as_tibble(data.frame(
  Gender = sample(c("Male", "Female"), 1000, replace = TRUE),
  Year = sample(c(2016, 2017, 2018, 2019), 1000, replace = TRUE)
))

# Proportion of males per year, drawn as a labelled bar chart
sample_data %>%
  group_by(Year) %>%
  summarise(prop = sum(Gender == "Male", na.rm = TRUE) / n()) %>%
  ggplot(mapping = aes(x = Year, y = prop)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = round(prop, 2)), vjust = -0.25) +
  labs(y = "prop Male")
The resulting plot of proportions male by year:
Please advise on how I can modify the code I used for the plot to compute pairwise t-tests for these proportions. I have tried approaches like:
sample_data %>%
  prop_test(Gender ~ Year)
But this gives an error:
Error in rowSums(x) : 'x' must be numeric
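For context on what I am after, here is a minimal sketch of one possible direction, assuming base R's pairwise.prop.test() from the stats package is an acceptable substitute. It takes numeric counts of successes and group totals rather than a formula over character columns, which may be related to the rowSums() error above. Note that it runs pairwise tests of equal proportions (chi-squared based), not t-tests, so it may not be exactly what I need. The fixed seed, the Holm adjustment, and the names counts and male_by_year are my own choices for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Same shape as the sample dataset in the question
sample_data <- data.frame(
  Gender = sample(c("Male", "Female"), 1000, replace = TRUE),
  Year = sample(c(2016, 2017, 2018, 2019), 1000, replace = TRUE)
)

# Count males and non-missing Gender values per year
counts <- sample_data %>%
  group_by(Year) %>%
  summarise(
    male = sum(Gender == "Male", na.rm = TRUE),
    total = sum(!is.na(Gender))
  )

# Name the success counts by year so the result table is labelled
male_by_year <- setNames(counts$male, counts$Year)

# Pairwise tests of equal proportions with Holm-adjusted p-values
result <- pairwise.prop.test(
  x = male_by_year,
  n = counts$total,
  p.adjust.method = "holm"
)

# Lower-triangular matrix of adjusted p-values, one cell per pair of years
result$p.value
```

Unlike the plot code, which divides by n(), this sketch uses the count of non-missing Gender values as the denominator; the two agree whenever Gender has no missing values, as in the sample data.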