
[Screenshot: raw data frame organization of COVID cases in Germany]

I downloaded the notified COVID cases in Germany from an official website. The raw data frame has the following columns (see also the screenshot): "IdCounty", "NameCounty", "DateNotification", "AgeGroup", "Gender", "FreqCases".

What is a clever way in R to collapse this raw data frame over all categories of "AgeGroup" and "Gender", so that these two subpopulation breakdown variables disappear and their counts are summed? Reason: I want to analyse the COVID cases by county and time point, without differentiating by age or gender, i.e. with all ages and all genders summed together.

I have struggled with various functions to achieve this, but I am fairly sure there is a simple way to do it.

camille
Data77
Whatever you did to try your aggregation (not sure what, since you didn't include it), you should have been able to group by only the variables you want, e.g. NameCounty, and not the ones you don't want, e.g. AgeGroup – camille Feb 22 '22 at 18:48

1 Answer

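library(tidyverse)

data <- read_csv("https://example.de/covid.csv")

data %>%
  # group by county and notification date only;
  # AgeGroup and Gender are left out, so their rows are summed together
  group_by(IdCounty, NameCounty, DateNotification) %>%
  summarise(FreqCases = sum(FreqCases), .groups = "drop")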
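
To try the idea without downloading anything, here is a minimal sketch on a small made-up data frame. The column names are taken from the question; the values, the age-group labels and the object names `cases`, `collapsed` and `collapsed_base` are assumptions for illustration only. It groups by the variables to keep (county and notification date) and sums over everything else, and shows a base R `aggregate()` call that should give the same totals.

```r
library(dplyr)

# Made-up toy data with the column names from the question (values are hypothetical)
cases <- tibble(
  IdCounty         = c(1001, 1001, 1001, 1001, 1002, 1002),
  NameCounty       = c("A", "A", "A", "A", "B", "B"),
  DateNotification = as.Date(c("2022-01-01", "2022-01-01", "2022-01-02",
                               "2022-01-02", "2022-01-01", "2022-01-01")),
  AgeGroup         = c("A00-A04", "A05-A14", "A00-A04", "A05-A14", "A00-A04", "A05-A14"),
  Gender           = c("M", "W", "M", "W", "M", "W"),
  FreqCases        = c(1, 2, 3, 4, 5, 6)
)

# Group by the variables to keep; AgeGroup and Gender are not listed,
# so their rows are summed together within each county and date
collapsed <- cases %>%
  group_by(IdCounty, NameCounty, DateNotification) %>%
  summarise(FreqCases = sum(FreqCases), .groups = "drop")

# Base R alternative: the formula names the value column on the left
# and the grouping variables on the right
collapsed_base <- aggregate(
  FreqCases ~ IdCounty + NameCounty + DateNotification,
  data = cases,
  FUN = sum
)
```

With this toy data, both results have one row per county and date (three rows), and the "AgeGroup" and "Gender" columns are gone. The row order of `aggregate()` may differ from the dplyr result.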
danlooo