Hello everyone. I have data that gives the number of COVID deaths and infections by ZIP code. Note that some ZIP codes appear more than once because surveys were released later in the week, etc. I want to extract two rows: the total of the Cases...Weekly column for ZIP code 60618 and for ZIP code 60624. Below is a sample of my data so you can see what I am working with.

head(Chicago_Final, 15)
           Cases...Weekly
1   60601       4   
2   60601       13  
3   60601       1   
4   60601       7   
5   60601       5   
6   60601       8   
7   60601       6   
8   60601       4   
9   60601       NA
10  60601       NA      
11  60601       9   
12  60601       2
13  60601       8   
14  60602       2   
15  60602       NA

For example, if I needed ZIP codes 60601 and 60602, I would need to produce a table with one row for each of those two ZIP codes showing its total weekly cases. The data I am working with has thousands of ZIP codes, but to keep things simple I only included the first 15 rows. I need to extract two ZIP codes and their weekly cases to compare them.

Learning R

1 Answer

I'm not quite sure I follow your question, but it seems you are asking how to group and sum the cases for each ZIP code. You can do that with aggregate(); however, you will need to deal with the NA values first (either by imputing or simply omitting them). Example using the data you provided:

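# `cases` and `zipcode` stand in for the actual column names in the data
Chicago_Final <- na.omit(Chicago_Final)   # drop rows that contain NA
aggregate(Chicago_Final$cases, list(zip = Chicago_Final$zipcode), sum)
        zip   x
    1 60601  67
    2 60602   2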
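If only a couple of ZIP codes are needed, one option is to filter before aggregating. The following is a minimal sketch rather than a definitive solution: the toy data frame is rebuilt from the 15 rows shown in the question, and the column names `zipcode` and `cases` are assumptions that should be swapped for the real ones. It also passes na.rm = TRUE to sum() instead of calling na.omit() first.

```r
# Toy data rebuilt from the 15 sample rows in the question.
# NOTE: `zipcode` and `cases` are assumed column names, not the real ones.
Chicago_Final <- data.frame(
  zipcode = c(rep(60601, 13), rep(60602, 2)),
  cases   = c(4, 13, 1, 7, 5, 8, 6, 4, NA, NA, 9, 2, 8, 2, NA)
)

# Keep only the ZIP codes to be compared.
wanted    <- c(60601, 60602)
subset_df <- Chicago_Final[Chicago_Final$zipcode %in% wanted, ]

# Sum the cases per ZIP code; na.rm = TRUE is forwarded to sum(),
# so NA values are skipped instead of turning the total into NA.
totals <- aggregate(subset_df["cases"],
                    by  = subset_df["zipcode"],
                    FUN = sum, na.rm = TRUE)
print(totals)
```

On the 15 sample rows this gives a total of 67 for 60601 and 2 for 60602. For the ZIP codes in the original question, `wanted` would be c(60618, 60624).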

You might want to check this question for more detailed answers.

Moax