Dataset usage and variable selection

Question

I loaded the dataset, but how do I show only the deaths in Europe?

df <- read.csv ('https://raw.githubusercontent.com/ulklc/covid19-timeseries/master/countryReport/raw/rawReport.csv')

europe <-- df[df$region =="Europe"]

df$death [europe]

R says that the object "europe" could not be found. This is what I want to learn: how to find the death numbers for only the European countries in the dataset, and how to show the country name and the number of deaths in two columns. – Halil Ünsal, May 03 '20 at 12:31

score 0 · Answer 1 · answered May 03 '20 at 12:42


We can filter only the European countries and calculate number of deaths by country.

This can be done in base R:

df1 <- aggregate(death~countryName, subset(df, region =="Europe"), sum)

With dplyr:

library(dplyr)
df1 <- df %>% 
        filter(region == 'Europe') %>% 
        group_by(countryName) %>% 
        summarise(total_death = sum(death))

and in data.table:

library(data.table)
df1 <- setDT(df)[region == 'Europe', .(total_death = sum(death)), countryName]
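As a side note on the error in the question: `df[df$region == "Europe"]` has no comma, so R treats the condition as a column selector and the subsetting most likely fails before `europe` is ever created (and `<--` is read as `<-` followed by a unary minus). Below is a minimal sketch of plain row subsetting, assuming a made-up toy data frame that only mimics the real column names:

```r
# Toy data standing in for the real report; the values are invented for illustration
toy <- data.frame(
  countryName = c("Italy", "Spain", "Japan"),
  region      = c("Europe", "Europe", "Asia"),
  death       = c(10, 7, 3),
  stringsAsFactors = FALSE
)

# Note the comma: the condition before it selects rows,
# the character vector after it selects columns
europe <- toy[toy$region == "Europe", c("countryName", "death")]
print(europe)
```

This only filters rows and keeps two columns; it does not aggregate anything, so with one row per country and day it would return one row per day.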

answered May 03 '20 at 12:42

Ronak Shah

Thank you so much. I want to ask one more thing. To find the number of new patients, I have to subtract the previous day's value. How do I select the previous day in the dataset, and with what code? Thank you for your help. – Halil Ünsal May 03 '20 at 12:48
Probably you should ask that as a new question but I think this post might help. https://stackoverflow.com/questions/30606360/subtract-value-from-previous-row-by-group – Ronak Shah May 03 '20 at 12:55
I will ask. But I want to learn something first. In the dataset, the number of deaths is given for each day. The code you wrote added up every day. I only want the deaths; the number of deaths for the day is enough. It summed the daily death figures to date, so the results are too high. – Halil Ünsal May 03 '20 at 13:05
I see. Probably you need `max` then. `df1 <- aggregate(death~countryName, subset(df, region =="Europe"), max)` – Ronak Shah May 03 '20 at 13:08
Not max, because the highest death count is not what is wanted. What is wanted is how many died yesterday. – Halil Ünsal May 03 '20 at 13:14
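One possible reading of the comments above is that `death` holds a running total per country, so the deaths for a single day would be the difference from the previous day. A minimal base R sketch under that assumption follows; the toy values are invented, and the date column name `day` is an assumption, not confirmed by this thread:

```r
# Toy cumulative data; values and the `day` column name are assumptions
toy <- data.frame(
  countryName = c("Italy", "Italy", "Italy", "Spain", "Spain", "Spain"),
  day         = as.Date(c("2020-05-01", "2020-05-02", "2020-05-03",
                          "2020-05-01", "2020-05-02", "2020-05-03")),
  death       = c(100, 130, 150, 80, 95, 100),
  stringsAsFactors = FALSE
)

# Sort so that each country's rows are in date order
toy <- toy[order(toy$countryName, toy$day), ]

# Per country: the day's running total minus the previous day's total
# (NA for the first day, which has no previous row)
toy$new_death <- ave(toy$death, toy$countryName,
                     FUN = function(x) c(NA, diff(x)))

# Keep only the most recent day for each country
latest <- toy[toy$day == max(toy$day), c("countryName", "new_death")]
print(latest)
```

On the real data the same steps would be applied after filtering to `region == "Europe"`; whether the totals are in fact cumulative should be checked against the data first.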

score 0 · Accepted Answer · answered May 03 '20 at 18:49


We can also use the `subset` argument of `aggregate`:

aggregate(death ~ countryName, df, subset = region == "Europe", FUN = sum)

Or using `rowsum`:

with(subset(df, region == 'Europe'), rowsum(death, countryName))
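Note that `rowsum` returns a one-column matrix with the country names as row names, not a data frame. If the two-column layout from the question is wanted, it can be converted. This is a small sketch on invented toy data, and the output column name `total_death` is just an illustrative choice:

```r
# Toy data; values are invented for illustration
toy <- data.frame(
  countryName = c("Italy", "Italy", "Spain", "Japan"),
  region      = c("Europe", "Europe", "Europe", "Asia"),
  death       = c(10, 5, 7, 3),
  stringsAsFactors = FALSE
)

# rowsum() sums `death` within each country and returns a matrix
res <- with(subset(toy, region == "Europe"), rowsum(death, countryName))

# Move the row names into a regular column to get two columns
out <- data.frame(countryName = rownames(res),
                  total_death = res[, 1],
                  row.names   = NULL,
                  stringsAsFactors = FALSE)
print(out)
```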

answered May 03 '20 at 18:49

akrun
