
I'm trying to merge two datasets in R. I normally use dplyr to merge two country-year based data sets. But in this case:

dataset1 is country-year:

[image: start of dataset1]

dataset2 is event-based: imagine something like big terror attacks. The events don't happen every year. In some years, there is more than one event.

[image: start of dataset2]

Ideal outcome: integrate dataset2 into the country-year format of dataset1, with a column counting the total number of events for each country in each year. How would this work?

PoliSciR
  • Please provide some example data. It is not possible to help without understanding the structure of your data. Hopefully, your event data includes year and country variables. – lmo Jun 20 '16 at 19:03
  • Hi, welcome to SO. Please consider reading up on [ask] and how to produce a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). It makes it easier for others to help you. – Heroka Jun 20 '16 at 19:06
  • Point taken. I added photos of the datasets where you can see the relevant variables to merge. Does that help? – PoliSciR Jun 20 '16 at 20:17

1 Answer


Assuming each row in your dataset2 represents a single event, this should do what you want:

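library(dplyr)

# Count events per country-year, then attach the counts to dataset1.
# n() returns the number of rows in the current group.
dataset2 %>%
    group_by(location, year) %>%
    summarize(n_events = n()) %>%
    # "." is the summarized table coming out of the pipe; dataset1 is the left table
    left_join(dataset1, ., by = c("cname" = "location", "year" = "year"))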
Mikko Marttila
  • In this code, what does n() refer to in summarize(n_events = n())? Also, what does the "." after dataset1 refer to in the left_join call, left_join(dataset1, ., by = c("cname" = "location", "year" = "year"))? – PoliSciR Jun 21 '16 at 11:48
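
To make the two points raised in the last comment concrete, here is a minimal sketch on made-up data. The toy rows are hypothetical; only the column names cname, location and year come from the answer, and the real datasets are only visible as images. In dplyr, n() returns the number of rows in the current group, and with the %>% pipe, "." is a placeholder for whatever is on the left-hand side of the pipe. The sketch splits the pipeline into two steps so that the "." can be replaced by a named object, and it also sets country-years without any event to 0, which is an added assumption about the desired output.

```r
library(dplyr)

# Hypothetical toy data: two countries, two years, three events.
dataset1 <- data.frame(
  cname = c("A", "A", "B", "B"),
  year  = c(2000, 2001, 2000, 2001),
  stringsAsFactors = FALSE
)
dataset2 <- data.frame(
  location = c("A", "A", "B"),
  year     = c(2000, 2000, 2001),
  stringsAsFactors = FALSE
)

# n() counts the rows in each (location, year) group, i.e. events per country-year.
event_counts <- dataset2 %>%
  group_by(location, year) %>%
  summarize(n_events = n()) %>%
  ungroup()

# Equivalent to the piped left_join(dataset1, ., ...): the "." stood for event_counts.
merged <- left_join(dataset1, event_counts,
                    by = c("cname" = "location", "year" = "year"))

# Country-years with no matching event come back as NA; treat them as zero events.
merged$n_events[is.na(merged$n_events)] <- 0L

print(merged)
```

Keeping dataset1 as the left table means every country-year row survives the join, whether or not any event occurred in it.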