Hi guys, I am trying to plot a streamgraph using the data at the following link: https://www.kaggle.com/START-UMD/gtd. My aim is to plot the frequency of terrorist attacks for each terrorist group in the variable gname, but I don't know how to filter the data frame so that it has all the parameters needed to plot a streamgraph, which are data, key, value, and date.

I tried to get that subset of the original data frame with the following code:

str <- terror %>%
    filter(gname != "Unknown") %>%
    group_by(gname) %>%
    summarise(total=n()) %>%
    arrange(desc(total)) %>%
    head(20)

But all I managed to get is the total number of attacks for each terrorist group, not the number of attacks per group for each year. Could you suggest a way to do it? That would be amazing! Thanks for reading, guys, and for the help.

  • 3
    If you add a [reproducible minimal example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) you make it easier for us to help you. Please add some data in a form that is easy to copy (see link above). Regarding your question: a combination of group_by(gname), mutate and group_by(gname, year) might do the trick (get the total for each gname **and** the total for each gname-year combination). – dario Feb 08 '20 at 00:11
  • 2
    To get counts by group and year you should `group_by(gname, iyear)`. – Kent Johnson Feb 08 '20 at 00:28

1 Answer

Dario and Kent are correct. You need to add the iyear variable to the group_by() call, so that attacks are counted per group and year:

terror %>%
  filter(gname != "Unknown") %>%
  group_by(gname, iyear) %>%
  summarise(total=n()) %>%
  arrange(desc(total)) %>%
  head(20) -> str
str
    # A tibble: 20 x 3
    # Groups:   gname [7]
       gname                                           iyear total
       <chr>                                           <int> <int>
     1 Islamic State of Iraq and the Levant (ISIL)      2016  1454
     2 Islamic State of Iraq and the Levant (ISIL)      2017  1315
     3 Islamic State of Iraq and the Levant (ISIL)      2014  1249
     4 Taliban                                          2015  1249
     5 Islamic State of Iraq and the Levant (ISIL)      2015  1221
     6 Taliban                                          2016  1065
     7 Taliban                                          2014  1035
     8 Taliban                                          2017   894
     9 Al-Shabaab                                       2014   871
    10 Taliban                                          2012   800
    11 Taliban                                          2013   775
    12 Al-Shabaab                                       2017   570
    13 Al-Shabaab                                       2016   564
    14 Boko Haram                                       2015   540
    15 Shining Path (SL)                                1989   509
    16 Communist Party of India - Maoist (CPI-Maoist)   2010   505
    17 Shining Path (SL)                                1984   502
    18 Boko Haram                                       2014   495
    19 Shining Path (SL)                                1983   493
    20 Farabundo Marti National Liberation Front (FML~  1991   492

Then send that to the streamgraph:

str %>% streamgraph("gname", "total", "iyear")
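
One thing to watch: `head(20)` above keeps the 20 largest group-year rows, so only 7 groups survive and most of them have just a few years. A possible alternative, sketched here on made-up toy data (`terror_toy`, `top_groups` and `attacks_by_year` are hypothetical names, and `tidyr` is an extra dependency not used in the answer), is to pick the top groups first and then count every year for them, filling missing years with zero:

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the GTD data frame `terror` (made-up values, one row per attack).
terror_toy <- tibble(
  gname = c("A", "A", "A", "A", "B", "B", "B", "C", "Unknown"),
  iyear = c(2014L, 2014L, 2016L, 2016L, 2014L, 2015L, 2015L, 2015L, 2014L)
)

# Step 1: find the groups with the most attacks overall.
top_groups <- terror_toy %>%
  filter(gname != "Unknown") %>%
  count(gname, name = "total", sort = TRUE) %>%
  head(2) %>%            # use 20 (or any other cutoff) on the real data
  pull(gname)

# Step 2: count attacks per group and year, then add explicit zero rows
# for every group-year combination that has no attacks.
attacks_by_year <- terror_toy %>%
  filter(gname %in% top_groups) %>%
  count(gname, iyear, name = "total") %>%
  complete(gname, iyear = min(iyear):max(iyear), fill = list(total = 0L))

attacks_by_year
```

The resulting data frame has the same three columns as `str`, so it should be usable in the same `streamgraph("gname", "total", "iyear")` call.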

I've always had difficulty annotating these graphs; as far as I know, it has to be done manually:

str %>% streamgraph("gname", "total", "iyear") %>%
  sg_annotate(label="ISIL", x=as.Date("2016-01-01"), y=1454, size=14)

Edward
  • Thank you very much, I have just one small additional problem. Once I run the code you provided, no output is shown. Do you know what the reason may be? – Antonio Mastroianni Feb 08 '20 at 11:55
  • The output is saved to an R object called str. This is then used as the data for the streamgraph. If you want to see the output on the screen, then type the name of the object. – Edward Feb 09 '20 at 02:21
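
To illustrate the point in the last comment, here is a minimal sketch in base R (the vector and its values are toy stand-ins, not the real data): an assignment stores the result without printing it, typing the object's name prints it, and wrapping the assignment in parentheses does both at once.

```r
# Assignment alone stores the value and prints nothing.
counts <- c(ISIL = 1454, Taliban = 1249)   # toy values for illustration

# Typing the object's name prints it.
counts

# Wrapping an assignment in parentheses assigns and prints in one step.
(counts_sorted <- sort(counts))
```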