No matter how I go about trying to append values to a list of mine, I cannot seem to get it right. What I have tried:

suburb_shootings <- list()
add_shootings_to_suburb_list <- function(){

    total_rows <- nrow(shooting_cases[4])
    for(x in 1:total_rows){
        suburb_shootings[[x]] <- shooting_cases[x,4]
    }
}
add_shootings_to_suburb_list()

Alternatively:

add_shootings_to_suburb_list <- function(){

    total_rows <- nrow(shooting_cases[4])
    for(x in 1:total_rows){
        suburb_shootings[[x]] <- append(suburb_shootings, shooting_cases[x,4])
    }
}
add_shootings_to_suburb_list()

OR:

add_shootings_to_suburb_list <- function(){

    suburb_shootings <- list()
    total_rows <- nrow(shooting_cases[4])
    for(x in 1:total_rows){
        suburb_shootings <- append(suburb_shootings, shooting_cases[x,4])
    }
}
add_shootings_to_suburb_list()

This is to be used for visualisation charts later on, but essentially I just need to create a list of all suburbs where shooting incidents occurred in NYC during a given time period, duplicates included. I.e. "Brooklyn" may repeat any number of times, as more than one shooting incident may have occurred there on separate occasions.

I am new to R, so it is possible I am not using the list data type correctly.

Please tell me what I am doing wrong.

A line snippet from the relevant CSV file is as follows:

INCIDENT_KEY,OCCUR_DATE,OCCUR_TIME,BORO,PRECINCT,JURISDICTION_CODE,LOCATION_DESC,STATISTICAL_MURDER_FLAG,PERP_AGE_GROUP,PERP_SEX,PERP_RACE,VIC_AGE_GROUP,VIC_SEX,VIC_RACE,X_COORD_CD,Y_COORD_CD,Latitude,Longitude,Lon_Lat

236168668,11/11/2021,15:04:00,BROOKLYN,79,0,,false,,,,18-24,M,BLACK,996313,187499,40.68131820000008,-73.95650899099996,POINT (-73.95650899099996 40.68131820000008)

231008085,07/16/2021,22:05:00,BROOKLYN,72,0,,false,45-64,M,ASIAN / PACIFIC ISLANDER,25-44,M,ASIAN / PACIFIC ISLANDER,981845,171118,40.63636384100005,-74.00866668999998,POINT (-74.00866668999998 40.63636384100005)

230717903,07/11/2021,01:09:00,BROOKLYN,79,0,,false,<18,M,BLACK,25-44,M,BLACK,996546,187436,40.68114495900005,-73.95566903799994,POINT (-73.95566903799994 40.68114495900005)

The data set is some 20k + lines long.

Below is a screenshot of how it is read in as a CSV:

[Screenshot: CSV data]

  • Please provide a reproducible example with some data and the intended outcome. Maybe I'm missing something, but why not just do `c(suburb_shootings, shooting_cases)`? – Phil Oct 20 '22 at 14:10
  • As you say, you're not using lists correctly. What you can do is take the last example and add a `return(suburb_shootings)` statement at the end of the function (or simply a line `suburb_shootings`), then do `suburb_shootings <- add_shootings_to_suburb_list()` outside the function (last line). Note that variables outside a function cannot be assigned with `<-` from inside it, as in your first example. Also look at a course like [this](https://swcarpentry.github.io/r-novice-inflammation/) – Ric Oct 20 '22 at 17:07

1 Answer

Functions don't modify objects outside of their scope. (Unless you use global assignment, which you really shouldn't...)
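
A minimal sketch of that scoping behaviour, using a made-up `items` list and a hypothetical `add_item()` function rather than the question's data: the assignment inside the function works on a local copy, so the global list stays as it was.

```r
# Hypothetical names for illustration only; not from the question's data.
items <- list()

add_item <- function() {
  # `<-` inside a function creates or modifies a local variable.
  # The global `items` is read, copied, and left unchanged.
  items[[1]] <- "BROOKLYN"
  length(items)  # length of the local copy
}

local_length <- add_item()
local_length   # 1: the local copy received the element
length(items)  # 0: the global list is still empty
```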

Your last attempt is almost correct; you just need the function to return the result, and then assign that result when you call the function. A function returns the value of the last expression it evaluates.

add_shootings_to_suburb_list <- function(){

    suburb_shootings <- list()
    total_rows <- nrow(shooting_cases[4])
    for(x in 1:total_rows){
        suburb_shootings <- append(suburb_shootings, shooting_cases[x,4])
    }
    suburb_shootings
}


my_list <- add_shootings_to_suburb_list()
my_list # should print your result

That said, you don't need a function for this. Your function looks like an inefficient way to write `suburb_shootings <- as.list(shooting_cases[[4]])`. Without seeing your data or knowing your goal, I also doubt that you need a list here at all... is there a reason you don't keep it as a vector?
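
As a rough sketch of the loop-free version, here is the same extraction on a tiny made-up data frame. The rows and the `suburb_vector` / `suburb_list` names are assumptions for illustration; only the column layout (BORO as the 4th column) follows the CSV header in the question.

```r
# Small made-up stand-in for the question's data; BORO is the 4th column.
shooting_cases <- data.frame(
  INCIDENT_KEY = c(1, 2, 3),
  OCCUR_DATE   = c("11/11/2021", "07/16/2021", "07/11/2021"),
  OCCUR_TIME   = c("15:04:00", "22:05:00", "01:09:00"),
  BORO         = c("BROOKLYN", "BRONX", "BROOKLYN"),
  stringsAsFactors = FALSE
)

# Keep the column as a character vector: no loop and no function needed.
suburb_vector <- shooting_cases[[4]]

# Only if a list is really required: one list element per row.
suburb_list <- as.list(shooting_cases[[4]])
```

Note the double brackets: `shooting_cases[[4]]` extracts the column itself, while `shooting_cases[4]` returns a one-column data frame.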

Gregor Thomas
  • Thanks for the response. Sorry for the late reply. Honestly, it could very well just be stored as a vector. A task in this course I am taking for my master's is to visualise shooting data. We are allowed to make use of the data any way we wish, so my intent is to total the shootings per suburb, not per individual district number within a suburb, and attach this to a graph. So essentially I am merely trying to map a key (the suburb) to a total number of shootings. The reason I was trying to create the list was that I could just iterate through it – Phillip Mackenzie Oct 24 '22 at 09:24
  • with a for loop that could add the suburb to a hashmap, and then just loop through the list to see how many times each suburb appears in it with a very basic counter. So the list is intended to look something like this (on separate lines, of course): Brooklyn, Bronx, Queens, Manhattan, Bronx, Brooklyn, Manhattan, Queens and so forth, each entry pertaining to a shooting incident. – Phillip Mackenzie Oct 24 '22 at 09:26
  • As I said, I am extremely new to R. The course I am doing is just intended for us to get familiar with R as a whole. I really do appreciate your help though. I am relatively comfortable with Java and other languages, but R is a bit different, and the community doesn't appear to be as big with regard to the forums and so forth from what I see. That is the only reason I was forced to ask. – Phillip Mackenzie Oct 24 '22 at 09:28
  • Iterating through a vector is just as easy as iterating through a list. The main difference between R and most other languages is that since R's most basic data types are vectors, using vectors and native "vectorized" operations is much, much more efficient than writing loops. – Gregor Thomas Oct 24 '22 at 13:14
  • Do you happen to know of a faster way for me to loop through a vector and find duplicates? I converted my CSV input into a matrix, as I found this to be an incredibly easy and fast way of accessing the information. I also know how to immediately convert an entire row's or column's values into a vector if I wish. However, as I mentioned, my vector will have duplicate suburb names representing shooting incidents in that suburb. Do you know of a faster way of finding the total number of incidents per suburb in a vector such as this, without a long for loop using a counter? – Phillip Mackenzie Oct 25 '22 at 07:54
  • Yes, have a look at the R-FAQ [How to sum a variable by group?](https://stackoverflow.com/q/1660124/903061). There's 10+ methods there, all of which will be faster than a loop with a counter, though several of them are designed to work on data frames only. – Gregor Thomas Oct 25 '22 at 12:54
  • `dplyr` is particularly friendly in syntax. I'd strongly recommend reading the [Intro to dplyr vignette](https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html). – Gregor Thomas Oct 25 '22 at 12:57
  • Thanks. I appreciate it. I ended up figuring out so many faster ways by manipulating the data using tables and matrices and so forth. Got rid of at least 20 lines of code alone. Then figured out that a lot of the plotting libraries had their own custom stat name variables I could declare to get them to perform an automatic count operation on my data sets for me. Definitely much faster. Thanks again! – Phillip Mackenzie Oct 28 '22 at 06:45
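
As a sketch of the loop-free counting discussed in these comments, base R's `table()` tallies how often each value occurs in a vector. This is one option among the methods in the linked FAQ, not necessarily the one the asker ended up using. The `boro` vector and the `incidents_per_boro` name below are made up for illustration.

```r
# Made-up sample: one entry per shooting incident.
boro <- c("BROOKLYN", "BRONX", "QUEENS", "MANHATTAN",
          "BRONX", "BROOKLYN", "MANHATTAN", "QUEENS", "BROOKLYN")

# Count incidents per borough without a loop or a counter.
incidents_per_boro <- table(boro)
incidents_per_boro

# Look up a single borough's count by name.
brooklyn_count <- incidents_per_boro[["BROOKLYN"]]

# Order from most to fewest incidents, e.g. before plotting.
sorted_counts <- sort(incidents_per_boro, decreasing = TRUE)
```

On the data frame from the question, the same idea would be `table(shooting_cases$BORO)`, assuming the column is named `BORO` as in the CSV header.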