I want to create a data frame, babyList, that stores the number of unique names in the babynames package for each year in that package. I specifically want to do this with a loop.

    library(babynames)
    # Create the data frame that will hold the output: one row per year
    years <- sort(unique(babynames$year))
    babyList <- data.frame(matrix(ncol = 2, nrow = length(years)))
    colnames(babyList) <- c("years", "unique names")

The for loop is giving me trouble. Here is what I know it will need:

Pseudocode:

    for (i in unique(babynames$year)) {
        length(unique(babynames$name[babynames$year == i]))
    }

How can I put this all together in a correct structure?

user7264

1 Answer

I would suggest using dplyr::summarise() instead:

library(babynames)
library(dplyr)

babynames %>% 
  group_by(year) %>% 
  summarise(unique_names = length(unique(name))) 

which gives the number of unique names for each year in the babynames dataset:

# A tibble: 138 x 2
    year unique_names
 * <dbl>        <int>
 1  1880         1889
 2  1881         1830
 3  1882         2012
 4  1883         1962
 ... 
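
If a loop is specifically required, the same counts can be built up year by year. The following is only a minimal sketch: `count_unique_names` is a hypothetical helper name (not part of babynames or dplyr), and the small `toy` data frame is made-up data that stands in for babynames so the example runs without the package. It assumes the input has `year` and `name` columns, as babynames does.

```r
# Count distinct names per year with an explicit for loop.
# `count_unique_names` is a hypothetical helper, not a babynames function.
count_unique_names <- function(df) {
  years <- sort(unique(df$year))
  # Preallocate the result so the loop fills rows instead of growing the object
  result <- data.frame(years = years, unique_names = integer(length(years)))
  for (i in seq_along(years)) {
    # Names recorded in this year, with duplicates (e.g. across sexes) removed
    result$unique_names[i] <- length(unique(df$name[df$year == years[i]]))
  }
  result
}

# Made-up toy data with the same `year` and `name` columns as babynames
toy <- data.frame(
  year = c(1880, 1880, 1880, 1881, 1881),
  name = c("Mary", "Anna", "Mary", "Mary", "John")
)
count_unique_names(toy)

# With the real data (requires library(babynames)):
# babyList <- count_unique_names(babynames)
```

Looping over `seq_along(years)` rather than over `babynames$year` matters here: the year column has one entry per row of the dataset, so iterating over it directly would repeat the same calculation many times for each year.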
tivd