I have a data frame like this:

years, latitude, longitude
1971, 30.212, -87.423
1971, 30.211, -87.455
1971, 30.111, -94.444
1972, 24.114, -94.231
1972, 25.114, -92.121

I want to compute the standard deviation of the latitude column by year and store it in a new column, so that every 1971 row carries the 1971 standard deviation, every 1972 row carries the 1972 standard deviation, and so on.

I believe this can be done with dplyr, but I am having difficulty working out how.

As a logical expression, I am asking: what is the standard deviation of df$latitude for each distinct value of df$years?

alistaire
Emery

3 Answers

df %>% group_by(years) %>% mutate(lat_sd = sd(latitude, na.rm = TRUE))
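
For anyone who wants to try this end to end, here is a minimal sketch on the sample rows from the question. It assumes the column names shown there (years, latitude, longitude) and that dplyr is installed; the names df, result and lat_sd are just illustrative.

```r
library(dplyr)

# Sample rows copied from the question
df <- data.frame(
  years     = c(1971, 1971, 1971, 1972, 1972),
  latitude  = c(30.212, 30.211, 30.111, 24.114, 25.114),
  longitude = c(-87.423, -87.455, -94.444, -94.231, -92.121)
)

result <- df %>%
  group_by(years) %>%                              # one group per year
  mutate(lat_sd = sd(latitude, na.rm = TRUE)) %>%  # group sd repeated on every row of the group
  ungroup()                                        # drop grouping so later steps are not grouped

print(result)
```

Because mutate() keeps every row, the per-year value is repeated for each row of that year, which is what the question asks for. summarise() would instead collapse the result to one row per year.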
Djork

Assuming your data frame is built like this and is stored in a variable called "df":

year, lat, long
1971, 20, 40

You would need this code using dplyr:

output <- df %>% group_by(year) %>% summarise(dev = sd(lat))

merge(df, output, by = "year")
leeum

Another way, using base R...

df$lat_sd <- ave(df$lat, df$year, FUN=sd)
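
One thing to watch with this approach: sd() returns NA for a group that has only one row. Below is a small sketch using the question's column names (years, latitude) rather than the shortened ones above. The 1973 row is made up purely to show the single-row case.

```r
# Rows from the question plus one hypothetical 1973 row (not in the original data)
df <- data.frame(
  years    = c(1971, 1971, 1971, 1972, 1972, 1973),
  latitude = c(30.212, 30.211, 30.111, 24.114, 25.114, 27.5)
)

# ave() applies FUN within each level of the grouping variable and
# returns a vector the same length as the input, so it can be assigned directly
df$lat_sd <- ave(df$latitude, df$years, FUN = sd)

print(df)  # the 1973 row gets NA because sd() of a single value is undefined
```

Note that FUN has to be passed by name, since ave() treats any unnamed extra arguments as additional grouping variables.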
Andrew Gustar