This is my first time asking a question on Stack Overflow and also my first time coding in R, so please bear with me if my explanation is unclear :(

I have a data frame (data2000) that is 1092 x 6. The columns are year, month, predictive horizon, name of the company, GDP Price Index, and Consumer Price Index.

I want to create vectors of gdppi and cpi for each month.

My ultimate goal is to get the mean, median, interquartile range, and 90th-10th percentile range for each month, and I thought this would be the first step.

This is the code that I have written so far:

library(tidyverse)
data2000 <- read.csv("")
for (i in 1:12) {
  i_gdppi <- c()
  i_cpi <- c()
}
for (i in 1:12) {
  if (data2000$month == i) {
    append(i_gdppi, data2000[,gdppi])
    append(i_cpi, data2000[,cpi])
  }
}

Unfortunately, I got this error message: Error in if (data2000$month == 1) { : the condition has length > 1

I googled it, and it seems I cannot use a vector as the condition in an if statement. How can I solve this problem? Thank you so much and have a nice day!

slowpoke
  • The `ifelse()` function is the vectorized version of a conditional. But your code doesn't quite make sense: it looks like you're just sorting by month. Is that the intention? If so, use `order()`. – user2554330 Jun 26 '22 at 08:33
  • Welcome to SO. Can you give us a [minimal, reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) - including the columns `year`, `month`, `gdppi` and `cpi` of `data2000` and your intended output: how long are you expecting the vectors to be? If 12 you will need to sometimes summarise the values for each month from multiple years - how? Mean? – Andrea M Jun 26 '22 at 08:39
  • I added a photo to my question! My ultimate goal is to calculate the mean, median, interquartile range, and 90th-10th percentile range for each month. I was thinking about creating one big vector for each month and then using `quantile()` with `c(.10, .25, .50, .75, .90)` to compute these. Thank you! – slowpoke Jun 26 '22 at 09:15
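
On the error itself: `if()` expects a single TRUE or FALSE, but `data2000$month == i` returns one value per row. Since R 4.2.0 that is an error (earlier versions only warned and used the first element). Below is a minimal base R sketch of how the per-month vectors could be built without any `if`. It assumes the columns are named `month`, `gdppi` and `cpi`, and it uses a made-up stand-in for `data2000`, not the real file:

```r
# Hypothetical stand-in for data2000: two observations per month
data2000 <- data.frame(
  month = rep(1:12, times = 2),
  gdppi = seq(1, 24),
  cpi   = seq(101, 124)
)

# data2000$month == 1 is a logical vector with one entry per row,
# so it works as a row filter but not as a single if() condition
jan_gdppi <- data2000$gdppi[data2000$month == 1]

# split() builds one vector per month in a single call
gdppi_by_month <- split(data2000$gdppi, data2000$month)
cpi_by_month   <- split(data2000$cpi, data2000$month)

# Apply quantile() to each month's vector; the result has one column per month
probs <- c(.10, .25, .50, .75, .90)
gdppi_quantiles <- sapply(gdppi_by_month, quantile, probs = probs)
```

`gdppi_by_month[["3"]]` would then hold the March values. Note also that `append()` returns a new vector rather than modifying its argument, so its result would have to be assigned back for a loop like the one in the question to accumulate anything.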

1 Answer

If you use the group_by() function, it takes care of subsetting your data for you:

library(dplyr)

data2000 <- data.frame(month = rep(c(1:12), times = 2), gdppi = runif(24)*100) # Dummy data

data2000 |>
  group_by(month) |> 
  summarise(mean = mean(gdppi), q10 = quantile(gdppi, probs = .10), q25 = quantile(gdppi, probs = .25)) # Add the other percentiles, as needed

This gives (your numbers will differ, because the dummy data are random):

# A tibble: 12 x 4
   month  mean   q10   q25
   <int> <dbl> <dbl> <dbl>
 1     1  12.5  3.44  6.83
 2     2  34.7  7.15 17.5 
 3     3  37.8 22.1  28.0 
 4     4  30.3 19.0  23.2 
 5     5  65.7 62.2  63.5 
 6     6  60.7 38.7  47.0 
 7     7  43.0 38.2  40.0 
 8     8  77.9 60.7  67.1 
 9     9  56.3 44.0  48.6 
10    10  53.1 19.6  32.2 
11    11  63.8 40.6  49.3 
12    12  59.0 49.2  52.9 

If you have both years and months, use group_by(year, month) instead.
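
For the full set of statistics asked about in the question (mean, median, interquartile range, and 90th-10th percentile range), something along these lines might work. It is only a sketch: the dummy data, the helper `p90_p10()` and the result name `monthly_stats` are made up for illustration, and the column names `year`, `month`, `gdppi` and `cpi` are assumed to match the real data.

```r
library(dplyr)

set.seed(1)  # only so the dummy data are reproducible
# Hypothetical stand-in for data2000: 3 years x 12 months x 5 rows per month
data2000 <- data.frame(
  year  = rep(2000:2002, each = 60),
  month = rep(rep(1:12, each = 5), times = 3),
  gdppi = runif(180) * 100,
  cpi   = runif(180) * 100
)

# Illustrative helper: distance between the 10th and 90th percentiles
p90_p10 <- function(x) diff(quantile(x, probs = c(.10, .90), names = FALSE))

monthly_stats <- data2000 |>
  group_by(year, month) |>
  summarise(
    # Apply every function to both columns; results are named like gdppi_mean
    across(c(gdppi, cpi),
           list(mean = mean, median = median, iqr = IQR, p90_p10 = p90_p10)),
    .groups = "drop"
  )
```

`monthly_stats` ends up with one row per year and month combination and columns such as `gdppi_median` and `cpi_p90_p10`. If the real data contain missing values, the functions would need `na.rm = TRUE`.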

Tech Commodities