I am stuck with a 'for' loop and would greatly appreciate some help. I have a dataframe called 'df' that includes the number of people per household (household_size), ranging from 0 (I replaced the missing values with 0) to 8, as well as the number of cars (car).

My aim is to write a short piece of code that computes the average number of cars for each household size.

I tried the following:

avg <- function(df){
    i <- df$household_size
    for (i in 0 : 8){
        print(mean(df$car))
    }
}

I'm pretty sure I'm missing something really basic here, but I don't know what.
Thanks everyone for your input.
I wouldn't have used a function for this. However, this is an exercise as part of an introductory coding with R module that specifically requires a for-loop.

tthh
  • Could you show us a sample of your df, maybe using the output of `dput()`? See https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – elielink Aug 23 '21 at 14:11
  • In general, we should not use loops for this kind of thing. It is much simpler to use a library such as `dplyr`. Try something like `df %>% group_by(household_size) %>% summarise(avg_cars = mean(car))`. – user438383 Aug 23 '21 at 14:11
  • @user438383, unfortunately it is part of an introductory coding with R module exercise which specifically asks for a for-loop. However, thanks a lot for your alternative solution. – tthh Aug 23 '21 at 14:18
  • You should specify the value of the household size inside the loop: `print(mean(df$car[df$household_size == i]))`. Also, you define `i` before the loop, then redefine it as the iterator, which is a little wasteful. – Vincent Guillemot Aug 23 '21 at 14:19
  • As a sidenote: I don't really see the point of doing this within a function. – Vincent Guillemot Aug 23 '21 at 14:20
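
Putting the comments above together, here is a minimal sketch of what the corrected function could look like. It assumes `df` has numeric columns `household_size` and `car`; the sample data is made up for illustration because the real data frame was not shown. It loops over the sizes that actually occur rather than `0:8`, since an absent size would give `NaN`.

```r
# Hypothetical sample data; the real df was not shown in the question
df <- data.frame(
  household_size = c(1, 1, 2, 2, 3),
  car            = c(0, 1, 1, 2, 2)
)

avg <- function(df) {
  sizes <- sort(unique(df$household_size))
  result <- numeric(length(sizes))
  names(result) <- sizes
  for (k in seq_along(sizes)) {
    # Keep only the car values whose household size matches this iteration
    result[k] <- mean(df$car[df$household_size == sizes[k]])
  }
  result
}

avg(df)
```

Returning a named vector instead of printing inside the loop keeps the results available for later use. Note that size 0 stands for the replaced missing values, so its mean describes households of unknown size.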

1 Answer

Here is a solution that prints the mean for each household size using a for loop. Let me know if it works.

for (i in unique(df$household_size)) {
  # Average number of cars among households of size i
  print(paste(i, ' : ', mean(df$car[df$household_size == i])))
}

As mentioned in a comment, I removed the function wrapper because I don't see the point of having it. But if a function is mandatory, you can use lapply, which in my view behaves a bit like a for loop:

lapply(unique(df$household_size), function(i) {
  # Returns one "size : mean" string per household size
  paste(i, ' : ', mean(df$car[df$household_size == i]))
})
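
One way to sanity-check the loop output is base R's `tapply`, which computes the same group means in a single call. This is only a sketch on made-up sample data, again assuming numeric columns `household_size` and `car`.

```r
# Hypothetical sample data for illustration only
df <- data.frame(
  household_size = c(1, 1, 2, 2, 3),
  car            = c(0, 1, 1, 2, 2)
)

# Mean of car within each household_size group
group_means <- tapply(df$car, df$household_size, mean)
print(group_means)
```

If the values printed by the loop differ from these, the subsetting inside the loop is the first place to look.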
elielink