Subset of data frame where the first two and last two observations are examples of what I would like to resolve.

I have a data frame of about 35'000 observations. In about 5'000 cases (as exemplified by the first two and last two rows of the image), two observations share the same COD_DOM but have different values of RENDIMENTO. I would like to calculate the average RENDIMENTO for every COD_DOM that appears twice and keep only one observation per COD_DOM, carrying that average value.

oliver1902
    Something like `library(dplyr)`, `data %>% group_by(COD_DOM) %>% summarise(RENDIMENTO = mean(RENDIMENTO))`? – iago Aug 10 '21 at 10:02

1 Answer

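If your data.frame contains just these two columns, you should be able to use the following, where `df` stands for your data frame (any other columns would be dropped by `summarize`):

library(dplyr)

# Average RENDIMENTO within each COD_DOM; the result has one row per COD_DOM
new_df <- df %>%
     group_by(COD_DOM) %>%
     summarize(RENDIMENTO = mean(RENDIMENTO))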
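
To see the effect on a small case, here is a minimal sketch on toy data. The values below are made up for illustration and are not taken from the question; only the column names COD_DOM and RENDIMENTO come from it. A COD_DOM that appears once keeps its value, since the mean of a single number is that number.

```r
library(dplyr)

# Hypothetical toy data: COD_DOM 1 and 3 appear twice, COD_DOM 2 once
df <- data.frame(
  COD_DOM    = c(1, 1, 2, 3, 3),
  RENDIMENTO = c(100, 300, 250, 400, 600)
)

# Collapse to one row per COD_DOM, averaging RENDIMENTO within each group
new_df <- df %>%
  group_by(COD_DOM) %>%
  summarise(RENDIMENTO = mean(RENDIMENTO))

print(new_df)
```

Here the duplicated COD_DOM values 1 and 3 collapse to averages of 200 and 500, while COD_DOM 2 keeps 250. If RENDIMENTO can contain missing values, `mean(RENDIMENTO, na.rm = TRUE)` would ignore them instead of returning NA for that group.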
jmalston