You can do this fairly easily with the tidyverse package:
Load tidyverse here:
library(tidyverse)
Then I'll just recreate your data frame here:
Dataframe
id <- c("gt3dskh",
"gt3dskh",
"olgsjnv",
"abg56ji",
"abg56ji")
score <- c(4,3,5,4,1)
df <- data.frame(id,score)
Finally, you just need to use the this code to group each id and then summarize the mean for each.
Summarize scores:
df %>%
group_by(id) %>%
summarize(mean = mean(score))
This should produce this output:
id mean
<chr> <dbl>
1 abg56ji 2.5
2 gt3dskh 3.5
3 olgsjnv 5
Observing a large dataset
Also just to piggyback off your issue of dealing with large datasets...there are multiple ways to get a cursory look at the data.
Here is one:
glimpse(df)
Which gives you this output showing the rows, columns, variables, and values attributed:
Rows: 5
Columns: 2
$ id <chr> "gt3dskh", "gt3dskh", "olgsjnv", "abg56ji", "abg56ji"
$ score <dbl> 4, 3, 5, 4, 1
There is also a nice function from the skimr library that can be helpful for summarizing a lot of data.
library(skimr)
skim(df)
Which gives you a lot of valuable output as well if you want it:
-- Data Summary ------------------------
Values
Name df
Number of rows 5
Number of columns 2
_______________________
Column type frequency:
character 1
numeric 1
________________________
Group variables None
-- Variable type: character ---------------------------------------------
# A tibble: 1 x 8
skim_variable n_missing complete_rate min max empty n_unique
* <chr> <int> <dbl> <int> <int> <int> <int>
1 id 0 1 7 7 0 3
whitespace
* <int>
1 0
-- Variable type: numeric -----------------------------------------------
# A tibble: 1 x 11
skim_variable n_missing complete_rate mean sd p0 p25 p50
* <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 score 0 1 3.4 1.52 1 3 4
p75 p100 hist
* <dbl> <dbl> <chr>
1 4 5 ▃▁▃▇▃