I'm trying to write a function that returns the nth largest group (or a single row, if only one exists) of a data frame. It sorts the unique values of a column and then uses the nth value to subset the rows of the data frame. I'm new to R and I'm in a data science postgraduate program. My function doesn't work: it returns only the column names, with zero rows. The hard-coded version, however, does work. I've posted my results below.
For background, I'm using a car database. I'm trying to return not just the nth largest value, but all rows of the data frame that correspond to the nth largest price in the price column. The hard-coded version works exactly how I want it to, but the function does not.
nth_largest_group <- function(data_frame, column, n) {
  target_col <- data_frame$column
  uni_sorted <- unique(sort(target_col, T))[n]
  nth_max <- data_frame[target_col == uni_sorted,]
  return(nth_max)
}
# [1] price brand model year title_status mileage color vin
# [9] lot state country condition
# <0 rows> (or 0-length row.names)
target_col <- US_Car_df$price
uni_sorted <- unique(sort(target_col, T))[1]
nth_max <- US_Car_df[target_col == uni_sorted, ]
print(nth_max)
# price brand model year title_status mileage color vin lot state country
# 503 84900 mercedes-benz sl-class 2017 clean vehicle 25302 silver wddjk7ea3hf044968 167607883 florida usa
# condition
# 503 2 days left
Data
# dput(head(US_Car_df))
US_Car_df <- structure(list(price = c(6300L, 2899L, 5350L, 25000L, 27700L,
5700L), brand = c("toyota", "ford", "dodge", "ford", "chevrolet",
"dodge"), model = c("cruiser", "se", "mpv", "door", "1500", "mpv"
), year = c(2008L, 2011L, 2018L, 2014L, 2018L, 2018L), title_status = c("clean vehicle",
"clean vehicle", "clean vehicle", "clean vehicle", "clean vehicle",
"clean vehicle"), mileage = c(274117L, 190552L, 39590L, 64146L,
6654L, 45561L), color = c("black", "silver", "silver", "blue",
"red", "white"), vin = c(" jtezu11f88k007763", " 2fmdk3gc4bbb02217",
" 3c4pdcgg5jt346413", " 1ftfw1et4efc23745", " 3gcpcrec2jg473991",
" 2c4rdgeg9jr237989"), lot = c(159348797L, 166951262L, 167655728L,
167753855L, 167763266L, 167655771L), state = c("new jersey",
"tennessee", "georgia", "virginia", "florida", "texas"), country = c(" usa",
" usa", " usa", " usa", " usa", " usa"), condition = c("10 days left",
"6 days left", "2 days left", "22 hours left", "22 hours left",
"2 days left")), row.names = c(NA, 6L), class = "data.frame")
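A likely cause of the difference between the function and the hard-coded version: `$` does not evaluate its right-hand side, so `data_frame$column` looks for a column literally named `column` and returns `NULL`, whereas `US_Car_df$price` names a real column. Below is a minimal sketch of one possible fix, assuming the column name is passed as a character string such as `"price"`. The `cars` data frame is a hypothetical stand-in used only for illustration, and `which()` is a swapped-in choice so that an out-of-range `n` gives zero rows instead of rows of `NA`.

```r
# Sketch only: assumes `column` is a character string such as "price".
nth_largest_group <- function(data_frame, column, n) {
  # `[[` evaluates `column` and uses its value as the column name,
  # unlike `$`, which would look for a column literally named "column".
  target_col <- data_frame[[column]]

  # nth largest distinct value; NA if n exceeds the number of distinct values.
  nth_value <- sort(unique(target_col), decreasing = TRUE)[n]

  # which() drops NA comparisons, so an out-of-range n yields zero rows.
  data_frame[which(target_col == nth_value), , drop = FALSE]
}

# Hypothetical data, not taken from the car database above.
cars <- data.frame(
  price = c(6300L, 2899L, 27700L, 25000L, 27700L),
  brand = c("toyota", "ford", "chevrolet", "ford", "dodge")
)

top    <- nth_largest_group(cars, "price", 1)   # both rows priced 27700
second <- nth_largest_group(cars, "price", 2)   # the single row priced 25000
print(top)
print(second)
```

With this signature the call on the real data would be `nth_largest_group(US_Car_df, "price", 1)`, with the column name quoted.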