I have a data frame named employee with 100 rows, like this:

      Date          Name      ride     food   income   bonus    sallary
1   01 Jan 2020  Ludociel      10       6     330000        0   330000
2   01 Jan 2020 Estarossa      15       8     465000   100000   565000
3   01 Jan 2020   Tarmiel       8      10     420000   100000   520000
4   01 Jan 2020    Sariel       5       8     315000        0   315000
5   01 Jan 2020   Escanor      15       7     435000   100000   535000
6   01 Jan 2020       Ban      13       9     465000   100000   565000
7   01 Jan 2020  Meliodas       6      15     540000   100000   640000
8   01 Jan 2020      King      15      12     585000   100000   685000
9   01 Jan 2020   Zeldris      15      11     555000   100000   655000
10  01 Jan 2020     Rugal      15       6     405000   100000   505000
11  02 Jan 2020  Ludociel      14       6     390000   100000   490000
12  02 Jan 2020 Estarossa      12      14     600000   100000   700000
... 
100 10 Jan 2020     Rugal      13      10     495000   100000   595000

I want to find which employee has the highest total salary (the sallary column) from 1 Jan to 10 Jan. My expected output is a single character vector like this:

[1] "varName" is the highest with total sallary "varTotal_sallary"

I tried a for loop with an if clause, but it only returns the total for one name, and I would need a separate function like this for every name.

function_ludociel<-function(name, date, sallary){
total=integer()
  for(i in 100){
    if(date[i]=="01 Jan 2020" & name[i]=="Ludociel"){
      total=sum(sallary)
    }
  }
  return(total)
}
ludociel=function_ludociel(employee$name,employee$date,employee$sallary)

After that I planned to combine the results into one variable and use max(), but I know that is a silly way to code it.

Does anyone have a solution for this? Thank you very much.

1 Answer

  • Convert the Date column to the actual Date class.
  • Use aggregate to calculate each employee's total salary from 1st Jan to 10th Jan.
  • Select the row with the maximum total salary.
  • Print the result.
employee$Date <- as.Date(employee$Date, '%d %b %Y')
sub_data <- aggregate(sallary~Name, employee, 
                         subset = Date >= as.Date('2020-01-01') & 
                                  Date <= as.Date('2020-01-10'), sum)

max_data <- sub_data[which.max(sub_data$sallary), ]
sprintf('%s has the highest salary %d', max_data$Name, max_data$sallary)
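
To try the steps above without the full data, here is a minimal self-contained sketch. The five-row employee data frame is a hypothetical sample built from rows shown in the question, not the real 100-row data, and the message is worded to match the expected output from the question. It also assumes an English locale, since '%b' parses month abbreviations according to the current locale.

```r
# Hypothetical sample: a few rows copied from the question's table
employee <- data.frame(
  Date    = c("01 Jan 2020", "01 Jan 2020", "01 Jan 2020",
              "02 Jan 2020", "02 Jan 2020"),
  Name    = c("Ludociel", "Estarossa", "Tarmiel", "Ludociel", "Estarossa"),
  sallary = c(330000L, 565000L, 520000L, 490000L, 700000L),
  stringsAsFactors = FALSE
)

# Convert the text dates to Date class ('%b' assumes English month names)
employee$Date <- as.Date(employee$Date, format = "%d %b %Y")

# Total sallary per Name, restricted to 1 Jan through 10 Jan
sub_data <- aggregate(sallary ~ Name, data = employee, FUN = sum,
                      subset = Date >= as.Date("2020-01-01") &
                               Date <= as.Date("2020-01-10"))

# Keep the row with the largest total
max_data <- sub_data[which.max(sub_data$sallary), ]

# Build the message in the format requested in the question
msg <- sprintf("%s is the highest with total sallary %d",
               max_data$Name, max_data$sallary)
print(msg)
```

If the sallary column is stored as a non-whole double rather than an integer, '%d' will raise an error; '%.0f' is a safer format in that case.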
Ronak Shah