I have a data.frame object in R and need to:

  1. Group by col_1
  2. Within each group, select the col_3 value from the row whose col_2 value is the second largest (if there is only one observation for a given value of col_1, return NA, for instance).

How can I obtain this?

Example:

  scored      xg first_goal scored_mane
1      1 1.03212    Lallana           0
2      1 2.06000       Mane           1
3      2 2.38824  Robertson           1
4      2 1.64291       Mane           1

Group by "scored_mane" and, within each group, return the value of "scored" from the row where "xg" is the second largest. Expected output: NA, 1

ThomasIsCoding
Akim Tsvigun
Hi Akim Tsvigun. Could you provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610)? That way you can help others to help you! – dario Feb 24 '20 at 14:06

1 Answer

You can try the following base R solution, using aggregate and merge:

res <- merge(aggregate(xg ~ scored_mane, df, function(v) sort(v, decreasing = TRUE)[2]),
             df, all.x = TRUE)[, "scored"]

which gives

> res
[1] NA  1
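
If it helps to see what the one-liner is doing, it can be unpacked into two steps. This is only a sketch of the same idea on the example data; the names second_largest, second_xg and merged are hypothetical and introduced here for illustration.

```r
# Example data (same values as in the DATA section)
df <- data.frame(
  scored = c(1L, 1L, 2L, 2L),
  xg = c(1.03212, 2.06, 2.38824, 1.64291),
  first_goal = c("Lallana", "Mane", "Robertson", "Mane"),
  scored_mane = c(0L, 1L, 1L, 1L)
)

# Hypothetical helper: second-largest value of v,
# or NA when v has fewer than two values
second_largest <- function(v) sort(v, decreasing = TRUE)[2]

# Step 1: one row per scored_mane group with its second-largest xg
second_xg <- aggregate(xg ~ scored_mane, data = df, FUN = second_largest)

# Step 2: left join back onto df using the shared columns
# (scored_mane and xg); all.x = TRUE keeps groups whose
# second-largest xg is NA, so they show up as NA in the result
merged <- merge(second_xg, df, all.x = TRUE)

res <- merged[, "scored"]
```

Here res should match the output shown above. One caveat, as far as I can tell: if several rows in a group share the second-largest xg, the merge step will match all of them, so the result can contain more than one value for that group.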

DATA

structure(list(scored = c(1L, 1L, 2L, 2L), xg = c(1.03212, 2.06, 
2.38824, 1.64291), first_goal = c("Lallana", "Mane", "Robertson", 
"Mane"), scored_mane = c(0L, 1L, 1L, 1L)), class = "data.frame", row.names = c("1", 
"2", "3", "4")) -> df
ThomasIsCoding