1

I have one data set.This data set contain two columns first is column with ID and second is with VALUE.You can see code and table below :

DATA_TEST <- data.frame(
  ID = c("03740270423222","03740270423222","03740270423222","03740270423222","01380926325248","01380926325248","01380926325248"),
  VALUE = c("100","200","300","200","300","200","300"))

enter image description here

But here in table there are a lot of duplicate, so my intention is only to extract last value separably by each ID so final result should be like in table below:

enter image description here

So can anybody help me how to resolve this problem ?

Darren Tsai
  • 32,117
  • 5
  • 21
  • 51
silent_hunter
  • 2,224
  • 1
  • 12
  • 30

2 Answers2

2

A base R solution with aggregate() and tail()

aggregate(VALUE~ ID, DATA_TEST, tail, 1)

#               ID VALUE
# 1 01380926325248   300
# 2 03740270423222   200

or by dplyr package.

library(dplyr)

option 1: summarise() + last()

DATA_TEST %>%
  group_by(ID) %>%
  summarise(VALUE = last(VALUE))

option 2: slice_tail() <=> slice(n())

DATA_TEST %>%
  group_by(ID) %>%
  slice_tail()
Darren Tsai
  • 32,117
  • 5
  • 21
  • 51
1

In data.table:

DATA_TEST<-data.frame(
  ID=c("03740270423222","03740270423222","03740270423222","03740270423222","01380926325248","01380926325248","01380926325248"),
  VALUE=c("100","200","300","200","300","200","300")
)

library(data.table)

DT <- as.data.table(DATA_TEST)

DT[, .(VALUE = last(VALUE)), by = ID]

               ID VALUE
1: 03740270423222   200
2: 01380926325248   300
daniellga
  • 1,142
  • 6
  • 16