I have a data.frame with duplicate values in the variable "Time":

> data.old
             Time  Count  Direction
1    100000630955     95          1
2    100000637570      5          0
3    100001330144      7          1
4    100001330144     33          1
5    100001331413     39          0
6    100001331413     43          0
7    100001334038      1          1
8    100001357594     50          0

I need to keep one row per unique "Time" value and sum the "Count" values of the duplicated rows, i.e.

> data.new
             Time  Count  Direction
1    100000630955     95          1
2    100000637570      5          0
3    100001330144     40          1
4    100001331413     82          0
5    100001334038      1          1
6    100001357594     50          0

All I have managed so far is to find the unique values with the command

> data.old$Time[!duplicated(data.old$Time)]
   [1] 100000630955 100000637570 100001330144 100001331413 100001334038 100001357594  

I can do this in a loop, but maybe there is a more elegant solution?

Dmitry

3 Answers

Here's one approach using dplyr. Is this what you want to do?

library(tidyverse)
data.old %>%
  group_by(Time) %>%
  summarise(Count = sum(Count))

Edit: Keeping other variables

OP has indicated a desire to keep the values of other variables in the dataframe, which summarise deletes. Assuming that all values of those other variables are the same for all the rows being summarised, you could use the Mode function from this SO question.

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

Then change my answer to the following, with one call to Mode for each variable you want kept. This works with both numeric and character data.

library(tidyverse)
data.old %>%
  group_by(Time) %>%
  summarise(Count = sum(Count), Direction = Mode(Direction))
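
As a quick sanity check of the Mode helper, here is a minimal sketch on three made-up vectors (the inputs are illustrative assumptions, not data from the question). It shows that the helper handles numeric and character input, and that on a tie it returns the value that appears first.

```r
# Mode helper as defined above: the most frequent value of x
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Illustrative inputs (hypothetical, not from the question)
Mode(c(1, 1, 0))        # numeric input
Mode(c("a", "b", "b"))  # character input
Mode(c(0, 1))           # tie: the first value encountered wins
```

If Direction is not constant within a Time group, the summarised row therefore carries the most common Direction, not an error or a missing value.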
Andrew Brēza

Here is one using the aggregate function:

data.new <- aggregate(Count ~ Time, data = data.old, FUN = sum, na.rm = TRUE)
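
Note that the call above returns only Time and Count. A possible way to keep Direction in base R is to add it to the right-hand side of the formula. The sketch below rebuilds data.old from the question and assumes Direction is constant within each Time; if it were not, that Time would produce one row per (Time, Direction) pair.

```r
# Data copied from the question
data.old <- data.frame(
  Time = c(100000630955, 100000637570, 100001330144, 100001330144,
           100001331413, 100001331413, 100001334038, 100001357594),
  Count = c(95, 5, 7, 33, 39, 43, 1, 50),
  Direction = c(1, 0, 1, 1, 0, 0, 1, 0)
)

# Group by both Time and Direction so that Direction survives the aggregation
data.new <- aggregate(Count ~ Time + Direction, data = data.old, FUN = sum)

# aggregate() orders rows by the grouping columns; restore Time order
# and the original column order
data.new <- data.new[order(data.new$Time), c("Time", "Count", "Direction")]
rownames(data.new) <- NULL
data.new
```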
RAVI TEJA M

library(dplyr)
data.old %>%
  group_by(Time) %>%
  summarise(Count = sum(Count), Direction = unique(Direction))

Of course, this assumes you want to keep the unique values of the Direction column.
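
Because unique() can return more than one value, a Time group with mixed Direction values would no longer collapse to a single row (how summarise handles that case depends on the dplyr version). A minimal base R sketch for checking the assumption beforehand, using the data from the question (the name n_dir is introduced here for illustration):

```r
# Data copied from the question
data.old <- data.frame(
  Time = c(100000630955, 100000637570, 100001330144, 100001330144,
           100001331413, 100001331413, 100001334038, 100001357594),
  Count = c(95, 5, 7, 33, 39, 43, 1, 50),
  Direction = c(1, 0, 1, 1, 0, 0, 1, 0)
)

# Number of distinct Direction values within each Time group
n_dir <- tapply(data.old$Direction, data.old$Time,
                function(x) length(unique(x)))

# TRUE when every Time maps to exactly one Direction
all(n_dir == 1)
```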

Megha John