I have a very large data frame. Let df below represent it:

df <- as.data.frame(rbind(c("a",1,1,1),c("a",1,1,1),c("a",1,1,1),c("b",2,2,2),c("b",2,2,2),c("b",2,2,2)))
df
  V1 V2 V3 V4
1  a  1  1  1
2  a  1  1  1
3  a  1  1  1
4  b  2  2  2
5  b  2  2  2
6  b  2  2  2

I want to create a data frame like the one below from it, summing the numeric columns within each group of the first column:

  V1 V2 V3 V4
1  a  3  3  3
2  b  6  6  6

I have seen several similar posts here, but the answers, although very useful, require a vector of all possible values in the first column, and so on. My problem is that my dataset has about 3000 rows, so listing those values by hand is not practical.

How can I get this result in R?

user438383
Mathica

3 Answers

We could use group_by and summarise after applying type.convert(as.is = TRUE):

library(dplyr)
df %>% 
    type.convert(as.is=TRUE) %>% 
    group_by(V1) %>% 
    summarise(across(V2:V4, sum))

  V1       V2    V3    V4
  <chr> <int> <int> <int>
1 a         3     3     3
2 b         6     6     6
TarJae

We can use aggregate after converting the column types:

df <- type.convert(df, as.is = TRUE)
aggregate(.~ V1, df, FUN = sum)

-output

   V1 V2 V3 V4
1  a  3  3  3
2  b  6  6  6

NOTE: The OP created the data.frame from a matrix, and a matrix can hold only a single class, so every column ended up as character. Thus, do the type conversion first.
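To make the note above concrete, here is a minimal sketch that rebuilds the question's data and checks the column classes before and after type.convert. It assumes R 4.0 or later, where as.data.frame keeps strings as character rather than factor; the names classes_before, classes_after, df_num and result are only illustrative.

```r
# rbind() on mixed character/numeric vectors yields a character matrix,
# so every column of the resulting data frame is character (assumes R >= 4.0).
df <- as.data.frame(rbind(c("a", 1, 1, 1), c("a", 1, 1, 1), c("a", 1, 1, 1),
                          c("b", 2, 2, 2), c("b", 2, 2, 2), c("b", 2, 2, 2)))
classes_before <- sapply(df, class)

# type.convert() re-guesses each column's type; as.is = TRUE keeps
# non-numeric columns as character instead of turning them into factors.
df_num <- type.convert(df, as.is = TRUE)
classes_after <- sapply(df_num, class)

# With numeric columns in place, the grouped sum works without listing
# the distinct values of V1 in advance.
result <- aggregate(. ~ V1, df_num, FUN = sum)
print(classes_before)
print(classes_after)
print(result)
```

The groups are discovered from V1 itself, so the same call should work unchanged on the full 3000-row data.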

akrun

Another aggregate option, converting to numeric inside the summary function:

> aggregate(. ~ V1, df, function(x) sum(as.numeric(x)))
  V1 V2 V3 V4
1  a  3  3  3
2  b  6  6  6
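
As a further base R variant that is not taken from the answers here, rowsum can compute the same grouped sums. The sketch below assumes the columns have already been converted to numeric with type.convert; the names df_num and sums are only illustrative.

```r
# Rebuild the question's data and convert the character columns to numeric.
df <- as.data.frame(rbind(c("a", 1, 1, 1), c("a", 1, 1, 1), c("a", 1, 1, 1),
                          c("b", 2, 2, 2), c("b", 2, 2, 2), c("b", 2, 2, 2)))
df_num <- type.convert(df, as.is = TRUE)

# rowsum() adds up the rows of the numeric columns within each level of
# `group`; the group labels become the row names of the result.
sums <- rowsum(df_num[-1], group = df_num$V1)
print(sums)
```

Unlike aggregate, the grouping values end up as row names rather than as a V1 column, so an extra step is needed if a regular column is wanted.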
ThomasIsCoding