Combining rows with naming subsets

Question

I have a large data.frame (over 44,000 rows) which I have managed to aggregate into 128 rows that include my analysis Section and the sum of cells within that section presented as Count, an example of which looks like the following:

  Section   Count
  A - 2-1    10
  A - 2-2     2
  A - 2-3    39
  A - 2-4    31
  A - 2-5    42
  A - 2-6     1
  A - 2-7     9
  A - 2-8     4
  A - 2-9     2
  A - 3-1    52
  A - 3-2    27
  A - 3-3    42
  A - 3-4   107
  A - 3-5     6
  A - 3-6    51
  A - 3-7   155
  A - 3-8     5
  A - 3-9    13

I have looked at related topics, but they all deal with rows that share the same ID tag, which makes them easy to group. What I want to do is aggregate further, combining all of the Sections (e.g. A - 2) that have multiple fields (A - 2-1, A - 2-2, etc.) into a single total cell Count, so that it appears as follows:

  Section       Count
  A - 2           140
  A - 3           458

I've been trying various ways of doing this based on other threads but cannot get any to work, so any help would be greatly appreciated!

With the `dplyr` package: `your_data_frame %>% group_by(Section) %>% summarise(Count2=sum(Count))` – MikolajM, Jun 23 '17 at 06:20
Just realised that there is no separate column for the (fld x) thing, so this won't work. – MikolajM, Jun 23 '17 at 06:24
@MikolajM If there is some `(fld x)` thing which you need to ignore then `df %>% group_by(sub("\\(.*", "", Section)) %>% summarise(Count2=sum(Count))` should do it. – Ronak Shah, Jun 23 '17 at 06:28
Nice. I was thinking about something like `spread(Section, sep=" ")` and then `unite()` on the first two columns, so that `group_by()` can work on the result. – MikolajM, Jun 23 '17 at 06:32

AK88 · Accepted Answer · 2017-06-23T09:09:22.047

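# Drop the last two characters (the "-1" suffix), so "A - 2-1" becomes "A - 2"
df$Section = as.character(df$Section)  # nchar() errors on factors
df$Section = substr(df$Section, 1, nchar(df$Section) - 2)
## df$Section = substr(df$Section, start = 1, stop = 5) # alt option: keep the first five characters
# Sum Count within each shortened Section
aggregate(df$Count, by = list(Section = df$Section), FUN = sum)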
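For reference, here is a self-contained sketch that can be run as-is against the sample rows from the question. It is a variation rather than the exact code above: it assumes every Section ends in a hyphen followed by digits, strips that suffix with a regular expression instead of a fixed character count, and stores the result in a new column (the name `Group` is just an illustrative choice).

```r
# Sample data copied from the question
df <- data.frame(
  Section = c(paste0("A - 2-", 1:9), paste0("A - 3-", 1:9)),
  Count   = c(10, 2, 39, 31, 42, 1, 9, 4, 2,
              52, 27, 42, 107, 6, 51, 155, 5, 13),
  stringsAsFactors = FALSE
)

# Remove the trailing "-<digits>" suffix to get the parent section,
# e.g. "A - 2-1" becomes "A - 2" (Group is a hypothetical column name)
df$Group <- sub("-[0-9]+$", "", df$Section)

# Sum Count within each parent section
totals <- aggregate(Count ~ Group, data = df, FUN = sum)
print(totals)
```

On this sample the totals come out as 140 for A - 2 and 458 for A - 3, matching the desired output in the question. The regular expression should also cope with suffixes longer than one digit (such as a hypothetical "A - 2-10"), where dropping a fixed two characters would not.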

edited Jun 23 '17 at 09:09

answered Jun 23 '17 at 06:20

AK88


Thanks for the response but this serves to give me the same output. `Frequency x 1 A - 2 (fld 1) 10 2 A - 2 (fld 2) 2 3 A - 2 (fld 3) 39` – ScienceJr Jun 23 '17 at 06:36
I modified my code for your original data and now I noted that you have a different `Section` column. See if it works now. – AK88 Jun 23 '17 at 09:04
You are an absolute legend thanks so much for that! – ScienceJr Jun 24 '17 at 01:46
