Group multiple rows in R

Question

I have an excel spread sheet with companies names and data regarding sales. Some companies are related and I want to group them together under the same row. I want each column in this row to have the sum of all related companies.

I'm using R, and would appreciate if someone can give me a direction on how to start.

Edit- Example of the data:

Company name	Jan sales	Jan expenses	Jan revenue
Company A	1000	500	500
Company B	2000	500	1500
Company C	3000	400	2600
Company D	4000	100	3900

I want to group all this companies under a single row by the name Company A.

I want it to look like this:

Company name	Jan sales	Jan expenses	Jan revenue
Company A	10000	1500	8500

It would be easier to help if you create a small reproducible example along with expected output. Read about [how to give a reproducible example](http://stackoverflow.com/questions/5963269). Also have you tried https://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group ? — Ronak Shah, Mar 25 '21 at 10:06
Does this answer your question? [How to sum a variable by group](https://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group) — Marcelo Avila, Mar 25 '21 at 14:06

evian_bottle · Answer 1 · 2021-03-31T08:44:38.823

0

So first of all you have to read your csv-file by using something like:

mycompany <- read.csv("C:/your path here/my_file.csv", sep=";")

To group some companies together you need an id variable that tells you which company is related to which other company. So to help you further it would be helpful if you could provide us an overview of your csv structure.

So give us the output of:

header(mycompany)

EDIT:

to get the above output you could do:

new.table <- c()
new.table <- data.frame(new.table)

for(i in 2:ncol(mycompany)){
  new.table[1,1] <- as.character(mycompany[1,1])
  new.table[1,i] <- sum(mycompany[,i])
}

new.table

edited Mar 31 '21 at 08:44

answered Mar 25 '21 at 10:12

evian_bottle

67
7

Hi, please see above the edit, I added an example of the data frame. thanks again! – jennyro Mar 25 '21 at 11:10
In general, how do you know which companies belong together? Is there a shared id among those or do you just "know" it by their names? So why do you know that companies B, C, and D belong to company A? – evian_bottle Mar 25 '21 at 12:15
No shared ID, it's something I just now. – jennyro Mar 31 '21 at 06:21
Can you explain a bit more about the code you wrote? What is supposed to be written inside c()? What do I need to change? – jennyro Mar 31 '21 at 06:22
So I just edited my post above. First import your csv to R (my first code field). Than go directly to my last code field and run it (no changes needed) – evian_bottle Mar 31 '21 at 08:38
The c() basically just creates an empty vector which then is transformed to an empty data frame. This empty data frame is then filled inside the for-loop with the new data – evian_bottle Mar 31 '21 at 08:43

AnilGoyal · Answer 2 · 2021-03-31T08:57:41.470

So if your related companies are stored in a vector say related, you can proceed as follows.

library(dplyr)

data <- read.table(text = "'Company name'   'Jan sales' 'Jan expenses'  'Jan revenue'
'Company A' 1000    500 500
'Company B' 2000    500 1500
'Company C' 3000    400 2600
'Company D' 4000    100 3900", header = T)

> data
  Company.name Jan.sales Jan.expenses Jan.revenue
1    Company A      1000          500         500
2    Company B      2000          500        1500
3    Company C      3000          400        2600
4    Company D      4000          100        3900

related <- c("Company A", "Company B", "Company C", "Company D")

data %>% group_by(Company = ifelse(Company.name %in% related, "related companies", Company.name)) %>% 
  summarise(across(starts_with("Jan"), ~sum(.)))

# A tibble: 1 x 4
  Company           Jan.sales Jan.expenses Jan.revenue
* <chr>                 <int>        <int>       <int>
1 related companies     10000         1500        8500

Group multiple rows in R

2 Answers2