0

I have a question about R programming. If I have two tables like:

Table_1

Name        Flag
AA          0
BB          1
CC          0
DD          1
EE          1
FF          1
GG          0
HH          1

and

Table_2

Name    Value   Flag
AA      20      0
BB      30      1
BB      50      1
CC      40      0
EE      30      1
EE      80      1
GG      20      0
HH      70      1
DD      50      1
FF      10      1

I need the output like Table_3 which is combination of Table_1 and Table_2 such as, whenever there is Flag 1 in Table_2 (common field in both tables) the value should be added to Table_1. The Name and Flag in both tables Table_1 and Table_2 will remain same, for ex. if AA has flag 0 in Table_1 then it will be the same in Table_2. In result table Table_3 rows must be same as Table_1 and values must be added to respective Name from Table_2.

Question: How can I do it? Is it possible to apply For loop (tables have large number of rows) or simply manipulation of tables.

Table_3

Name        Flag   Value
AA          0       0
BB          1       80
CC          0       0
DD          1       50
EE          1       110
FF          1       10
GG          0       0
HH          1       70
user213544
  • 2,046
  • 3
  • 22
  • 52
rjt
  • 39
  • 5
  • You don't need for loops here. Both can be done vectorized (in one operation). Look at `merge` and the `dplyr` package. Your task can be done with a cobination of `dplyr::group_by` and `dplyr::summarize`. – symbolrush Feb 13 '19 at 16:34

2 Answers2

0

Why do you actually need Table_1? We can generate your expected output with only Table_2.

The following does the trick:

df <- structure(list(Name = c("AA", "BB", "BB", "CC", "EE", "EE", "GG", 
                              "HH", "DD", "FF"), Value = c(20, 30, 50, 40, 30, 80, 20, 70, 
                                                           50, 10), Flag = c(0, 1, 1, 0, 1, 1, 0, 1, 1, 1)), .Names = c("Name", 
                                                                                                                        "Value", "Flag"), row.names = c(NA, -10L), spec = structure(list(
                                                                                                                          cols = structure(list(Name = structure(list(), class = c("collector_character", 
                                                                                                                                                                                   "collector")), `   Value` = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                           "collector")), `  Flag` = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                                                                                 "collector"))), .Names = c("Name", "   Value", "  Flag")), 
                                                                                                                          default = structure(list(), class = c("collector_guess", 
                                                                                                                                                                "collector"))), .Names = c("cols", "default"), class = "col_spec"), class = c("tbl_df", 
                                                                                                                                                                                                                                              "tbl", "data.frame"))
library(dplyr)
df %<>% group_by(Name) %>% summarise(Value = sum(Value * Flag))

Output is:

  Name Value Flag
1   AA     0    0
2   BB    80    1
3   CC     0    0
4   DD    50    1
5   EE   110    1
6   FF    10    1
7   GG     0    0
8   HH    70    1
symbolrush
  • 7,123
  • 1
  • 39
  • 67
0

Added your data for later use:

Table1 <- data.frame(Name = paste(LETTERS[seq(1,8,1)],LETTERS[seq(1,8,1)],sep = ""), 
    Flag = c(0,1,0,1,1,1,0,1))
Table2 <- data.frame(Name = c("AA","BB","BB","CC","EE","EE","GG","HH","DD","FF"), 
                     Value = c(20,30,50,40,30,80,20,70,50,10),
                     Flag = c(0,1,1,0,1,1,0,1,1,1))

Try using packages, the dplyr package for example is really mighty. Here are 2 ways to combine your tables: (Table 1 is not really needed for that since Table2 has all the information)

library(dplyr)

Table3.1 <- Table2 %>%
  group_by(Name) %>%
  summarise(Value = sum(Value*Flag))

Table3.2 <- Table2 %>%
  group_by(Name) %>%
summarise(Value = sum(Value[Flag == 1])) 

Result:

  Name  Value
  <fct>  <dbl>
1 AA         0
2 BB        80
3 CC         0
4 DD        50
5 EE       110
6 FF        10
7 GG         0
8 HH        70
Stefan
  • 61
  • 1
  • 3
  • Thank you very much. Code works fine, but I need both tables for further use in data processing That's why I need the result like in Table_3 which has similar columns like Table_1 has, or directly sum up values in Table_1. – rjt Feb 14 '19 at 08:50