0

I have two datasets I would like to merge in R: one is a long catch dataset and the other is a small effort dataset. I would like to join these so that I can multiply values for the same years AND industry together. Eg, the small effort columns will be repeated many times over, as they are industry-wide characteristics. I think this is a very simple merge but am having trouble making it work!

  Catch <- data.frame(
  Species = c("a", "a", "c", "c", "a", "b"),
  Industry= c( "ag", "fi", "ag", "fi",  "ag", "fi"  ),
  Year = c("1990", "1990", "1991", "1992", "1990", "1990"),
  Catch = c(0,1,4,7,5,6))


 Effort<-data.frame(
 Industry= c( "ag", "ag", "ag" , "fi", "fi", "fi"),
 Year = c("1990",  "1991", "1992", "1990",  "1991", "1992"),
 Effort = c(0,1,4,7,5,6))

What I have tried so far:

  effort_catch<-merge(Effort, Catch , by.x= Year, by.y=Year )
Alyssa C
  • 79
  • 8
  • 2
    what if you put the `Year` variable in quotes ? – Mike Sep 17 '20 at 14:35
  • Seems like you probably want to merge on both Year and Industry? Since the column names are the same in both data frames, `merge(Catch, Effort)` should do the trick, or `merge(Catch, Effort, by = c("Year", "Industry"))` if you want to be explicit about it, perhaps with the `all = TRUE` arg if you want to keep the rows without matches. – Gregor Thomas Sep 17 '20 at 15:05

2 Answers2

3

I am not sure which one is what you need

transform(
  merge(Catch, Effort, by = c("Industry", "Year"), all.x = TRUE),
  prod = Catch * Effort
)

  Industry Year Species Catch Effort prod
1       ag 1990       a     0      0    0
2       ag 1990       a     5      0    0
3       ag 1991       c     4      1    4
4       fi 1990       a     1      7    7
5       fi 1990       b     6      7   42
6       fi 1992       c     7      6   42

or

transform(
  merge(Catch, Effort, by = c("Industry", "Year"), all = TRUE),
  prod = Catch * Effort
)

  Industry Year Species Catch Effort prod
1       ag 1990       a     0      0    0
2       ag 1990       a     5      0    0
3       ag 1991       c     4      1    4
4       ag 1992    <NA>    NA      4   NA
5       fi 1990       a     1      7    7
6       fi 1990       b     6      7   42
7       fi 1991    <NA>    NA      5   NA
8       fi 1992       c     7      6   42
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
1

Here's a solution using dplyr

library(dplyr)

full_join(Catch, Effort) %>%
   mutate(Multiplied = Catch * Effort)
#> Joining, by = c("Industry", "Year")
#>   Species Industry Year Catch Effort Multiplied
#> 1       a       ag 1990     0      0          0
#> 2       a       fi 1990     1      7          7
#> 3       c       ag 1991     4      1          4
#> 4       c       fi 1992     7      6         42
#> 5       a       ag 1990     5      0          0
#> 6       b       fi 1990     6      7         42
#> 7    <NA>       ag 1992    NA      4         NA
#> 8    <NA>       fi 1991    NA      5         NA

Based on your provided data...

Catch <- data.frame(
   Species = c("a", "a", "c", "c", "a", "b"),
   Industry= c( "ag", "fi", "ag", "fi",  "ag", "fi"  ),
   Year = c("1990", "1990", "1991", "1992", "1990", "1990"),
   Catch = c(0,1,4,7,5,6))


Effort<-data.frame(
   Industry= c( "ag", "ag", "ag" , "fi", "fi", "fi"),
   Year = c("1990",  "1991", "1992", "1990",  "1991", "1992"),
   Effort = c(0,1,4,7,5,6))


Chuck P
  • 3,862
  • 3
  • 9
  • 20