
I have the following datasets:

dataset1:

Class    Value
Yo       53
Save     13
Gold     72
Post     88

dataset2:

Class   Total_goals
Yo       9
Yo       9
Yo       9
Save     4
Save     4
Gold     7
Gold     7
Gold     7
Gold     7
Post     3
Post     3

I want to add the Total_goals value for each class from dataset2 to dataset1.

The expected output is:

Class    Value     Total_goals
Yo       53        9
Save     13        4
Gold     72        7
Post     88        3

How can I do that?

Devon Oliver
Adam Amin

1 Answer

Use cbind. It works whether or not the two datasets share a matching variable, but it pairs rows purely by position, so it assumes both data frames have the same number of rows and that the rows are in the same order.

Create your dataframes:

# Name the columns when building the data frames; class labels match the question
dataset1 <- data.frame(Class = c("Yo", "Save", "Gold", "Post"),
                       Value = c(53, 13, 72, 88))

dataset2 <- data.frame(Class = c("Yo", "Save", "Gold", "Post"),
                       Total_goals = c(9, 4, 7, 3))

Answer:

# Naming the argument gives the new column its name directly
dataset1 <- cbind(dataset1, Total_goals = dataset2$Total_goals)
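
Because cbind pairs rows by position only, it can help to line the rows up and check the assumption before binding. The following is a minimal sketch in base R, assuming each Class appears exactly once in both data frames; the shuffled dataset2 and the name `combined` are illustrative and not part of the original answer.

```r
dataset1 <- data.frame(Class = c("Yo", "Save", "Gold", "Post"),
                       Value = c(53, 13, 72, 88))
# Deliberately in a different row order to show the problem
dataset2 <- data.frame(Class = c("Post", "Gold", "Save", "Yo"),
                       Total_goals = c(3, 7, 4, 9))

# Reorder dataset2 so its Class column lines up with dataset1
dataset2 <- dataset2[match(dataset1$Class, dataset2$Class), ]

# Stop with an error if the rows still do not correspond
stopifnot(identical(as.character(dataset1$Class),
                    as.character(dataset2$Class)))

combined <- cbind(dataset1, Total_goals = dataset2$Total_goals)
```

If the check fails, the data frames do not contain the same classes, and a positional cbind would attach values to the wrong rows.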

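Edit: updated to handle the additional information (duplicate rows in the second data frame). This approach requires a matching variable.

Solution if the data frames have different numbers of rows because one contains duplicate rows. Note that match returns the position of the first matching row only, so this assumes every duplicate of a class carries the same Total_goals value.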

Create your dataframes:

# dataset2 has more rows than dataset1 because some classes are repeated
dataset1 <- data.frame(Class = c("Yo", "Save", "Gold", "Post"),
                       Value = c(53, 13, 72, 88))

dataset2 <- data.frame(Class = c("Yo", "Save", "Gold", "Post", "Post", "Gold"),
                       Total_goals = c(9, 4, 7, 3, 3, 7))

Answer:

# For each class in dataset1, take Total_goals from the first matching row of dataset2
dataset1$Total_goals <- dataset2$Total_goals[match(dataset1$Class, dataset2$Class)]
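
An alternative sketch uses base R's merge. It assumes every repeated Class in dataset2 carries the same Total_goals value, so that unique() leaves one row per class. The data mirror the question; `lookup` and `result` are illustrative names, not part of the original answer.

```r
dataset1 <- data.frame(Class = c("Yo", "Save", "Gold", "Post"),
                       Value = c(53, 13, 72, 88))
dataset2 <- data.frame(
  Class       = rep(c("Yo", "Save", "Gold", "Post"), times = c(3, 2, 4, 2)),
  Total_goals = rep(c(9, 4, 7, 3), times = c(3, 2, 4, 2))
)

# Collapse the repeated rows to one row per Class
lookup <- unique(dataset2)

# Left join on Class; sort = FALSE avoids sorting the result by Class
result <- merge(dataset1, lookup, by = "Class", all.x = TRUE, sort = FALSE)

# merge does not guarantee the original row order, so restore it explicitly
result <- result[match(dataset1$Class, result$Class), ]
```

Unlike match, which silently takes the first matching row, this approach yields extra rows when a class has conflicting Total_goals values, which makes inconsistent data easier to notice.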
Devon Oliver
  • Makes a very dangerous assumption that the rows are in the same order in both data sets. – Gregor Thomas Jan 18 '19 at 16:24
  • "dangerous" is a bit of an overstatement, but an assumption, yes. As long as the observation levels correspond in both data sets you could use the arrange function from dplyr to sort them into the same order. Edited answer to reflect assumption. – Devon Oliver Jan 18 '19 at 16:28
  • It doesn't work when there is duplication in the second dataset. I added more information to the question to explain. – Adam Amin Jan 18 '19 at 16:32
  • I appreciate you editing the assumption into your question. I guess I would correct my earlier statement to be a "dangerous recommendation" rather than a "dangerous assumption" - not stating a strong assumption is dangerous. Now that the assumption is stated, it's a solution that explains when it can be used, so no longer dangerous. – Gregor Thomas Jan 18 '19 at 16:38
  • @Adam Amin, I have edited the answer to include a solution that handles the additional requirement arising from duplication in the second dataset. Hopefully, this meets your needs. – Devon Oliver Jan 18 '19 at 18:16
  • @AdamAmin, if the edited answer solves your problem, could you please accept it? – Devon Oliver Jan 23 '19 at 02:08