0

While I am sure this has been asked before, reading through replies to similar threads still hasn't helped me make this work...

I have two tables, A and B. (This example is simplified, but it's ultimately what I would like to be able to do with two tables that are 16000rows x 390columns and 15800rows x 270columns)

Row names for A are (Arts, History, Maths, etc..., French, HomeEc) while row names for B are (Arts, History, Maths, etc..., Spanish, Photography).

Column names for each table are 20 different students (A: Student1-Student20; B: Student21-Student40)

Values in each table are test scores.

How do I combine these two tables in R, such that the final table contains row names (Arts, History, Maths, etc..., French, HomeEc, Spanish, Photography) without duplications, and the values in each column are the corresponding test score of Student1-40 (or 0, when that row was not present in the original table)?

Thank you so much!

Frank
  • 66,179
  • 8
  • 96
  • 180
  • 1
    Could you please provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), and explain what is unique about your dataframes that you cannot solve this problem using info in the [many existing questions/answers about merging dataframes](https://stackoverflow.com/search?q=%5Br%5Dmerge)? – Jan Boyer Jul 18 '18 at 21:26

1 Answers1

0

So you have 1 set of results (we'll call it df)

df<-data.frame(stringsAsFactors=FALSE,
Subject = c("Arts", "History", "Maths", "French", "HomeEc"),
student_1 = c(50L, 51L, 52L, 53L, 54L),
student_2 = c(67L, 68L, 69L, 70L, 71L),
student_3 = c(82L, 83L, 84L, 85L, 86L))


   Subject student_1 student_2 student_3
1    Arts        50        67        82
2 History        51        68        83
3   Maths        52        69        84
4  French        53        70        85
5  HomeEc        54        71        86

And another set of results (df2)

df2<-data.frame(stringsAsFactors=FALSE,
Subject = c("Arts", "History", "Maths", "Spanish", "Photography"),
student_4 = c(40L, 41L, 42L, 43L, 44L),
student_5 = c(71L, 72L, 73L, 74L, 75L),
student_6 = c(92L, 93L, 94L, 95L, 96L)
)

      Subject student_4 student_5 student_6
1        Arts        40        71        92
2     History        41        72        93
3       Maths        42        73        94
4     Spanish        43        74        95
5 Photography        44        75        96

Combining them as so results<-dplyr::full_join(df, df2, by "Subject") gives you

      Subject student_1 student_2 student_3 student_4 student_5 student_6
1        Arts        50        67        82        40        71        92
2     History        51        68        83        41        72        93
3       Maths        52        69        84        42        73        94
4      French        53        70        85        NA        NA        NA
5      HomeEc        54        71        86        NA        NA        NA
6     Spanish        NA        NA        NA        43        74        95
7 Photography        NA        NA        NA        44        75        96
Emily Kothe
  • 842
  • 1
  • 6
  • 17
  • This is so clear, thank you!!! And thank you for taking the time to format your answer so neatly, it makes sense to me now finally! I really appreciate it! – pleasehelpme Jul 19 '18 at 22:18