I have this data frame:
df <- structure(list(`Prediction (Ge)` = c("Paranthropus", "Paranthropus",
"Homo", "Paranthropus", "Australopithecus", "Paranthropus", "Paranthropus",
"Australopithecus", "Paranthropus", "Australopithecus", "Paranthropus",
"Australopithecus", "Australopithecus", "Australopithecus", "Australopithecus",
"Paranthropus", "Homo", "Australopithecus", "Paranthropus", "Paranthropus",
"Paranthropus", "Paranthropus", "Australopithecus", "Paranthropus",
"Australopithecus", "Paranthropus", "Australopithecus"), `Prediction (Sp)` = c("Australopithecus africanus",
"Paranthropus robustus", "Paranthropus boisei", "Paranthropus robustus",
"Paranthropus robustus", "Paranthropus robustus", "Paranthropus robustus",
"Australopithecus afarensis", "Paranthropus boisei", "Paranthropus robustus",
"Paranthropus robustus", "Paranthropus robustus", "Australopithecus afarensis",
"Australopithecus afarensis", "Australopithecus afarensis", "Paranthropus robustus",
"Homo habilis", "Australopithecus afarensis", "Paranthropus robustus",
"Paranthropus boisei", "Paranthropus boisei", "Paranthropus robustus",
"Australopithecus afarensis", "Paranthropus robustus", "Australopithecus afarensis",
"Paranthropus robustus", "Australopithecus afarensis")), row.names = c(2L,
3L, 6L, 7L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 19L, 20L, 26L,
27L, 28L, 29L, 30L, 31L, 32L, 34L, 35L, 37L, 38L, 42L, 46L, 47L
), class = "data.frame", na.action = structure(c(`1` = 1L, `4` = 4L,
`5` = 5L, `8` = 8L, `16` = 16L, `17` = 17L, `18` = 18L, `21` = 21L,
`22` = 22L, `23` = 23L, `24` = 24L, `25` = 25L, `33` = 33L, `36` = 36L,
`39` = 39L, `40` = 40L, `41` = 41L, `43` = 43L, `44` = 44L, `45` = 45L
), class = "omit"))
Calling head(df) shows what it looks like:
head(df)
    Prediction (Ge)            Prediction (Sp)
2      Paranthropus Australopithecus africanus
3      Paranthropus      Paranthropus robustus
6              Homo        Paranthropus boisei
7      Paranthropus      Paranthropus robustus
9  Australopithecus      Paranthropus robustus
10     Paranthropus      Paranthropus robustus
There are two columns, which come from two different predictions.
What I would like to know is whether the genus in the second column, Prediction (Sp), is the same as the genus in Prediction (Ge). In other words, the first word of Prediction (Sp) needs to be compared with the value in Prediction (Ge).
Looking only at the first six rows from head(df), 3 rows are identical (rows 3, 7 and 10) and 3 rows are different (rows 2, 6 and 9).
How can I do this in a single line of code, so that I get the total number of identical and different values?
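To make the comparison concrete, here is a rough sketch of what I have in mind, run on the first six rows only. It assumes base R's sub() to keep everything before the first space, and the names df6, genus_from_sp and same_genus are just illustrative. I am not sure this is the most idiomatic way to do it.

```r
# First six rows of the data frame, rebuilt here so the sketch runs on its own
df6 <- data.frame(
  `Prediction (Ge)` = c("Paranthropus", "Paranthropus", "Homo",
                        "Paranthropus", "Australopithecus", "Paranthropus"),
  `Prediction (Sp)` = c("Australopithecus africanus", "Paranthropus robustus",
                        "Paranthropus boisei", "Paranthropus robustus",
                        "Paranthropus robustus", "Paranthropus robustus"),
  row.names = c(2L, 3L, 6L, 7L, 9L, 10L),
  check.names = FALSE,        # keep the spaces and parentheses in the names
  stringsAsFactors = FALSE
)

# Genus = everything before the first space in the species prediction
genus_from_sp <- sub(" .*", "", df6$`Prediction (Sp)`)

# TRUE where both predictions agree on the genus
same_genus <- genus_from_sp == df6$`Prediction (Ge)`

# Counts of different (FALSE) and identical (TRUE) rows: 3 and 3 here
table(same_genus)
```

On the full data, the same idea should collapse to one line by replacing df6 with df: table(sub(" .*", "", df$`Prediction (Sp)`) == df$`Prediction (Ge)`).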