
I have data with two columns, community number and species:

com <- paste(c("1", "1", "1", "1", "2", "2", "3","3","3", "4", "4", "5", "5"))
species <- paste(c("sp1", "sp1", "sp2", "sp4", "sp3", "sp1", "sp5", "sp2","sp2", "sp3","sp3", "sp5", "sp1" ))
data <- as.data.frame(cbind(com, species))
data
   com species
    1     sp1
    1     sp1
    1     sp2
    1     sp4
    2     sp3
    2     sp1
    3     sp5
    3     sp2
    3     sp2
    4     sp3
    4     sp3
    5     sp5
    5     sp1 

What I would like to have is a community matrix like this:

  sp1 sp2 sp3 sp4 sp5
1   2   1   0   1   0
2   1   0   1   0   0
3   0   2   0   0   1
4   0   0   2   0   0
5   1   0   0   0   1

Thanks in advance!

  • `as.data.frame(cbind(com, species))` is an anti-pattern. `data.frame(com, species)` is much better. `cbind` will convert everything to the same data type, whereas using `data.frame` directly lets you keep your columns as different types. – Gregor Thomas Aug 26 '20 at 16:54
  • Try `table(data)` – Henrik Aug 26 '20 at 20:08
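
The `table()` suggestion in the comment above can be sketched in base R as follows. This is a minimal illustration, not part of the original comment: the names `community` and `community_df` are made up here, and the data frame is built with `data.frame()` directly, as the first comment recommends.

```r
# Rebuild the example data without the cbind() round trip
com <- c("1", "1", "1", "1", "2", "2", "3", "3", "3", "4", "4", "5", "5")
species <- c("sp1", "sp1", "sp2", "sp4", "sp3", "sp1", "sp5",
             "sp2", "sp2", "sp3", "sp3", "sp5", "sp1")
data <- data.frame(com, species)

# Cross-tabulate: rows are communities, columns are species, cells are counts
community <- table(data)

# Optionally convert the contingency table to a plain data frame
# with communities as row names and one column per species
community_df <- as.data.frame.matrix(community)
community_df
```

`table()` fills absent combinations with 0 and orders the species columns alphabetically, so no extra reshaping step should be needed for this layout.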

1 Answer


I would suggest a tidyverse approach that reshapes your data:

library(tidyverse)

#Data
com <- paste(c("1", "1", "1", "1", "2", "2", "3","3","3", "4", "4", "5", "5"))
species <- paste(c("sp1", "sp1", "sp2", "sp4", "sp3", "sp1", "sp5", "sp2","sp2", "sp3","sp3", "sp5", "sp1" ))
data <- as.data.frame(cbind(com, species))

#Reshape
data %>% pivot_longer(cols = -com) %>%
  group_by(com,value) %>% summarise(N=n()) %>%
  pivot_wider(names_from = value, values_from=N) %>%
  replace(is.na(.),0)

Output:

# A tibble: 5 x 6
# Groups:   com [5]
  com     sp1   sp2   sp4   sp3   sp5
  <fct> <int> <int> <int> <int> <int>
1 1         2     1     1     0     0
2 2         1     0     0     1     0
3 3         0     2     0     0     1
4 4         0     0     0     2     0
5 5         1     0     0     0     1

If you also want the species columns sorted by name, you can use the following code:

data %>% pivot_longer(cols = -com) %>%
  arrange(value) %>%
  # mutate(value = factor(value,levels = sort(unique(value)),ordered = T)) %>%
  group_by(com,value) %>% summarise(N=n()) %>%
  pivot_wider(names_from = value, values_from=N) %>%
  replace(is.na(.),0) %>%
  # current_vars() is deprecated; sort the column names directly instead
  select(sort(names(.)))

Output:

# A tibble: 5 x 6
# Groups:   com [5]
  com     sp1   sp2   sp3   sp4   sp5
  <fct> <int> <int> <int> <int> <int>
1 1         2     1     0     1     0
2 2         1     0     1     0     0
3 3         0     2     0     0     1
4 4         0     0     2     0     0
5 5         1     0     0     0     1
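
A more compact variant of the same reshaping is sketched below. It is an illustration rather than part of the original answer, and it assumes tidyr 1.1.0 or later, where `pivot_wider()` accepts `values_fill` and `names_sort`. The name `wide` is introduced here only for the example.

```r
library(dplyr)
library(tidyr)

# Same example data, built directly with data.frame()
data <- data.frame(
  com = c("1", "1", "1", "1", "2", "2", "3", "3", "3", "4", "4", "5", "5"),
  species = c("sp1", "sp1", "sp2", "sp4", "sp3", "sp1", "sp5",
              "sp2", "sp2", "sp3", "sp3", "sp5", "sp1")
)

wide <- data %>%
  count(com, species) %>%          # one row per community/species pair, count in n
  pivot_wider(
    names_from = species,
    values_from = n,
    values_fill = 0,               # absent pairs become 0 instead of NA
    names_sort = TRUE              # species columns in alphabetical order
  )
wide
```

Here `count()` replaces the `group_by()` plus `summarise()` step, and the two `pivot_wider()` arguments stand in for the separate `replace()` and `select()` calls.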
Duck