How to fill a data frame with reference data from another data frame when columns and rows share values?

Question

I have a raw data frame (called Raw) in R that looks like this:

SPECIES  SITE   StandType  unique_site_sp
<chr>    <chr>  <chr>      <chr>
AMCR     A03    A          A03AMCR
AMRE     A03    A          A03AMRE
AMRE     A04    A          A04AMRE

From this I made a data frame called 'COMP', which has the unique values of SPECIES as the headers of blank columns, plus a column holding the unique values of SITE.

It was made this way:

unique_site<-as.vector(unique(Raw$SITE))
unique_site

unique_sp<-as.vector(unique(Raw$SPECIES))
unique_sp

COMP<-data.frame(matrix(, nrow=length(unique_site),    ncol=length(unique_sp)))
x <- c(unique_sp)
colnames(COMP) <- x

COMP<-cbind(COMP,unique_site)
COMP

COMP looks like this

AMCR  AMRE  unique_site
NA    NA    A03
NA    NA    A04

Now I want to fill in the blank columns of COMP by referencing Raw. If some row of Raw has Raw$SPECIES equal to the name of a column in COMP and Raw$SITE equal to COMP$unique_site, the corresponding cell of COMP gets a 1; otherwise it gets a 0.

This would make COMP look like this:

AMCR  AMRE  unique_site
1     1     A03
0     1     A04

I am unfamiliar with this and am unsure where to start. I have already tried this

for (i in 1:length(unique_site))  {
  if(any(Raw$SPECIES == "AMCR") & (Raw$SITE=COMP$unique_site))
  COMP[i,1] = 1
  if(any(Raw$SPECIES == "AMRE") & (Raw$SITE=COMP$unique_site))
  COMP[i,2] = 1
}
else   {  

  COMP[i,j] = 0 }

Welcome to StackOverflow. It's very unclear what you want. Try to implement this on a smaller subset of your dataset so you can actually show the outputs. Moreover, Please read [how to make a reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so we can see your code and try to answer your question. — M--, Nov 20 '18 at 23:27

score 0 · Answer 1 · answered Nov 21 '18 at 07:55

Welcome to SO!

I guess you are looking for dcast. Here are two solutions: one using library(reshape2), which keeps the result as a data.frame, and another using library(data.table), which will probably be faster (useful for large datasets):

library(reshape2)
Raw <- data.frame(stringsAsFactors=FALSE,
                  SPECIES = c("AMCR", "AMRE", "AMRE"),
                  SITE = c("A03", "A03", "A04"),
                  StandType = c("A", "A", "A"),
                  unique_site_sp = c("A03AMCR", "A03AMRE", "A04AMRE")
)

COMP <- dcast(Raw, SITE ~ SPECIES, fun.aggregate=length, value.var="SPECIES")
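Note that fun.aggregate=length returns counts, so a species recorded more than once at the same site would get a value above 1 rather than the 1 asked for. The following is a minimal sketch of one way to clamp the counts to 0/1. The names Raw2, counts, COMP01 and species_cols are only illustrative, and the repeated AMRE/A04 row is a hypothetical addition to the toy data, not part of the original question:

```r
library(reshape2)

# Toy data from the question plus one hypothetical repeated record (AMRE at A04)
Raw2 <- data.frame(stringsAsFactors = FALSE,
                   SPECIES = c("AMCR", "AMRE", "AMRE", "AMRE"),
                   SITE    = c("A03", "A03", "A04", "A04"))

# Counts per site/species pair; AMRE at A04 is counted twice here
counts <- dcast(Raw2, SITE ~ SPECIES, fun.aggregate = length, value.var = "SPECIES")

# Turn every species column into a 0/1 presence flag, leaving SITE untouched
COMP01 <- counts
species_cols <- setdiff(names(COMP01), "SITE")
COMP01[species_cols] <- lapply(COMP01[species_cols], function(x) as.integer(x > 0))
```

With the original three-row Raw this step changes nothing, since every site/species pair occurs at most once.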


library(data.table)
Raw <- data.table(stringsAsFactors=FALSE,
          SPECIES = c("AMCR", "AMRE", "AMRE"),
             SITE = c("A03", "A03", "A04"),
        StandType = c("A", "A", "A"),
   unique_site_sp = c("A03AMCR", "A03AMRE", "A04AMRE")
)

COMPDT <- dcast.data.table(Raw, SITE ~ SPECIES, fun.aggregate=length, value.var="SPECIES")
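If you would rather not load a package, the same site-by-species layout can probably be obtained with base R's table(). This is only a sketch under the assumption that a 0/1 presence flag is what you want; tab, presence and COMP_base are illustrative names, not anything from the question:

```r
Raw <- data.frame(stringsAsFactors = FALSE,
                  SPECIES = c("AMCR", "AMRE", "AMRE"),
                  SITE    = c("A03", "A03", "A04"))

# Cross-tabulate sites (rows) against species (columns)
tab <- table(Raw$SITE, Raw$SPECIES)

# unclass() drops the "table" class so the result is a plain matrix;
# comparing with 0 and multiplying by 1L gives integer 0/1 flags
presence <- (unclass(tab) > 0) * 1L

# One column per species plus the site label, as in the question's COMP
COMP_base <- data.frame(presence,
                        unique_site = rownames(presence),
                        stringsAsFactors = FALSE)
rownames(COMP_base) <- NULL
```

Without the unclass() call, data.frame() would treat the table as a table object and return it in long format (one row per site/species pair) instead of the wide layout.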
