The word "plot" is poorly chosen, because I do not want to plot anything, just make a table. I hope it is clearer when you look at my tables. Here is the description: I have about 20 populations and an Fst value for each pairwise comparison. I would like to have all populations in the first column and all 20 populations again in the first row, and then insert the Fst value into each cell according to which two populations are compared. All the data I need is in an Excel sheet that I've imported into R (see the example below; the formatting is a bit distorted, but I hope you still get the idea). Can anyone help with this? Nothing I've found by googling quite matches what I'm trying to do. I tried using dcast, but that didn't work either.

What my table looks like:

    Population.1    Population.2    Fst_mean
1   North           South           0.35960000
2   North           East            0.29542000
3   North           West            0.081191
4   North           Southwest       0.102930
5   North           Southeast       0.072594

What I would like to achieve in R:

            North     South       East        West        Southwest   Southeast
North       NA        0.35960000  0.29542000  0.081191    0.102930    0.072594
South       0.84837   NA          0.124200    0.743233    0.226137    0.364200
East        0.12530   0.384800    NA          0.126203    0.137389    etc.
West
Southwest
Southeast

What I tried:

dcast(Fst_data, Population.1 + Population.2 ~ Fst_mean)
MrFlick
diverSM

1 Answer

I think you want:

dcast(Fst_data, Population.1 ~ Population.2, value.var = "Fst_mean")

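Here's an example:

library(data.table)

# Toy long-format data: every (Var1, Var2) pair with a random value
df <- data.table(cbind(expand.grid(1:10, 10:1), value = rnorm(100)))

# Reshape to wide: one row per Var1, one column per Var2
dcast(df, Var1 ~ Var2, value.var = "value")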
James B
Thank you @James Bonkowski and @NelsonGon. However, it's not quite working the way I imagined. I get far too many cells with NA, unlike my example (see above). I think the problem is that it simply takes the populations in column 1 and compares them against those in column 2, and if there is no match, it inserts NA. However, these populations are compared; it's just that the name may be in the Population.2 column instead! Do you know what I mean? Is there a way to include that in this code, to tell R that if the population is in either column (Population.1 or Population.2), it should insert the value? – diverSM Jun 15 '19 at 06:28
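
One possible way to handle the situation described in this comment, as a minimal sketch rather than a definitive solution: if each pair is stored only once, in either order, the long table can be stacked with a copy of itself in which the two population columns are swapped, and the result passed to the same `dcast` call. The sketch assumes Fst is symmetric (the value for A vs. B equals the value for B vs. A) and that no pair is listed in both orders. The names `mirrored`, `both`, `pops` and `Fst_wide` are made up for illustration, and the South/East row is borrowed from the desired table in the question only to show a pair that does not start with "North".

```r
library(data.table)

# Rows 1-5 are the sample rows from the question; row 6 (South/East) is
# taken from the desired table purely for illustration.
Fst_data <- data.table(
  Population.1 = c("North", "North", "North", "North", "North", "South"),
  Population.2 = c("South", "East", "West", "Southwest", "Southeast", "East"),
  Fst_mean     = c(0.3596, 0.29542, 0.081191, 0.10293, 0.072594, 0.1242)
)

# Mirror every pair so that (A, B) also appears as (B, A).
mirrored <- Fst_data[, .(Population.1 = Population.2,
                         Population.2 = Population.1,
                         Fst_mean)]
both <- rbind(Fst_data, mirrored)

# Use factors with a fixed level order so rows and columns come out in the
# same order instead of being sorted alphabetically.
pops <- unique(c(Fst_data$Population.1, Fst_data$Population.2))
both[, Population.1 := factor(Population.1, levels = pops)]
both[, Population.2 := factor(Population.2, levels = pops)]

# Same reshape as in the answer; pairs that were never measured, and the
# diagonal, remain NA.
Fst_wide <- dcast(both, Population.1 ~ Population.2, value.var = "Fst_mean")
Fst_wide
```

If the imported Excel data is a plain data.frame rather than a data.table (an assumption, since the import step is not shown), calling `setDT(Fst_data)` first should make the code above apply unchanged. If some pairs are listed in both orders, `dcast` would find duplicates, so those rows would need to be removed or averaged before reshaping.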