Does anyone know how to get "basic.stats" (hierfstat), "wc" (hierfstat), and/or other hierfstat functions to work on a genind object that has NA values in its @pop slot? I can convert the genind to hierfstat format, but the other functions fail on the NA values:

Error in 1:sum(data[, 1] == i) : NA/NaN argument

I would like to keep the original data file intact, because it contains other information and categories in which the "NA" samples are included, and I would like to work from one reference file as much as possible. The NA samples are single representatives of their populations, so I do not want them used or shown in these analyses (I map the samples later too and don't want them on the map).


In short: I have a fasta data file (PvMtFas) made into a genind object (PvMtGen), with @pop added from another file (PvMtData). The column I am using contains some NA values (the file also contains loads of other data I use). These NA values appear to be stopping me from using the basic.stats and wc functions from hierfstat. Any easy solutions?

library(ape)        # read.FASTA
library(adegenet)   # DNAbin2genind
library(hierfstat)  # genind2hierfstat, basic.stats, wc
PvMtFas <- read.FASTA('~/Documents/PvMtFasta.fasta')
PvMtGen <- DNAbin2genind(PvMtFas, pop=NULL, exp.char=c("a","t","g","c"), polyThres=1/100)
PvMtData <- read.csv('~/Documents/PvMtData.csv')
PvMtGen$pop <- PvMtData$GeoCatLV29  #PvMtData$GeoCatLV29 contains NA for some sequences

PvMtHier <- genind2hierfstat(PvMtGen)  #this works fine
basic.stats(PvMtGen)  #this doesn't
wc(PvMtGen)  #this doesn't

Any suggestions are welcome! I looked into using "na.exclude", but that didn't work: I am not sure how to target the @pop slot within a genind object, and I would want the whole 'row' gone, i.e. not just the NA entries in @pop but also the samples they refer to.
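To make the "drop the whole row" idea concrete, here is a minimal sketch of one approach that might work: subsetting the genind object with a logical index built from its population factor, so the original object and data file stay untouched. It is not tested on the files above. The `nancycats` dataset shipped with adegenet is only a stand-in for PvMtGen, and the names `gen_all`, `gen_sub`, `pops`, and `keep` are made up for illustration.

```r
library(adegenet)
library(hierfstat)

# Stand-in data (assumption): nancycats replaces the real PvMtGen object.
data(nancycats)
gen_all <- nancycats

# Mimic the situation in the question: some individuals have NA as population.
pops <- as.character(pop(gen_all))
pops[1:5] <- NA
pop(gen_all) <- pops

# Keep only individuals with a known population. Subsetting a genind by row
# removes the genotypes together with their @pop entries; gen_all is unchanged.
keep <- !is.na(pop(gen_all))
gen_sub <- gen_all[keep, ]

# Run the hierfstat functions on the NA-free copy only.
hier_sub <- genind2hierfstat(gen_sub)
bs <- basic.stats(hier_sub)
fst <- wc(hier_sub)
```

With the objects from the question, the equivalent step would presumably be `PvMtGen[!is.na(pop(PvMtGen)), ]`, with the full PvMtGen kept for everything else.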

Neeraij
  • Could you work on making your example [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – Aurèle Oct 26 '17 at 14:28
  • It's not that hard really. Grab a genetic sequence fasta file (nucleotide version), and make a csv file with the sample names in one column (stick to the same order as in the fasta file), and population information at different levels of geography in different columns after it. Replace some of the geographic identifiers with NAs. (sorry, I have example files but it's a faff to get them up here, and I don't know how to put in a picture to show you an example. And I'm not fast at coding...) – Neeraij Oct 27 '17 at 02:42
