3

Can I get some help here? Has anyone run into the following error in plink (whole genome association analysis toolset) while converting from the 'ped'/'map' format to its binary counterpart ('bed', 'bim', 'fam')? I am using Linux and plink v1.90b3j.

Error: Line 1 of .ped file has fewer tokens than expected.

I am using this command in a python script to run it over dozens of files:

plink --file S205 --out S205 --make-bed
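For reference, a loop of this kind might look like the sketch below. It is not the original script: the helper names `plink_command` and `convert_all` are hypothetical, and it assumes `plink` is on the PATH and that every prefix has a matching .ped/.map pair in the working directory.

```python
import subprocess


def plink_command(prefix):
    """Build the argument list that converts <prefix>.ped/.map to binary."""
    return ["plink", "--file", prefix, "--out", prefix, "--make-bed"]


def convert_all(prefixes, run=subprocess.run):
    """Run the conversion for every prefix and return the prefixes that failed.

    `run` is injectable so the loop can be exercised without plink installed.
    """
    failed = []
    for prefix in prefixes:
        result = run(plink_command(prefix), capture_output=True, text=True)
        # A non-zero exit status is treated as a failed conversion.
        if result.returncode != 0:
            failed.append(prefix)
    return failed
```

Calling `convert_all(["S204", "S205", "S209"])` would then return the list of prefixes whose conversion did not succeed.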

I get this error for only 2 of the 32 files in this case. These files look exactly like all the others, since they were all produced earlier by the same script. The family, paternal and maternal IDs and the sex are the same for all samples, and, as I said, the allelic information is written in exactly the same way as in the other 30 working files.

The good files work with any type of line ending (Unix, Windows, Mac). For the failing files, however, I noticed that changing the line endings to "Windows" changes the error to the following:

Error: Line 4009 of .bim file has fewer tokens than expected.

As an example, here are the first and last few columns of a working *.ped file (S209) and of a non-working one (S204).

S209 S209 0 0 1 1 C C C C T T T T ... G G G G G G 

S204 S204 0 0 1 1 T T T T G G G G ... G G G G C C 

Thanks! Daniel

Martin Prikryl
  • 3
    I found the problem. My 'ped' file did not have exactly the same number of genotypes as the 'map' file, because of low-quality bases. My script was skipping those SNPs and not outputting anything to the 'ped' file. Since the 'map' file was created from the GATK pileup file positions, all positions were transferred to the 'map' file, which produced the mismatch. It might be useful to leave this here, but it can be marked as solved. – Daniel Fernandes Jul 06 '15 at 15:57
  • 3
    Post this as an answer. – 4b0 May 30 '18 at 07:00

1 Answer

2

I found the problem. My 'ped' file did not have exactly the same number of genotypes as the 'map' file, because of low-quality bases. My script was skipping those SNPs and not outputting anything to the 'ped' file. Since the 'map' file was created from the GATK pileup file positions, all positions were transferred to the 'map' file, which produced the mismatch. It might be useful to leave this here, but it can be marked as solved.
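A quick way to catch this kind of mismatch before running plink is to compare token counts. The sketch below assumes the default text layout: six leading columns in the .ped file (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) followed by two allele tokens per variant, and one variant per line in the .map file. The function names are hypothetical and not part of plink.

```python
def count_map_variants(map_path):
    """Count the non-empty lines in the .map file (one variant per line)."""
    with open(map_path) as handle:
        return sum(1 for line in handle if line.strip())


def find_bad_ped_lines(ped_path, n_variants):
    """Return (line_number, token_count) for every .ped line whose token
    count differs from the expected 6 + 2 * n_variants."""
    expected = 6 + 2 * n_variants
    bad = []
    with open(ped_path) as handle:
        for number, line in enumerate(handle, start=1):
            tokens = line.split()
            # Blank lines are ignored; anything else must match exactly.
            if tokens and len(tokens) != expected:
                bad.append((number, len(tokens)))
    return bad
```

For example, `find_bad_ped_lines("S205.ped", count_map_variants("S205.map"))` returns an empty list when the two files agree, and otherwise lists the offending lines together with the number of tokens found on each.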