With:
read.table(file="line.txt", na.strings = "-",
header=TRUE, stringsAsFactors=FALSE, fill=TRUE)
where "line.txt" is the name you gave to your tab-delimited text file.
Use fill=TRUE to allow for incomplete lines. From ?read.table:
fill logical. If TRUE then in case the rows have unequal length,
blank fields are implicitly added
na.strings a character vector of strings which are to be interpreted
as NA values. Blank fields are also considered to be missing values in
logical, integer, numeric and complex fields.
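As a minimal illustration of those two arguments, here is a sketch on a made-up three-row input. The column names id, a, b, c and all values are hypothetical and not taken from your data:

```r
# Toy input (hypothetical): each row is longer than the last,
# and "-" marks a value that is present in the file but missing.
toy <- read.table(text = "
id a b c
r1 1
r2 - 2
r3 3 - 4
", header = TRUE, na.strings = "-", fill = TRUE)

# Short rows are padded with blank fields by fill = TRUE, and
# "-" is read as NA because of na.strings = "-".
toy
```

Here the NA in column a of row r2 comes from na.strings, while the NA values at the end of rows r1 and r2 come from the blank fields that fill=TRUE adds.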
To use your sample input, instead of using file="line.txt", I am simply doing:
x <- read.table(text='
position SNP rs11828013 rs7931369 rs567411332 rs184532784 rs7931583 rs555937772 rs9651750 rs9651751 rs9651752 rs73530502
71278426 rs11828013 rs11828013
71278461 rs7931369 - rs7931369
71278482 rs567411332 - - rs567411332
71278519 rs184532784 - - - rs184532784
71278580 rs7931583 - 1.000 - - rs7931583
71278733 rs555937772 - - - - - rs555937772
71278792 rs9651750 - 1.000 - - 1.000 - rs9651750
71278828 rs9651751 - 1.000 - - 1.000 - 1.000 rs9651751
71278915 rs9651752 - 1.000 - - 1.000 - 1.000 1.000 rs9651752
71279052 rs73530502 - 0.116 - - 0.116 - 0.116 0.116 0.116 rs73530502
',na.strings='-', header=TRUE, stringsAsFactors = FALSE, fill=TRUE)
To turn this back into a lower-triangular matrix, you can then do:
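x[,1] <- NULL                     # drop the position column
snp_names <- x[,1]                # the SNP column holds the row labels
x <- sapply(x[,-1], as.numeric)   # convert the remaining columns to numeric
rownames(x) <- snp_names
x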
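which returns the matrix below. The SNP labels on the diagonal are not numbers, so as.numeric() turns them into NA (with an "NAs introduced by coercion" warning), leaving only the lower-triangle values: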
            rs11828013 rs7931369 rs567411332 rs184532784 rs7931583 rs555937772 rs9651750 rs9651751 rs9651752 rs73530502
rs11828013          NA        NA          NA          NA        NA          NA        NA        NA        NA         NA
rs7931369           NA        NA          NA          NA        NA          NA        NA        NA        NA         NA
rs567411332         NA        NA          NA          NA        NA          NA        NA        NA        NA         NA
rs184532784         NA        NA          NA          NA        NA          NA        NA        NA        NA         NA
rs7931583           NA     1.000          NA          NA        NA          NA        NA        NA        NA         NA
rs555937772         NA        NA          NA          NA        NA          NA        NA        NA        NA         NA
rs9651750           NA     1.000          NA          NA     1.000          NA        NA        NA        NA         NA
rs9651751           NA     1.000          NA          NA     1.000          NA     1.000        NA        NA         NA
rs9651752           NA     1.000          NA          NA     1.000          NA     1.000     1.000        NA         NA
rs73530502          NA     0.116          NA          NA     0.116          NA     0.116     0.116     0.116         NA
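If you want to confirm that the result really is lower-triangular, or later need the full symmetric matrix, something along these lines may help. This is only a sketch on a small stand-in matrix m; the labels s1 to s3 and the values are made up, and mirroring assumes your measure is symmetric between SNP pairs:

```r
# Small stand-in for the matrix x above (hypothetical labels and values)
labels <- c("s1", "s2", "s3")
m <- matrix(c(NA,    NA, NA,
              1.0,   NA, NA,
              0.116, 1.0, NA),
            nrow = 3, byrow = TRUE,
            dimnames = list(labels, labels))

# TRUE when the diagonal and the upper triangle hold no values
is_lower <- all(is.na(m[upper.tri(m, diag = TRUE)]))

# Copy the lower triangle into the upper triangle
# (assumption: the value for (i, j) equals the value for (j, i))
full <- m
full[upper.tri(full)] <- t(full)[upper.tri(full)]
full
```

The diagonal is left as NA here, since the input does not supply a value for it.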