Reading a .tps morphometrics file into R

Question

I am looking to read a .tps file into R.

An example file is now available at:

example file

The actual files I am trying to read into R have many more individuals/IDs (>1000).

The .tps file format is produced by TPSDIG.

http://life.bio.sunysb.edu/morph/

The file is an ANSI plain text file.

The file contains X and Y coordinates and specimen information as follows.

The main difficulty is that specimens vary in the number of attributes (e.g. some have 4 and some have 6 LM landmarks; some have 2 curves, others have none and therefore no associated points).

I have tried working with a for loop and read.table, but cannot find a way to account for the varying number of attributes.

Example of start of file

    LM=3
    1  1
    2  2
    3  3
    CURVES=2
    POINTS=2
    1 1
    2 2
    POINTS=2
    1 1
    2 2
    IMAGE=COMPLETE/FILE/PATH/IMAGE
    ID=1
    SCALE=1
    LM=3
    1  1
    2  2
    3  3
    CURVES=2
    ...

Example dummy code that works only if all specimens have an equal number of attributes.

i<-1
landmarks<-NULL
while(i < 4321){

  print(i)

  landmarks.temp<-read.table(file="filepath", sep=" ", header=F, skip=i, nrows=12, col.names=c("X", "Y"))
  i<-i+13
  landmarks.temp$ID<-read.table(file="filepath", sep=c(" "), header=F, skip=i, nrows=1, as.is=T)[1,1]
  i<-i+1
  landmarks.temp$scale<-read.table(file="filepath", sep=c(" "), header=F, skip=i, nrows=1, as.is=T)[1,1]
  i<-i+2

  landmarks<-rbind(landmarks, landmarks.temp)

  print(unique(landmarks.temp$ID))
}

I think you're going to want to use `scan` and/or `readLines` for finer control ... — Ben Bolker, Mar 15 '12 at 23:09
Thank you Prof. Bolker, however read.table seems to provide as much flexibility as 'scan' (for which it is a wrapper) or 'readLines'. I am starting to think I will need to read line by line (with either 'read.table', 'readLines' or 'scan') and have conditions for each possible value of that line and the previous one. I am hoping someone may have already gone through this legwork. — Etienne Low-Décarie, Mar 16 '12 at 12:42
If you provide more complete example data, someone will surely provide a readLines/regex based solution. — jbaums, Mar 16 '12 at 13:14
You say that CURVES can sometimes be 0. In that case, would there be any POINTS attributes at all (would POINTS=0 or would POINTS be missing)? — jbaums, Mar 16 '12 at 23:37
After mrdwab's answer, I beg Prof. Bolker's pardon, readLines was the key. — Etienne Low-Décarie, Mar 17 '12 at 03:22

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2012-03-17T19:01:53.457

I'm not exactly clear about what you are looking for in your output. I assumed a standard data frame with X, Y, ID, and Scale as the variables.

Try this function that I threw together and see if it gives you the type of output that you're looking for:

    read.tps = function(data) {
      a = readLines(data)
      # Anchor the patterns so that an IMAGE path containing "LM" or "ID"
      # is not mistaken for a field header.
      LM = grep("^LM=", a)
      ID.ind = grep("^ID=", a)
      # Assumes, as in the example file, that IMAGE is the line before ID
      # and SCALE is the line after it.
      images = basename(sub("^IMAGE=", "", a[ID.ind - 1]))
      ids = type.convert(sub("^ID=", "", a[ID.ind]), as.is=TRUE)
      scales = as.numeric(sub("^SCALE=", "", a[ID.ind + 1]))

      # Number of landmark rows that follow each LM= line
      nrows = as.numeric(sub("^LM=", "", a[LM]))
      l = length(LM)

      landmarks = vector("list", l)

      for (i in seq_len(l)) {
        # Specimens with LM=0 contribute no rows
        if (nrows[i] == 0) next
        # Parse the coordinate lines already held in memory
        xy = read.table(text=a[LM[i] + seq_len(nrows[i])], header=FALSE,
                        col.names=c("X", "Y"))
        landmarks[[i]] = data.frame(xy, IMAGE=images[i], ID=ids[i],
                                    Scale=scales[i])
      }
      do.call(rbind, landmarks)
    }

After you've loaded the function, you can use it by typing:

    read.tps("example.tps")

where "example.tps" is the name of your .tps file in your working directory.

If you want to assign your output to a new object, you can use the standard:

    landmarks <- read.tps("example.tps")
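
For files where specimens also carry curves, or where landmark counts differ between specimens, the same readLines and grep idea can be applied one specimen block at a time. The following is only a sketch under assumptions taken from the excerpt in the question: every specimen starts with an `LM=` line, curve points follow `POINTS=` lines, and `IMAGE`, `ID` and `SCALE` appear at most once per specimen. The name `read_tps_long`, its helper functions, and the second specimen in the usage example are made up for illustration.

```r
# Hypothetical helper: parse a .tps file specimen by specimen, keeping
# both landmarks and curve points in one long data frame.
# Assumes each specimen starts at an "LM=" line, as in the question's excerpt.
read_tps_long <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines)]                 # drop blank lines
  starts <- grep("^LM=", lines)                 # first line of each specimen
  ends <- c(starts[-1] - 1, length(lines))      # last line of each specimen

  # Value of a KEY=value line within one specimen block, or NA if absent
  field <- function(block, key) {
    hit <- grep(paste0("^", key, "="), block, value = TRUE)
    if (length(hit) == 0) NA_character_ else sub("^[^=]*=", "", hit[1])
  }

  # Parse the n coordinate lines that follow the header at position `at`;
  # returns NULL when n is 0 so that rbind() simply skips it
  coords <- function(block, at, n, type, curve) {
    if (n == 0) return(NULL)
    xy <- read.table(text = block[at + seq_len(n)], col.names = c("X", "Y"))
    data.frame(xy, type = type, curve = curve)
  }

  out <- vector("list", length(starts))
  for (s in seq_along(starts)) {
    block <- lines[starts[s]:ends[s]]
    n_lm <- as.integer(sub("^LM=", "", block[1]))
    parts <- list(coords(block, 1, n_lm, "LM", NA_integer_))

    # Each POINTS= header introduces the points of one curve
    pts <- grep("^POINTS=", block)
    for (k in seq_along(pts)) {
      n <- as.integer(sub("^POINTS=", "", block[pts[k]]))
      parts <- c(parts, list(coords(block, pts[k], n, "CURVE", k)))
    }

    spec <- do.call(rbind, parts)
    if (is.null(spec)) next                     # specimen without coordinates
    spec$ID <- field(block, "ID")
    spec$IMAGE <- basename(field(block, "IMAGE"))
    spec$SCALE <- as.numeric(field(block, "SCALE"))
    out[[s]] <- spec
  }
  do.call(rbind, out)
}

# Usage: the first specimen is the excerpt from the question; the second is
# made-up data with a different landmark count and no curves.
tps <- tempfile(fileext = ".tps")
writeLines(c(
  "LM=3", "1  1", "2  2", "3  3",
  "CURVES=2", "POINTS=2", "1 1", "2 2", "POINTS=2", "1 1", "2 2",
  "IMAGE=COMPLETE/FILE/PATH/IMAGE", "ID=1", "SCALE=1",
  "LM=2", "4 4", "5 5",
  "IMAGE=another/path/img2.jpg", "ID=2", "SCALE=0.5"
), tps)
landmarks <- read_tps_long(tps)
# 7 rows for specimen 1 (3 landmarks + 2 curves of 2 points), 2 for specimen 2
```

The `type` and `curve` columns make it possible to separate landmarks from curve points afterwards, for example with `subset(landmarks, type == "LM")`. Fields other than the ones listed above are ignored by this sketch.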

edited Mar 17 '12 at 19:01

answered Mar 16 '12 at 17:37

A5C1D2H2I1M1N2O1R2T1


Brilliant! readLines and grep were my missing key. This will unlock many future similar problems. – Etienne Low-Décarie Mar 17 '12 at 03:23
I don't know how many people would like this functionality, but perhaps you could publish it on github or somewhere similar. – Roman Luštrik Mar 17 '12 at 06:54
@EtienneLow-Décarie, you can take a look at the version of this function [I posted on GitHub](https://gist.github.com/2062329), in which I've added comments throughout so you can see exactly how I've gone about solving your challenge. – A5C1D2H2I1M1N2O1R2T1 Mar 17 '12 at 19:39
@mrdwab, you have more than solved my issue! Your script is mature enough to import most .tps files. In my edit, I get the file name of the image on which the .tps is based (as you did in your new script). I am trying to end up with a script that will import all .tps file fields. One of the difficulties in creating a generalized .tps import script is that there is no specification of the .tps file format, and I do not know how it varies or even which fields can be included. If you want your script advertised to .tps users, you could contact rohlf@life.bio.sunysb.edu. – Etienne Low-Décarie Mar 21 '12 at 12:05

score 0 · Answer 2 · answered Jul 20 '15 at 22:19

Perhaps worth mentioning that there is now an R package geomorph which has a function readland.tps() for this.

cengel

