The transcriptions of the COSINE language corpus look as follows:
File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0
xmax = 3931.56874994773
tiers? <exists>
size = 8
item []:
item [1]:
class = "IntervalTier"
name = "Phrases"
xmin = 0
xmax = 3931.56874994773
intervals: size = 1938
intervals [1]:
xmin = 0
xmax = 3.59246613841739
text = "Good morning"
intervals [2]:
xmin = 3.59246613841739
xmax = 3.77632771424237
text = "the dog likes me"
intervals [3]:
xmin = 3.77632771424237
xmax = 8.15464058223137
text = "fish swim"
intervals [4]:
xmin = 8.15464058223137
xmax = 8.53678424963039
text = "Sure."
intervals [5]:
xmin = 8.53678424963039
xmax = 9.54622035219737
text = "Just keep swimming"
The files are in .TextGrid format. How can I extract the variables xmin, xmax and text for each of the intervals?
EDIT:
A .TextGrid file can be treated as a normal text file and read line by line, which was my solution to the problem. It would still be interesting to know whether there is a dedicated way to extract information from this type of file. Thank you for the responses.
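For anyone who finds this later, here is a minimal sketch of the line-by-line approach, using only the Python standard library. The function name `parse_intervals` and the two regular expressions are my own choices, not part of any TextGrid tooling. It assumes the layout shown above: every interval starts with an `intervals [n]:` line, followed by `xmin`, `xmax` and `text` on separate lines, with each `text` value fitting on a single line.

```python
import re

# Matches an interval header such as "intervals [3]:" but not
# the tier-level line "intervals: size = 1938".
INTERVAL_RE = re.compile(r'^\s*intervals\s*\[\d+\]\s*:?\s*$')
# Matches a field line such as: xmin = 3.59 or text = "fish swim"
FIELD_RE = re.compile(r'^\s*(xmin|xmax|text)\s*=\s*(.*?)\s*$')


def parse_intervals(lines):
    """Return a list of (xmin, xmax, text) tuples, one per interval."""
    intervals = []
    current = None  # fields of the interval being read, or None outside one
    for line in lines:
        if INTERVAL_RE.match(line):
            current = {}  # a new interval begins
            continue
        if current is None:
            continue  # skip the file header and tier-level xmin/xmax
        match = FIELD_RE.match(line)
        if not match:
            current = None  # any other line ends the interval
            continue
        key, value = match.groups()
        if key == "text":
            # Assumption: the value is wrapped in double quotes and
            # embedded quotes are doubled ("").
            current[key] = value[1:-1].replace('""', '"')
        else:
            current[key] = float(value)
        if len(current) == 3:  # xmin, xmax and text all collected
            intervals.append(
                (current["xmin"], current["xmax"], current["text"])
            )
            current = None
    return intervals


# Hypothetical usage; the file name is a placeholder:
# with open("transcript.TextGrid", encoding="utf-8") as handle:
#     for xmin, xmax, text in parse_intervals(handle):
#         print(xmin, xmax, text)
```

Resetting `current` after each completed interval means the tier-level `xmin` and `xmax` lines of the next tier are never mistaken for interval fields. The `utf-8` encoding is a guess: some TextGrid files are saved as UTF-16, so it may need adjusting.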