I am currently looking at the co-occurrence of various phenomena (gestures, intonation in speech) in time. The data has each variable in its own column, and a phenomenon appears as a repeated value across consecutive rows for as long as it co-occurs with the others, as follows:
Begin Time    End Time      g-phasing    apex  syllable  words     tones
00:00:02.000  00:00:04.266                     Zia       j'avais
00:00:04.266  00:00:05.390  Preparation        Zia       j'avais
00:00:05.390  00:00:05.519  Preparation        vE        j'avais
00:00:05.519  00:00:05.852  Preparation        vE        j'avais   H*
00:00:05.852  00:00:05.910  Preparation        de        des
00:00:05.910  00:00:05.970  Preparation        de        des
00:00:05.970  00:00:06.236  Preparation        de        des
00:00:06.236  00:00:06.276  Preparation        di        dizaines
00:00:06.276  00:00:06.650  Preparation        di        dizaines
00:00:06.650  00:00:06.795  Preparation        zEn       dizaines
00:00:06.795  00:00:06.835  stroke             zEn       dizaines
00:00:06.835  00:00:07.480  stroke             zEn       dizaines
00:00:07.480  00:00:07.630  stroke       apex  zEn       dizaines
00:00:07.630  00:00:07.857  stroke             zEn       dizaines  H*
00:00:07.857  00:00:08.080  stroke             zEn       dizaines
00:00:08.080  00:00:08.120  stroke             ddeux     de
00:00:08.120  00:00:08.226  Preparation        ddeux     de
00:00:08.226  00:00:08.290  Preparation        ddeux     de
00:00:08.290  00:00:08.900  Preparation        sy        sujets
00:00:08.900  00:00:12.396  Preparation        sy        sujets
00:00:12.396  00:00:12.410  stroke             sy        sujets
00:00:12.410  00:00:12.628  stroke             ZE        sujets
00:00:12.628  00:00:12.776  stroke       apex  ZE        sujets
00:00:12.776  00:00:12.924  stroke             ZE        sujets
00:00:12.924  00:00:12.990  stroke             ZE        sujets    H*
00:00:12.990  00:00:13.400  stroke             ZE        sujets
This dataset contains two strokes (one from 00:00:06.795 to 00:00:08.120, and a second one from 00:00:12.396 to 00:00:13.400).
Ideally, I would like to count the number of strokes in the dataset and determine how many overlap with a pitch-accented syllable (here, the "H*" value in the "tones" column, which corresponds to the syllables "zEn" and "ZE"), how many do not co-occur with a pitch-accented syllable, etc.
I'm not sure whether I should iterate over the rows and keep counters, make use of the begin and end times, or restructure the data. Any help would be greatly appreciated!
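
To make the goal concrete, here is a minimal sketch of one way the counting could be done without explicit row loops. It assumes the data is loaded into a pandas DataFrame with the column names shown above and with empty cells stored as missing values. It also assumes that a "stroke" is a maximal run of consecutive rows whose "g-phasing" value is "stroke", so two strokes that follow each other with no other phase in between would be merged into one. The variable names (`is_stroke`, `run_id`, `strokes`) are illustrative only.

```python
import pandas as pd

columns = ["Begin Time", "End Time", "g-phasing", "apex", "syllable", "words", "tones"]
rows = [
    ("00:00:02.000", "00:00:04.266", None, None, "Zia", "j'avais", None),
    ("00:00:04.266", "00:00:05.390", "Preparation", None, "Zia", "j'avais", None),
    ("00:00:05.390", "00:00:05.519", "Preparation", None, "vE", "j'avais", None),
    ("00:00:05.519", "00:00:05.852", "Preparation", None, "vE", "j'avais", "H*"),
    ("00:00:05.852", "00:00:05.910", "Preparation", None, "de", "des", None),
    ("00:00:05.910", "00:00:05.970", "Preparation", None, "de", "des", None),
    ("00:00:05.970", "00:00:06.236", "Preparation", None, "de", "des", None),
    ("00:00:06.236", "00:00:06.276", "Preparation", None, "di", "dizaines", None),
    ("00:00:06.276", "00:00:06.650", "Preparation", None, "di", "dizaines", None),
    ("00:00:06.650", "00:00:06.795", "Preparation", None, "zEn", "dizaines", None),
    ("00:00:06.795", "00:00:06.835", "stroke", None, "zEn", "dizaines", None),
    ("00:00:06.835", "00:00:07.480", "stroke", None, "zEn", "dizaines", None),
    ("00:00:07.480", "00:00:07.630", "stroke", "apex", "zEn", "dizaines", None),
    ("00:00:07.630", "00:00:07.857", "stroke", None, "zEn", "dizaines", "H*"),
    ("00:00:07.857", "00:00:08.080", "stroke", None, "zEn", "dizaines", None),
    ("00:00:08.080", "00:00:08.120", "stroke", None, "ddeux", "de", None),
    ("00:00:08.120", "00:00:08.226", "Preparation", None, "ddeux", "de", None),
    ("00:00:08.226", "00:00:08.290", "Preparation", None, "ddeux", "de", None),
    ("00:00:08.290", "00:00:08.900", "Preparation", None, "sy", "sujets", None),
    ("00:00:08.900", "00:00:12.396", "Preparation", None, "sy", "sujets", None),
    ("00:00:12.396", "00:00:12.410", "stroke", None, "sy", "sujets", None),
    ("00:00:12.410", "00:00:12.628", "stroke", None, "ZE", "sujets", None),
    ("00:00:12.628", "00:00:12.776", "stroke", "apex", "ZE", "sujets", None),
    ("00:00:12.776", "00:00:12.924", "stroke", None, "ZE", "sujets", None),
    ("00:00:12.924", "00:00:12.990", "stroke", None, "ZE", "sujets", "H*"),
    ("00:00:12.990", "00:00:13.400", "stroke", None, "ZE", "sujets", None),
]
df = pd.DataFrame(rows, columns=columns)

# True on rows that belong to a stroke (missing values compare as False).
is_stroke = df["g-phasing"].eq("stroke")

# The run id increases by one every time the stroke / non-stroke status flips,
# so each block of consecutive stroke rows shares a single id.
run_id = is_stroke.ne(is_stroke.shift()).cumsum()

# Collapse each stroke run into one row: its time span and whether any of its
# rows carries an "H*" tone (assumption: "H*" is the only pitch-accent label).
strokes = (
    df[is_stroke]
    .groupby(run_id[is_stroke])
    .agg(
        begin=("Begin Time", "first"),
        end=("End Time", "last"),
        accented=("tones", lambda s: s.eq("H*").any()),
    )
    .reset_index(drop=True)
)

n_strokes = len(strokes)
n_accented = int(strokes["accented"].sum())
n_unaccented = n_strokes - n_accented

print(strokes)
print(n_strokes, n_accented, n_unaccented)
```

On the excerpt above this yields two strokes, both overlapping an "H*" row. The "H*" at 00:00:05.519 falls inside a Preparation phase, so it is not attributed to any stroke. The same grouping could be extended with further aggregations (for example, whether a run contains an "apex" row) if other co-occurrences are of interest.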