Good evening.

Straight to the point: I need a script that extracts the RDF/JSON structure for a specific time interval from a WebVTT file. Does such a thing exist?

RDF/JSON is a file structure specified by Talis that looks like this:

{ "S" : { "P" : [ O ] } }

In the WebVTT file, this structure appears inside each cue, like this:

0
00:00:00.000 --> 00:00:46.119
{ "S" : { "P" : [ O ] } }

1
00:00:48.000 --> 00:00:50.211
{ "S" : { "P" : [ O ] } }

...

I would use such a file while viewing video: when I click on some part of the timeline, a script fetches the corresponding RDF/JSON code (I can already do this, since a WebVTT parser exists), and then a parser extracts the requested information held in the object of the RDF/JSON structure.
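
The lookup described above could be sketched roughly as follows. This is only a minimal illustration, not a spec-compliant WebVTT parser: the function names `toSeconds`, `parseCues` and `dataAt` are hypothetical, each cue payload is assumed to be valid JSON with no blank lines inside it, and timestamps are accepted with either `.` or `,` before the milliseconds.

```javascript
// Convert "HH:MM:SS.mmm" or "MM:SS.mmm" (comma also accepted) to seconds.
function toSeconds(stamp) {
  return stamp
    .trim()
    .replace(",", ".")
    .split(":")
    .reduce((total, part) => total * 60 + Number(part), 0);
}

// Split the file text into { start, end, data } records.
// Blocks without a timing line (header, "...", stray text) are skipped.
function parseCues(text) {
  const cues = [];
  for (const block of text.replace(/\r\n?/g, "\n").split(/\n{2,}/)) {
    const lines = block.split("\n");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue;
    const [start, rest] = lines[timingIndex].split("-->");
    const end = rest.trim().split(/\s+/)[0]; // ignore cue settings after the end time
    const payload = lines.slice(timingIndex + 1).join("\n");
    cues.push({
      start: toSeconds(start),
      end: toSeconds(end),
      data: JSON.parse(payload), // throws if the payload is not valid JSON
    });
  }
  return cues;
}

// Return the payload of every cue covering `time` (in seconds),
// so overlapping cues are all reported.
function dataAt(cues, time) {
  return cues
    .filter((cue) => cue.start <= time && time < cue.end)
    .map((cue) => cue.data);
}
```

Calling `dataAt(parseCues(fileText), clickedTime)` would then give an array with one parsed RDF/JSON object per cue that is active at the clicked position.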

I was really happy when I saw that jQuery implements getJSON, but it only works for "normal" JSON files.

The best thing would probably be to just write the script myself, but my time and knowledge are very limited, so I would like to hear any suggestion or solution that anybody might know of.

yannick
3mpetri
  • Is it possible that RDF/JSON is just a complex JSON structure? Does O behave as an array of data for the P vector, which sits under the S object? – 3mpetri Aug 02 '11 at 09:43
  • I'm finding it really weird that WebVTT, designed in 2011 as far as I can tell, is not already _fully_ JSON... – jwl Feb 24 '12 at 16:34

2 Answers

I've written a WebVTT parser for my <track>/HTML5 video captioning polyfill Captionator.

Feel free to pick apart the source of the development branch (which has the best WebVTT compliance, so it's probably better to look at that rather than the stable branch.)

The parser code starts here: https://github.com/cgiffard/Captionator/blob/captioncrunch/js/captionator.js#L1686

Ultimately though, what you're describing seems to roughly match the intended use case for the metadata track type (as described in the WHATWG's TimedTextTrack spec). You can use Captionator to provide support for it. (I'd love to recommend another library as well, but I'm not aware of anything else that doesn't come bundled with an entire video player, or that implements the TimedTextTrack JS API you'll need.) The TextTrack.oncuechange event and the TextTrack.activeCues list let you listen for changes to cues when the user seeks within the video timeline. You can then get the text of each cue (less the cue metadata and header) and parse it as JSON. Just set up a metadata track like the one below:

<video id="video" src="myvideo.webm" poster="poster.jpg" width="512" height="288">
    <track kind="metadata" src="meta.webvtt" type="text/webvtt" srclang="en" label="Metadata Track" default />
</video>

Then include the Captionator library, initialise it as per the documentation, select your track, and set up an event handler. You can access the text of an individual cue like so:

var cueText = document.getElementById("video").tracks[0].activeCues[0].getCueAsSource();

Then just:

var RDFData = JSON.parse(cueText);
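
Since several cues can be active at once, and `activeCues` may be empty after a seek, it may be safer to loop over the whole list rather than read index 0. The following is a minimal sketch under assumptions: `parseActiveCues` is a hypothetical helper name, and each cue is assumed to expose `getCueAsSource()` returning the cue text, as in the line above.

```javascript
// Parse the JSON payload of every currently active cue.
// `activeCues` is treated as array-like; each cue is assumed to provide
// getCueAsSource(), as used in the answer above.
function parseActiveCues(activeCues) {
  const results = [];
  for (const cue of Array.from(activeCues || [])) {
    try {
      results.push(JSON.parse(cue.getCueAsSource()));
    } catch (error) {
      // Skip cues whose text is not valid JSON instead of aborting the handler.
    }
  }
  return results;
}
```

From the cue change handler you could then call `parseActiveCues(track.activeCues)`, where `track` is the metadata track you selected, and receive one object per overlapping cue.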

Good luck :)

Christopher
  • How are you handling the situation where several cues overlap? For example, one runs from 00:00 to 01:30 and another from 00:00 to 00:50, and you want access to the JSON values of both. I also don't know how to dynamically assign the **subject** and **predicate** names to local variables when fetching a value from a specific JSON file (for example `var a = ''; //how to get the rdf/json subject? var b = ''; //how to get the rdf/json predicate? $('#meta').append(cueObject[a][b][0].value);`). I'm making a serious mess down here :D - http://b-webdesign.com/multilab/Test05/ – 3mpetri Aug 05 '11 at 13:56
  • The result of parsing is a list of cues, ordered by time. So you should get it out of the TextTrackList. – Silvia Aug 11 '11 at 04:27
  • @Christopher @Silvia I'm having trouble getting an example of a metadata `<track>` to work. Could either of you help me out? I [posted a question about it here](http://stackoverflow.com/questions/9370260/reading-metadata-from-the-track-of-an-html5-video-using-captionator). Thanks! – Steph Feb 21 '12 at 00:13

It seems that RDF/JSON is in fact just a complex, nested JSON structure with arrays, so the getJSON function will successfully parse the data once it has been fetched from the WebVTT timed structure.
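
Because the structure is ordinary nested JSON, the subject and predicate names do not need to be known in advance; they can be read off as object keys. Here is a rough sketch, where `listTriples` is a hypothetical helper name and the input is assumed to follow the `{ "S" : { "P" : [ O ] } }` shape shown in the question:

```javascript
// Flatten a parsed RDF/JSON object into a list of
// { subject, predicate, object } records by walking its keys.
function listTriples(rdf) {
  const triples = [];
  for (const [subject, predicates] of Object.entries(rdf)) {
    for (const [predicate, objects] of Object.entries(predicates)) {
      for (const object of objects) {
        triples.push({ subject, predicate, object });
      }
    }
  }
  return triples;
}
```

Each returned record carries the subject and predicate as plain strings, so something like `triple.object.value` can be used without hard-coding either name.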

3mpetri