12

Is there any Java API for SRT subtitles ?

firas
  • 1,463
  • 4
  • 19
  • 42

4 Answers

7

SRT parsing can be done with regular expressions, which Java supports through java.util.regex.

The actual regexp is:

protected static final String nl = "\\\n";
protected static final String sp = "[ \\t]*";
Pattern.compile("(?s)(\\d+)" + sp + nl
        + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
        + "-->" + sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
        + "(X1:\\d.*?)??" + nl + "(.*?)" + nl + nl);

Groups 2, 3, 4, and 5 are the start time (hours, minutes, seconds, milliseconds); groups 6, 7, 8, and 9 are the finish time; group 11 is the subtitle text.
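As a minimal sketch of how these groups might be used, the following compiles the same pattern and converts the start and finish times to milliseconds. The class name `SrtRegexDemo`, the helper `toMillis`, and the sample text are assumptions made for illustration; they are not part of the code the pattern was taken from.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SrtRegexDemo {
    // Same building blocks as in the pattern above.
    static final String nl = "\\\n";
    static final String sp = "[ \\t]*";
    static final Pattern CUE = Pattern.compile("(?s)(\\d+)" + sp + nl
            + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
            + "-->" + sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
            + "(X1:\\d.*?)??" + nl + "(.*?)" + nl + nl);

    // Hypothetical helper: combines four consecutive groups
    // (hours, minutes, seconds, milliseconds) into milliseconds.
    static long toMillis(Matcher m, int firstGroup) {
        long hours = Long.parseLong(m.group(firstGroup));
        long minutes = Long.parseLong(m.group(firstGroup + 1));
        long seconds = Long.parseLong(m.group(firstGroup + 2));
        long millis = Long.parseLong(m.group(firstGroup + 3));
        return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
    }

    public static void main(String[] args) {
        // Made-up sample input; every cue ends with a blank line.
        String srt = "1\n00:00:01,000 --> 00:00:04,500\nHello\nworld\n\n"
                + "2\n00:00:05,000 --> 00:00:06,250\nBye\n\n";
        Matcher m = CUE.matcher(srt);
        while (m.find()) {
            long start = toMillis(m, 2);   // groups 2..5
            long finish = toMillis(m, 6);  // groups 6..9
            String text = m.group(11);     // subtitle text
            System.out.println(m.group(1) + ": " + start + " -> " + finish + " ms, text=" + text);
        }
    }
}
```

Note that the pattern only matches a cue that is followed by a blank line, so the last cue in a file needs a trailing empty line to be picked up.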

Panayotis
  • 1,792
  • 23
  • 32
  • 4
    Thanks! Can you explain what "(X1:\\d.*?)??" is for? – kolobok Sep 22 '12 at 07:49
  • 2
    Also, there is a mistake: "\\n" instead of "\\\n". It is even better to replace this with "\\r?\\n" (so that it works on both Windows and Unix), as explained [here](http://stackoverflow.com/questions/454908/split-java-string-by-new-line) – kolobok Sep 22 '12 at 09:44
  • 1
    Where did you see the "\\n" pattern? Now, about the \r character, I'd say that you are right, but for a reason I can not explain right now, even if there are \r characters in the file, this pattern successfully matches and produces correct results! – Panayotis Oct 07 '12 at 15:10
6

I have written Java code that parses and reads several subtitle formats, among them the popular SRT. The code is licensed under the MIT open source license (free to use for anything) and is available in my Git repository:

https://github.com/JDaren/subtitleConverter

You probably only need the basic classes and the SRTFormat class. With those you can read SRT files from an InputStream, or get the full file as a String[] once you have finished editing.

If you find this useful, or if I can help you with anything, please contact me.

PS: other supported formats, either partially or fully, are .ASS, .SSA, .STL, .SCC and .XML (W3C's TTAF-DFXP, also known as TTML 1.0).

EDIT:

You can see the code at work at www.subtitleconverter.net

Daren
  • 3,337
  • 4
  • 21
  • 35
  • Needs to be improved. Empty class (e.g. `Region`) and catching `NullPointerException`s don't smell well. – kolobok Jun 27 '13 at 10:44
  • @akapelko Region is a future feature (that's why it's empty) for other formats (to place the subtitles somewhere on the screen); SRT does not offer layout of any kind. NullPointerException could arise in unusual cases; so far most have been corrected to check for null first or to initialise the variable with size 0. But you are right, some refactoring would be nice... Still, for SRT it works very well. – Daren Jun 27 '13 at 11:32
  • 1
    Yay very good stuff :)))))) – Hendy Irawan Mar 02 '15 at 20:09
6

Actually the modified regex from @Panayotis that supports multi-line subtitle text is like this:

protected static final String nl = "\\n";
protected static final String sp = "[ \\t]*";
Pattern.compile(
                    "(\\d+)" + sp + nl
                    + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
                    + "-->" + sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
                    + "(X1:\\d.*?)??" + nl + "([^\\|]*?)" + nl + nl);

In ([^\\|]*?), replace "|" with any character that is unlikely to appear in subtitle text. I currently use a negated character class on the "|" character.
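A rough sketch of the trade-off, assuming made-up sample cues: the negated class lets the text group span line breaks without the (?s) flag, but a cue whose text contains the excluded character no longer matches. The class name `MultiLineCueDemo` and the helper `firstCueText` are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultiLineCueDemo {
    static final String nl = "\\n";
    static final String sp = "[ \\t]*";
    static final Pattern CUE = Pattern.compile(
            "(\\d+)" + sp + nl
            + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
            + "-->" + sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
            + "(X1:\\d.*?)??" + nl + "([^\\|]*?)" + nl + nl);

    // Hypothetical helper: text of the first matching cue, or null if nothing matches.
    static String firstCueText(String srt) {
        Matcher m = CUE.matcher(srt);
        return m.find() ? m.group(11) : null;
    }

    public static void main(String[] args) {
        // Two-line subtitle text: matched, because [^|] also accepts '\n'.
        String twoLines = "1\n00:00:01,000 --> 00:00:04,500\nHello\nworld\n\n";
        // Text containing the excluded character: the cue is not matched at all.
        String withPipe = "1\n00:00:01,000 --> 00:00:04,500\nyes|no\n\n";

        System.out.println(firstCueText(twoLines));
        System.out.println(firstCueText(withPipe));
    }
}
```

The second call returns null, which is why the excluded character should be one that is very unlikely to occur in real subtitle text.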

Oomph Fortuity
  • 5,710
  • 10
  • 44
  • 89
privatejava
  • 703
  • 1
  • 9
  • 20
3

There is another basic (and open source) API that can handle SRT and ASS subtitles here

Parsing SRT :

File file = Paths.get("subtitle.srt").toFile();
SRTSub subtitle = new SRTParser().parse(file);
sofm
  • 31
  • 1