What format is the source in (VHS, DVD, stills)? It's possible that the time stamp is encoded in the data.
Update with more detail
While I completely understand the desire to have an automated end-to-end process (especially if you're selling this app as opposed to creating an in-house tool), it'd be more efficient to have someone manually enter the start time for each video (even if there are hundreds of them) than to spend weeks of coding getting this to work automatically.
What I'd do (failing a simple, very-fast-to-implement, super-accurate OCR solution which I don't believe exists):
Create a couple of database tables, like
video            video_group
-------------    -------------
id               id
filename         title
start_time       date_created
group_id         date_modified
date_created     date_deleted
date_modified
date_deleted
video_group might contain
id | title
-----------
1 | Unassigned
2 | 711 Mockingbird @ 75
3 | Kroger storage room
video would be prepopulated with the video filenames by an import script. Initially assign everything a group_id of 1 (Unassigned).
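A minimal sketch of those two tables and the import script, assuming SQLite and Python purely for illustration (any database would do). The column types, the `VIDEO_EXTENSIONS` set, and the function names are my assumptions, not anything your setup dictates:

```python
import sqlite3
from pathlib import Path

# Column names follow the tables above; the types are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS video_group (
    id            INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    date_created  TEXT DEFAULT CURRENT_TIMESTAMP,
    date_modified TEXT,
    date_deleted  TEXT
);
CREATE TABLE IF NOT EXISTS video (
    id            INTEGER PRIMARY KEY,
    filename      TEXT NOT NULL UNIQUE,
    start_time    TEXT,  -- stays NULL until someone types it in
    group_id      INTEGER NOT NULL DEFAULT 1 REFERENCES video_group(id),
    date_created  TEXT DEFAULT CURRENT_TIMESTAMP,
    date_modified TEXT,
    date_deleted  TEXT
);
INSERT OR IGNORE INTO video_group (id, title) VALUES (1, 'Unassigned');
"""

# Hypothetical list of extensions; adjust to whatever your videos use.
VIDEO_EXTENSIONS = {".avi", ".mpg", ".mp4"}


def create_schema(conn):
    """Create both tables and the 'Unassigned' group if they are missing."""
    conn.executescript(SCHEMA)


def import_videos(conn, folder):
    """Add one row per video file in `folder`, all in group 1 (Unassigned).

    Files already imported are skipped. Returns the number of new rows.
    """
    names = sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )
    before = conn.execute("SELECT COUNT(*) FROM video").fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO video (filename, group_id) VALUES (?, 1)",
        [(name,) for name in names],
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM video").fetchone()[0]
    return after - before
```

Because of the UNIQUE constraint on filename, the import can be re-run safely whenever new videos arrive.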
Create a simple WinForms or WPF app (pardon my ASCII art):
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Group: [=========]\/ [New group...] |
| |
| File: [=========]\/ |
| |
| Preview |
| |--------------------------------------| [Next Video] |
| | (first frame of selected video here) | [Prev] |
| | | |
| | | |
| | | |
| |--------------------------------------| |
| Start Time |
| [(enter start time value here as displayed on preview frame)] |
| |
| [Update] |
-------------------------------------------------------------------
Anybody could do the data entry (a secretary, a janitor, even a recent CS graduate). All they have to do is read the time from the preview frame, type it into the Start Time field, and click "Update" or "Next Video" to update the database and move on to the next one. Keep the Group selection from one video to the next unless the user changes it.
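What the "Update" button does behind the scenes could be as small as the sketch below. It assumes a `video` table with the columns listed earlier, stored in SQLite, and it assumes the time stamp on the frame looks like `2009-03-14 08:15:00`; `TIME_FORMAT` and `save_start_time` are names I made up, so change them to match what your overlay actually shows:

```python
import sqlite3
from datetime import datetime

# Assumption: the overlay on the preview frame uses this layout.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def save_start_time(conn, video_id, typed_value, group_id):
    """Store the typed start time and the currently selected group.

    Raises ValueError if the text does not match TIME_FORMAT, so typos
    are caught before anything is written. Returns the id of the next
    video that still has no start time, or None when all are done.
    """
    start = datetime.strptime(typed_value.strip(), TIME_FORMAT)
    conn.execute(
        "UPDATE video SET start_time = ?, group_id = ?, "
        "date_modified = CURRENT_TIMESTAMP WHERE id = ?",
        (start.isoformat(sep=" "), group_id, video_id),
    )
    conn.commit()
    row = conn.execute(
        "SELECT id FROM video WHERE start_time IS NULL ORDER BY id LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

The caller passes the same `group_id` on every call until the user picks a different group, which gives you the "keep the Group selection" behaviour for free.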
Assuming it takes the user 30 seconds to read, type, and click next, they could complete 100-150 videos in an hour (call it 75 for a more realistic estimate). And an intern's time is a lot cheaper than a developer's.
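To put numbers on that estimate, here is a quick back-of-the-envelope calculation. The 300-video figure is just a hypothetical stand-in for "hundreds", and the function names are mine:

```python
def rate_per_hour(seconds_per_video):
    """Videos one person can enter per hour at a steady pace."""
    return 3600 / seconds_per_video


def hours_needed(video_count, per_hour=75):
    """Hours of data entry at the given (conservative) hourly rate."""
    return video_count / per_hour


print(rate_per_hour(30))   # 120.0 videos per hour at 30 seconds each
print(hours_needed(300))   # 4.0 hours for 300 videos at 75 per hour
```

So even at the pessimistic rate, a few hundred videos is an afternoon of work for one person.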
If you really have "hundreds" of videos, it'll still be faster to do it this way than to screw around with OCR. Even if the OCR works for the most part, you'll most likely need to have someone manually inspect everything to see if the results are correct, which raises the question: why bother with the OCR?