Hi pal, I played a little more with DVDMM-038. I cut a short 2-minute clip from the original video, extracted the audio as an mp3 file, and ran it through Whisper with model_size set to medium (instead of the default, LargeV3) and vad_threshold set to 0.2 (instead of the default, 0.4). With those settings Whisper picked up the conversation starting at ~31 sec, versus 56 sec before. I pasted the "missing dialog" into the previous sub and renumbered it using SubtitleEdit. I think the traffic noise in the background is why Whisper, with default settings, failed to detect the conversation starting at ~31 seconds. I am posting a revised sub that you may enjoy better.

I think your timing is incorrect, pal. The dialogue starts at 00:00:31, but your sub starts at 00:00:56. Also, this release is 3.5 hrs long, not 4 hrs. Can you fix this?
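For anyone who would rather script the "paste in the missing dialog and renumber" step than do it by hand, here is a rough Python sketch. It is not what SubtitleEdit does internally, just one way to get the same result. The names (Cue, parse_srt, merge_and_renumber, offset_ms) are made up for illustration, and it assumes plain SRT files with standard HH:MM:SS,mmm timestamps. If the short clip was not cut from the very start of the video, offset_ms would need to be the clip's start time so the recovered lines land in the right place.

```python
import re
from dataclasses import dataclass

# Matches an SRT timestamp such as 00:00:31,000 (a dot separator is tolerated).
TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def to_ms(stamp: str) -> int:
    """Convert an SRT timestamp to milliseconds."""
    match = TIME_RE.search(stamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    h, m, s, ms = map(int, match.groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def to_stamp(ms: int) -> str:
    """Convert milliseconds back to an SRT timestamp."""
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def parse_srt(data: str) -> list[Cue]:
    """Parse SRT text into cues; the original cue numbers are discarded."""
    cues = []
    for block in re.split(r"\n\s*\n", data.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip malformed or empty blocks
        start, end = lines[1].split("-->")
        cues.append(Cue(to_ms(start), to_ms(end), "\n".join(lines[2:])))
    return cues


def merge_and_renumber(old: list[Cue], recovered: list[Cue], offset_ms: int = 0) -> str:
    """Shift the recovered cues by offset_ms, merge by start time, renumber from 1."""
    shifted = [Cue(c.start_ms + offset_ms, c.end_ms + offset_ms, c.text) for c in recovered]
    merged = sorted(old + shifted, key=lambda c: c.start_ms)
    blocks = [
        f"{i}\n{to_stamp(c.start_ms)} --> {to_stamp(c.end_ms)}\n{c.text}"
        for i, c in enumerate(merged, start=1)
    ]
    return "\n\n".join(blocks) + "\n"


# Toy example: the old sub starts at 00:00:56, the recovered dialog at 00:00:31.
old_sub = parse_srt("1\n00:00:56,000 --> 00:00:58,500\nLater line\n")
missing = parse_srt("1\n00:00:31,000 --> 00:00:33,000\nRecovered line\n")
print(merge_and_renumber(old_sub, missing))
```

The sketch does not check for overlapping cues, so if the new Whisper run also re-transcribed lines that were already in the old sub, those duplicates would need to be removed by hand.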