MTALL-144 Kana Yura - A college student, treated like a sex doll, is defiled in broad daylight when a man invades through her window and inserts his penis while she sleeps.
My "100 lines" test, which I expected to be fairly quick, turned out to be way harder than expected. I figured there wouldn't be much dialog with a girl sleeping for basically the whole movie, but I should have figured the guy would be whispering the whole time, and whisper is not great at catching low-volume speech. I ended up having to cut 133 audio clips, boost their volume and feed them to whisper one by one to try and figure out what it missed the first time.
I used the Pro version of the whisperwithVAD colab with identical settings (the defaults) for both whisper versions, then used normal Python whisper with the large-v2 model on the audio clips I manually extracted and amplified (by 10-20 dB).
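For anyone wanting to script the boosting step instead of doing it by hand, here is a minimal sketch of what a 10-20 dB gain means in practice. It is not the tool I actually used, just an illustration: the function names (`db_to_gain`, `boost_samples`, `boost_wav`) are made up for this example, and it assumes the clips are 16-bit PCM WAV files.

```python
import array
import sys
import wave


def db_to_gain(db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def boost_samples(samples, db):
    """Scale 16-bit samples by `db` decibels, clipping to the int16 range."""
    gain = db_to_gain(db)
    boosted = array.array("h")
    for s in samples:
        v = int(round(s * gain))
        boosted.append(max(-32768, min(32767, v)))
    return boosted


def boost_wav(src_path, dst_path, db):
    """Read a 16-bit PCM WAV, amplify it, and write the result.

    Hypothetical helper: assumes 16-bit PCM input, nothing else is handled.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = src.readframes(params.nframes)

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    boosted = boost_samples(samples, db)
    if sys.byteorder == "big":
        boosted.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(boosted.tobytes())
```

A 20 dB boost multiplies the amplitude by 10, so anything already moderately loud in the clip will clip hard. That is fine for whispers, but it is a reason to boost only the short extracted clips rather than the whole audio track.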
The version fully translated by whisper is 109 lines.
The transcription-only (Japanese text) version from whisper is 168 lines (I expected it to contain duplicates and hallucinations, but nope, just a handful of sounds transcribed as speech and that's it).
The manually processed whisper transcription (aka me translating with AI and adding lines) ended up being 279 lines after removing the commented ones, so about triple what I expected, lol.
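If you want to check line counts like these on your own .srt files, a rough sketch is below. It simply counts timing lines (the ones containing `-->`), which is an assumption about how "lines" are counted here; `count_srt_cues` is a name made up for this example.

```python
import re

# Matches an SRT timing line such as "00:01:02,345 --> 00:01:04,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s+-->\s+\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def count_srt_cues(text):
    """Count subtitle cues in SRT text by counting its timing lines."""
    return sum(1 for line in text.splitlines() if TIMING.match(line.strip()))


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:02,500\nFirst line\n\n"
        "2\n00:00:03,000 --> 00:00:04,000\nSecond line\nstill second\n"
    )
    print(count_srt_cues(sample))
```

Counting timing lines rather than text lines means a cue that wraps onto two rows still counts once.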
What I ended up doing is translating line by line with Gemini (Google's AI), and the result was surprisingly good. The biggest problem by far was whisper either not picking up lines or transcribing them wrong. For many of them I had to sound the line out in romaji to the AI myself and ask it to find something close that made sense, which worked well most of the time.
There are a few lines that I had to guess at more than I would have liked and some that were translated a bit awkwardly, but overall I'm very pleased with the result. I consider it good enough to be worth the effort, even with my extremely limited Japanese knowledge.
The AI helps a lot with translating and speeds things up quite a bit compared to doing this fully manually with a low level of Japanese.
Translating the whole movie ended up taking roughly 36 hours (not counting the initial whisper transcription), plus another 4 hours to retime the whole subtitle file, do some light editing and quality-check it.
If anyone is curious about comparing the results, contains the full whisper translation(, the whisper transcription I used as the base( and a .ass that contains the full whisper translation displayed on top of the video and the final manual translation displayed at the bottom(MTALL-144_comparison.ass).
For those who don't care and just want the sub, skip to here and read below:
For the proper version, there's both an .srt and an .ass version in . The only difference is that the .ass contains some commented lines that I either decided not to display or couldn't figure out were dialog or gibberish, and it also makes use of the actor field to show who is saying which line (which doesn't matter at all for playback, it only helps keep track when editing).
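For the curious, the displayed/commented split and the actor field in an .ass file can be inspected with a few lines of Python. This is only a sketch: `summarize_ass` is a made-up name, and it assumes the usual `[Events]` field order (Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text), where the actor is stored in the Name field.

```python
from collections import Counter


def summarize_ass(text):
    """Return (displayed, commented, lines-per-actor) for ASS subtitle text.

    Assumes the standard Events field order, with the actor in field 5 (Name).
    """
    displayed = 0
    commented = 0
    actors = Counter()
    in_events = False

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Section header such as [Script Info] or [Events].
            in_events = line.lower() == "[events]"
            continue
        if not in_events:
            continue
        if line.startswith("Dialogue:"):
            displayed += 1
            # Split into at most 10 fields so commas in the text survive.
            fields = line[len("Dialogue:"):].split(",", 9)
            if len(fields) == 10:
                actors[fields[4].strip()] += 1
        elif line.startswith("Comment:"):
            commented += 1

    return displayed, commented, actors
```

Players skip `Comment:` events entirely, which is why the commented lines only show up when the file is opened in an editor.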