JAV WHISPER+VAD Request Thread (Taking Requests HERE)

Goodsky

New Member
Aug 31, 2022
8
8
UPDATE DROPPING THE FOLLOWING SUBS SUNDAY

SDAB223 , IPX750, NGOD-171, RBD-403, MVG-047, RBD-838

Then I will release a new working list along with some personal subs


too bad I like your tastes
Yes... I chose it based on your intro. You should also consider collaborating with the free whisper translators @heavenazer, @taidai and @jaga, who do free Whisper and Capcut translations at

https://scanlover.com/d/21676-raw-subtitle-request-subtitle-english-japan-chinese-jav-subtitles/550


They welcome new contributors and requests... that's where I got the srt

Really cool dudes
 

avatarthe

Well-Known Member
Feb 1, 2008
184
282
I have a request, but it's really to help me with a problem I've been having with whisper. I use Whisper with VAD on Google Colab and at first it was working fine, but now more and more often my subtitles start to drift out of sync as the film progresses, until by the end of the film the subtitles are completely out of sync. I am extracting the audio from the video files using avidemux and uploading the aac file to Google Drive, then running it through the colab. Finally I use Deeplv4 to translate the srt to English.

My latest attempt is Ebod-993 with a blond Arai Rima!



Can you use the same source file and whisper it to see if you get better results, to help me figure out what's going wrong on my end?
The film is here:

and here are my drifty subtitles and the aac file...


folder.jpg
 

Attachments

  • EBOD993.dl.en.rar
    20.9 KB · Views: 89

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
Can you use the same source file and whisper it to see if you get better results, to help me figure out what's going wrong on my end?
The film is here:

I can't seek when playing that video so it's a sign that it is corrupted somewhere and probably is the source of your issue with it. Oops, I messed up copying the file apparently. (mpc-hc wouldn't load the video properly for some reason, probably some of the many audio files in the same folder interfering with it.)

I am still doing some tests with the colab but whisper is so inconsistent that it's hard to tell. I did download the 1080p version too and that one isn't broken so if you want to test with that audio yourself, here it is:

Edit: A comparison of your audio on the left (I get an identical file if I extract it myself with avidemux), the audio in an mka container extracted with mkvtoolnix in the middle, and the audio extracted by mp4box (different hash than avidemux) on the right.
The red bar on the right marks that there is a difference on the line, which is basically always. It's green, blue or purple if it finds an identical line from at least 2 of the files.
The one on the left is slightly out of sync (less than a sec) if we look at the "Can I lick you?" line on all 3 (it's lower in the middle) compared to the other 2. I assume you meant a bigger delay than that and I don't even know if the other 2 match the video, they could be worse.
Whisper_comp_EBOD-993.jpg

Edit2: 1080p "Can I lick you?" line from avidemux:
Code:
856
01:42:22,634 --> 01:42:24,634
Can I lick you?
It was slightly earlier than the left one at the very beginning.

Edit3: And from 1080p mp4box(translation for it changed):
Code:
861
01:42:23,650 --> 01:42:25,650
Can I lick your penis?

It does seem way off (30+ secs) if I watch the vid, but it's hard to tell. Gonna try to run it on default whisper on a test colab I just made.

Edit4: Default whisper 1080p mp4box:
Code:
1096
01:42:23,500 --> 01:42:24,500
Can I lick you?

So yeah, pretty much always in the same 1 sec range or so but whisper being whisper, it's rarely the same.
Doesn't seem like you're doing anything wrong, looks like it's a whisper issue to me, but I'll try to convert the audio and see if anything changes.
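To put a number on the "same 1 sec range" remark, the start times of that one line can be compared directly. The following is a minimal Python sketch, not part of the original tests; it only uses the three timing lines quoted in this post, and the run labels are informal names chosen here.

```python
import re

# Matches an SRT timing line such as "01:42:22,634 --> 01:42:24,634".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)

def start_ms(timing_line: str) -> int:
    """Return the start time of an SRT timing line in whole milliseconds."""
    match = TIMING.search(timing_line)
    if match is None:
        raise ValueError(f"not an SRT timing line: {timing_line!r}")
    h, m, s, ms = (int(group) for group in match.groups()[:4])
    return ((h * 60 + m) * 60 + s) * 1000 + ms

# Timing lines of the same subtitle in three of the runs quoted in this post.
runs = {
    "1080p avidemux": "01:42:22,634 --> 01:42:24,634",
    "1080p mp4box": "01:42:23,650 --> 01:42:25,650",
    "default whisper, 1080p mp4box": "01:42:23,500 --> 01:42:24,500",
}
starts = {name: start_ms(line) for name, line in runs.items()}
spread_ms = max(starts.values()) - min(starts.values())
print(f"spread between runs: {spread_ms} ms")  # prints: spread between runs: 1016 ms
```

The spread between these runs is about one second, which is small next to the 30+ second offset seen when watching the video.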

I also attached all the srt files from the tests so far.

Final Edit: The ffmpeg mono wav (the format whisper uses internally) had a very similar result to the first mp4box one (exact same for that one line I keep comparing) with the usual variance in whisper results, so yeah, I'm out of ideas to test. Nothing is wrong with the audio as far as I can tell.
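For anyone who wants to repeat the mono wav test, here is a rough Python sketch that builds an ffmpeg command for it. The exact command used for the test above is not given in the post, so the flags below are an assumption: they target 16 kHz, one channel, 16-bit PCM, which is what Whisper resamples audio to when loading it. The file names are placeholders.

```python
import shutil
import subprocess

def mono_wav_command(source: str, target: str) -> list[str]:
    """Build an ffmpeg command that writes 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-nostdin",           # never wait for keyboard input
        "-i", source,         # input video or audio file
        "-vn",                # drop any video stream
        "-ac", "1",           # downmix to one channel
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        target,
    ]

def convert(source: str, target: str) -> None:
    """Run the conversion; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(mono_wav_command(source, target), check=True)

# Placeholder file names, only to show the resulting command line.
print(" ".join(mono_wav_command("EBOD-993.mp4", "EBOD-993.wav")))
```

Calling `convert(...)` on a real file requires ffmpeg to be installed; the last line only prints the command so it can be inspected first.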
 

Attachments

  • EBOD-993_7-tests.zip
    101.5 KB · Views: 70
  • EBOD-993_ffmpeg_wav.zip
    16.5 KB · Views: 69

bathsheba666

New Member
Sep 18, 2023
8
1
I have a thing about the classic pfmw commanded cuckold series.
IPZ-248, IPZ-288, IPZ-409 - Emiri, Rio, Sayuri
then, if you have time:
IPZ-371, IPZ-400, IPZ-449, Amami, Aino, Mayu
also loving:
HZGD-039

Many thanks.
 

Not2srius

Well-Known Member
Jul 5, 2022
785
857
Older movie but KOP-46 is a favorite of mine.

f57ecce85c5b71a35eac3b47e29a2669.jpg
 

avatarthe

Well-Known Member
Feb 1, 2008
184
282
Thanks for the extensive evaluation. I looked at all of the different subtitle files you created and I see the same subtitle drift I was having. It's most noticeable near the end of the film, as it is an additive effect.

Look at the following screen grab: subtitle number 977, "hello", does not belong at 1:55:51.750 in this post-coital scene (even though the audiograph shows it matching "mushi mushi"!)...


Hello1.JPG


It belongs at 01:56:23, shown in the grab below as the guy answers the phone! (And the audiograph doesn't match "mushi mushi"!) The subs are now about 32 seconds out of sync!



hello2.JPG



I think the audiograph is showing that the audio and video are out of sync... any thoughts?
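The two timestamps given for subtitle 977 work out to an offset of 31.25 seconds. If the drift really grows steadily over the film (an assumption, nothing in the thread confirms it is linear), a straight-line remap through two anchor points would pull the subtitles back in. A minimal Python sketch follows; the first anchor (in sync at 0:00:00) is assumed, the second is taken from the post.

```python
def to_ms(stamp: str) -> int:
    """Convert "H:MM:SS", "H:MM:SS,mmm" or "H:MM:SS.mmm" to milliseconds."""
    clock, _, frac = stamp.replace(",", ".").partition(".")
    h, m, s = (int(part) for part in clock.split(":"))
    return ((h * 60 + m) * 60 + s) * 1000 + int((frac or "0").ljust(3, "0")[:3])

def to_stamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp "HH:MM:SS,mmm"."""
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def make_linear_fix(sub_a: int, real_a: int, sub_b: int, real_b: int):
    """Return a function mapping subtitle time to corrected time (ms),
    using the straight line through two (subtitle, actual) anchor pairs."""
    scale = (real_b - real_a) / (sub_b - sub_a)
    return lambda t: round(real_a + (t - sub_a) * scale)

# Anchor 1 is an assumption: the subs are taken to be in sync at 0:00:00.
# Anchor 2 comes from the post: subtitle 977 sits at 1:55:51.750
# but the line is actually spoken at 01:56:23.
late_sub, late_real = to_ms("1:55:51.750"), to_ms("01:56:23")
print("offset at anchor 2:", (late_real - late_sub) / 1000, "s")

fix = make_linear_fix(0, 0, late_sub, late_real)
print(to_stamp(fix(late_sub)))
```

If the drift instead jumps at particular points (for example around long silences), a single straight line will not fit and the subtitles would need to be corrected section by section.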
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
I think the audiograph is showing that the audio and video are out of sync... any thoughts?
One thing to note, that video player from inside subtitle edit is almost certainly not frame accurate so don't put too much faith into it for timing accuracy and the more you move around, the worse it'll get. Better check with an actual media player.

It was a problem for sure with aegisub but the SE player seems better with the different options and not simply being limited to directshow, but I still wouldn't rely on it. Pretty sure I saw the video get out of sync with good subs timings too in the little I used it.

With that said, I was checking with mpc-hc and it did seem to get around 30 sec out of sync near the end, so it's probably not an SE problem.

The 720p version you're using has variable framerate so that can potentially affect it, but the 1080p I got doesn't and I didn't see a big difference in sub timing between the two, so I doubt that's the issue either.
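One rough way to check a file for variable framerate from a script is to compare the two frame rates ffprobe reports for the video stream. This Python sketch is only a heuristic under stated assumptions: it treats a gap of more than 1% between `r_frame_rate` and `avg_frame_rate` as a hint of variable framerate, and that threshold is an arbitrary choice made here, not something from the thread.

```python
import json
import subprocess
import sys
from fractions import Fraction

def parse_rate(text: str) -> Fraction:
    """Turn an ffprobe rate such as "30000/1001" into a Fraction (0 if unknown)."""
    num, _, den = text.partition("/")
    denominator = int(den or 1)
    return Fraction(int(num), denominator) if denominator else Fraction(0)

def probe_rates(path: str) -> tuple[Fraction, Fraction]:
    """Ask ffprobe for the first video stream's r_frame_rate and avg_frame_rate."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate,avg_frame_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    stream = json.loads(out)["streams"][0]
    return parse_rate(stream["r_frame_rate"]), parse_rate(stream["avg_frame_rate"])

def looks_variable(r_rate: Fraction, avg_rate: Fraction,
                   tolerance: Fraction = Fraction(1, 100)) -> bool:
    """Heuristic: flag the file when the two rates differ by more than 1%."""
    if r_rate == 0 or avg_rate == 0:
        return False  # not enough information to say anything
    return abs(r_rate - avg_rate) / r_rate > tolerance

# Only probes a file when a path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    r_rate, avg_rate = probe_rates(sys.argv[1])
    print(f"r_frame_rate={float(r_rate):.3f} avg_frame_rate={float(avg_rate):.3f}")
    print("possibly variable framerate" if looks_variable(r_rate, avg_rate)
          else "rates agree")
```

A matching pair of rates does not prove the file is constant framerate, so this is a first check rather than a verdict.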

The 1080p I have is almost certainly a re-encode by someone since it uses x264, so that could mean a problem in the source, but then the audio would get out of sync too and it doesn't sound like that's the case, so that's not it either.

The only conclusion I can get to is that whisper is unreliable. I don't know why this would happen but maybe people more familiar with it would have a better idea. I'm knowledgeable on video, audio and subtitles but not so much with how whisper works since I only use it to test stuff, I never actually made a sub I watched with it.
 

Zhonda

Well-Known Member
Jun 19, 2022
410
300
I’d like to request URE-099 released this month. Usually I can tell what’s going on from the trailer (JAV is no stranger to recycling plot lines) but needing help with this one. Dude accepts the advances of the cute OL in the office. GF bursts out of the closet to catch them in the act. GF is mad but then GF starts filming them bang. Then the aftermath I don’t understand at all. Needing subs to get the full appreciation of the film. lol

URE-099
1695310925961.jpeg
 

ganggauge

Member
Jan 31, 2023
87
55

summerss

Member
Jun 8, 2009
43
11
Wow! Thanks!

Is there an FAQ on how to set up Whisper + Vad + Colab?

I've tried using whisper through the subtitle app and it's painfully slow and then has horrible glitches where a random text suddenly gets repeated for 10-20 minutes of the movie.
I've just started learning and I facepalmed when I noticed I forgot to change English to Japanese, thinking that was the option for translation. So I changed it to Japanese, then clicked the option to translate to English. Could that be the issue? Looking for an FAQ as well.
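The repeated-text glitch described above is at least easy to spot automatically once the subtitles exist. Below is a small Python sketch that finds runs of consecutive cues with identical text; the cue data is made up for illustration and the threshold of 5 repeats is an arbitrary assumption.

```python
from itertools import groupby

def find_repeats(cues, min_run=5):
    """Return (text, first_start, last_end, count) for each run of at least
    min_run consecutive cues that carry exactly the same text."""
    found = []
    for text, run in groupby(cues, key=lambda cue: cue[2].strip()):
        run = list(run)
        if len(run) >= min_run:
            found.append((text, run[0][0], run[-1][1], len(run)))
    return found

# Made-up cues as (start_seconds, end_seconds, text); not from a real file.
cues = (
    [(0.0, 2.0, "Hello")]
    + [(2.0 + 2 * i, 4.0 + 2 * i, "Thank you for watching") for i in range(6)]
    + [(14.0, 16.0, "Goodbye")]
)

for text, start, end, count in find_repeats(cues):
    print(f"{count}x {text!r} from {start:.1f}s to {end:.1f}s")
```

This only flags the stuck stretch; that part of the audio still has to be transcribed again or fixed by hand.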
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
What do you guys actually want to know about using Whisper + VAD?

So many people are looking for a FAQ and constantly asking questions about how to use it that I would consider making one if nobody else does but everything is so obvious to me that I'm not sure what people have issues with.
 

summerss

Member
Jun 8, 2009
43
11
What do you guys actually want to know about using Whisper + VAD?

So many people are looking for a FAQ and constantly asking questions about how to use it that I would consider making one if nobody else does but everything is so obvious to me that I'm not sure what people have issues with.
For me, I really just found out about this stuff two days ago and tried it out yesterday. Zero experience with it. I'm searching this forum, reddit, and youtube videos. When I see command line stuff, python installs, and talk about what model and parameters to use, my brain freezes.
I think what people need is how to do this through a traditional user interface, with a video tutorial if possible.

Reading through the threads, it looks like what people are using is SubtitleEdit.
But then I see the Google Colab and VAD thing, tokens, keys and I'm like, wut?

Basically I need a FAQ for newbs. Show us what to click in SubtitleEdit.
I think anyone can do the first steps: download SubtitleEdit (it's an .exe), Video tab, audio-to-text, input, click the file you want, pick the language you want to translate, click translate to English, click generate, save when done. I picked the Purfview Faster-Whisper engine because that was the default, and the Large model because I have a 3080 Ti and heard you need a powerful GPU for that. What exactly is the difference between the engines? No clue. It had Faster in it so I clicked it.
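The language and translate settings in those steps correspond to two separate Whisper options: the language spoken in the audio, and the task (transcribe or translate to English). The Python sketch below shows the same choices using the faster-whisper package. It is an illustration under assumptions, not what SubtitleEdit runs internally: the package must be installed, and the model name, device and compute type are example values.

```python
def srt_stamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp "HH:MM:SS,mmm"."""
    ms = round(seconds * 1000)
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render an iterable of (start, end, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_stamp(start)} --> {srt_stamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)

def translate_japanese_audio(path: str, model_name: str = "large-v2") -> str:
    """Transcribe Japanese speech and return English subtitles as SRT text."""
    # Imported here so the helpers above work without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name, device="cuda", compute_type="float16")
    segments, _info = model.transcribe(
        path,
        language="ja",     # language spoken in the audio, not the output language
        task="translate",  # "translate" outputs English, "transcribe" keeps Japanese
        vad_filter=True,   # skip silence using the package's built-in VAD
    )
    return to_srt((seg.start, seg.end, seg.text) for seg in segments)

print(to_srt([(0.0, 1.5, " hi ")]))
```

Setting the language to English on Japanese audio, as described earlier in the thread, tells the model the wrong source language, which is different from asking for English output.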
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
I see. It seems the biggest issue is that people get confused between the different versions available, even though it's all basically the same thing with some tweaks.

If you say whisper + VAD, people will assume you're talking about the google colab thing (or a local clone of it), since that's two separate pieces of software working together, not just whisper, and the easy UI for that is the colab webpage. You also get a pretty much exact tutorial on how to install it on linux if you just press show code on the relevant steps.
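To make "two separate pieces of software working together" concrete, here is a toy Python sketch of the general idea: a voice activity detector marks where speech is, the recognizer handles each stretch on its own, and the chunk-relative times are shifted back onto the full timeline. This is not the colab's actual code. Both the speech spans and the recognizer below are stand-ins invented for illustration.

```python
from typing import Callable, List, Tuple

Cue = Tuple[float, float, str]  # (start_s, end_s, text)
Span = Tuple[float, float]      # (start_s, end_s) of detected speech

def transcribe_with_vad(
    speech_spans: List[Span],
    transcribe_chunk: Callable[[Span], List[Cue]],
) -> List[Cue]:
    """Transcribe each speech span on its own and shift the chunk-relative
    times back onto the timeline of the full recording."""
    cues: List[Cue] = []
    for span in speech_spans:
        span_start, _span_end = span
        for start, end, text in transcribe_chunk(span):
            cues.append((span_start + start, span_start + end, text))
    return cues

# Stand-ins for the two programs: fixed VAD output and a fake recognizer
# that returns one cue with times relative to the start of its chunk.
spans = [(10.0, 14.0), (95.5, 99.0)]

def fake_whisper(span: Span) -> List[Cue]:
    return [(0.5, 2.0, f"speech found in {span[0]:.1f}-{span[1]:.1f}")]

result = transcribe_with_vad(spans, fake_whisper)
for cue in result:
    print(cue)
```

In a setup like this every cue inherits the start time of its speech span, so a wrong span offset would shift all the subtitles that come out of that span.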