Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,778
5,351
Could somebody advise the best way to rip audio from vids? I think it was mentioned further back in the thread, but I'm having trouble finding it.
Demux it. Google how to do it for your specific file extension, or use MKVToolNix to put just the audio into an .mka file.
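A minimal sketch of the MKVToolNix route, assuming `mkvmerge` is installed and on your PATH. The function names and the `movie.mp4` file name are made up for illustration; `--no-video` and `--no-subtitles` are mkvmerge options that drop those track types so only the audio is copied (no re-encoding).

```python
import subprocess
from pathlib import Path


def build_mka_command(video_path: str) -> list[str]:
    """Build an mkvmerge command that keeps only the audio tracks."""
    src = Path(video_path)
    out = src.with_suffix(".mka")  # same name, .mka extension
    # --no-video and --no-subtitles drop those tracks, leaving the audio.
    return ["mkvmerge", "-o", str(out), "--no-video", "--no-subtitles", str(src)]


def extract_audio(video_path: str) -> None:
    """Run mkvmerge on the file (hypothetical helper)."""
    result = subprocess.run(build_mka_command(video_path))
    # mkvmerge exits with 0 on success, 1 on warnings and 2 on errors.
    if result.returncode >= 2:
        raise RuntimeError(f"mkvmerge failed for {video_path}")
```

Calling `extract_audio("movie.mp4")` should leave a `movie.mka` next to the video, provided mkvmerge can read that container.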
 

Electromog

Akiba Citizen
Dec 7, 2009
4,668
2,868
Is VAD only for the online version or is there something like it for when you run whisper on your own computer?
 

SamKook

You can see all the code they use to run it, what they install, etc., so if you reproduce that, you can use it on your own PC too.

It's all for Linux though, so you'd have to convert it to Windows (I'm assuming).
 

porgate55555

Active Member
Jul 24, 2021
55
165
Is anyone using a script to auto-translate SRT files with DeepL (free version)? Mine broke and I can't seem to fix it, so I'm looking for an alternative or for someone more technical than me who knows how to fix it.
 

porgate55555

If anyone is interested, I have a large number of Whisper-generated subtitles. They are NOT cleaned, because I am only interested in understanding the story, not in having perfect subs. Duplicates are removed as well as possible and timings are slightly fixed.
 

Attachments

  • Whisper_03.02.23.zip
    4.1 MB · Views: 1,012

SamKook

Is anyone using a script to auto-translate SRT files with DeepL (free version)? Mine broke and I can't seem to fix it, so I'm looking for an alternative or for someone more technical than me who knows how to fix it.

Post your script and I can see if it's easy to fix or not.
 

porgate55555

Post your script and I can see if it's easy to fix or not.
Thanks a lot! Everything seems to be working, but DeepL changed the input CSS selector. I fixed that, but then for some reason it broke the merge of the chunks into one file. Every SRT that needs only one chunk works, but as soon as more than one chunk is required, it fails.
 

noirzmonster

Member
Dec 27, 2021
20
34
Use the online Whisper+VAD version (it doesn't use your computer's resources). Use MP3 files, though I recommend M4A. If you have an audio enhancement program, use it to raise and adjust the volume (I use Movavi). From page 218 onwards you can find some discussions of using Whisper (tutorial); see my posts, where I use images to guide you. Good luck.

Some (complete) movies take about 15 minutes to 1 hour, but the result is much better than the other options that already exist (that's my opinion, I'm sure someone will contradict me). You can use the program up to 6 times a day, sometimes a little less (the free version has a limit).
Thanks for explaining how to do it. I tried to translate one. Although Whisper's translation is much better than other auto translators, there were some obviously wrong translations, which I tried to clean up as much as I could. There were a few 15-20 second intervals that didn't have any translation; maybe the sound wasn't loud enough for Whisper to pick up. Do you have any suggestions on that? Anyway, I tried to fill those intervals as best I could. Here is my first attempt at translating a JAV. I would appreciate feedback.

RCT-925 Studio ROCKET The Cum Swallowing Game Try To Guess Which Shot Of Cum Belongs To Your Boyfriend!


1rct925pl.jpg
 

Attachments

  • RCT-925-EN.zip
    26.9 KB · Views: 239

Makkdom

Well-Known Member
Mar 4, 2019
157
388
If anyone is interested, I have a large number of Whisper-generated subtitles. They are NOT cleaned, because I am only interested in understanding the story, not in having perfect subs. Duplicates are removed as well as possible and timings are slightly fixed.
Wow, that is indeed a large number of files. Thanks!

Edited to add: I admire your taste in porn actresses.
 

soloporhoy666

Active Member
Nov 29, 2021
118
124
Thanks for explaining how to do it. I tried to translate one. Although Whisper's translation is much better than other auto translators, there were some obviously wrong translations, which I tried to clean up as much as I could. There were a few 15-20 second intervals that didn't have any translation; maybe the sound wasn't loud enough for Whisper to pick up. Do you have any suggestions on that? Anyway, I tried to fill those intervals as best I could. Here is my first attempt at translating a JAV. I would appreciate feedback.

RCT-925 Studio ROCKET The Cum Swallowing Game Try To Guess Which Shot Of Cum Belongs To Your Boyfriend!


1rct925pl.jpg
This usually happens. When I'm really interested in knowing what happens in a movie scene, I run the movie through Whisper+VAD again with different VAD values (the default is 0.4); I've tried 0.3, 0.5 and 0.6. The text lines are usually the same, but it detects some lines that were not detected before. I then open the .srt file in Subtitle Edit and add those lines to the .srt file I consider the most complete (I only do this with movies that are really worth the effort, ones I like a lot).
The other option is to increase the volume of the file and balance the audio (there are programs that do this automatically).
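The manual step of copying newly detected lines from one run into the most complete .srt could be sketched in Python roughly like this. It is only an illustration, not something from Subtitle Edit or the Colab notebook: all names (`Cue`, `parse_srt`, `merge_missing`, `to_srt`) are made up, and it assumes a cue from the second run counts as "new" when its time range does not overlap any cue in the base file.

```python
import re
from dataclasses import dataclass

TIME_RE = re.compile(
    r"(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*(\d+):(\d+):(\d+)[,.](\d+)"
)


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into cues; blocks without a timestamp line are skipped."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIME_RE.search(line)
            if match:
                g = match.groups()
                text = "\n".join(lines[i + 1:]).strip()
                cues.append(Cue(_to_ms(*g[:4]), _to_ms(*g[4:]), text))
                break
    return cues


def merge_missing(base: list[Cue], extra: list[Cue]) -> list[Cue]:
    """Add cues from `extra` that do not overlap any cue in `base`."""
    merged = list(base)
    for cue in extra:
        overlaps = any(
            cue.start_ms < b.end_ms and b.start_ms < cue.end_ms for b in base
        )
        if not overlaps and cue.text:
            merged.append(cue)
    return sorted(merged, key=lambda c: c.start_ms)


def _fmt(ms: int) -> str:
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def to_srt(cues: list[Cue]) -> str:
    """Write cues back out as SRT text, renumbered from 1."""
    return "\n".join(
        f"{i}\n{_fmt(c.start_ms)} --> {_fmt(c.end_ms)}\n{c.text}\n"
        for i, c in enumerate(cues, 1)
    )
```

You would still want to check the result by eye, since a line that overlaps a base cue by a fraction of a second is dropped even if its text is different.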
 

Dom047

New Member
May 5, 2016
10
4
This usually happens. When I'm really interested in knowing what happens in a movie scene, I run the movie through Whisper+VAD again with different VAD values (the default is 0.4); I've tried 0.3, 0.5 and 0.6. The text lines are usually the same, but it detects some lines that were not detected before. I then open the .srt file in Subtitle Edit and add those lines to the .srt file I consider the most complete (I only do this with movies that are really worth the effort, ones I like a lot).
The other option is to increase the volume of the file and balance the audio (there are programs that do this automatically).
I see u use movavi to balance audio and boost volume, does that make a big diff in dialogue output? are you gettin way more text?
 

SamKook

I see u use movavi to balance audio and boost volume, does that make a big diff in dialogue output? are you gettin way more text?
Be aware that Whisper is not deterministic. In my experience, it won't give you the same number of lines if you rerun the exact same unmodified audio, so comparing like this is mostly meaningless.
 

SamKook

Thanks a lot! Everything seems to be working, but DeepL changed the input CSS selector. I fixed that, but then for some reason it broke the merge of the chunks into one file. Every SRT that needs only one chunk works, but as soon as more than one chunk is required, it fails.

I had to change
Code:
"css selector"
with
Code:
By.CSS_SELECTOR
in 3 places for it to work for me.

What do you mean exactly by it fails?
I tried it with a 5-chunk file and it did produce a combined SRT, but it seems to skip the last line of each chunk (hard to say for sure), unless something else caused those gaps in the one long SRT I tested. So I'm wondering whether your issue is the same and I should look into it further, or whether it was something else.

Edit: I see that it does throw an error when combining, but it still creates most of the file (minus a few lines). I don't know how it behaved before.

Edit2: If you get a similar result to mine, the following portion of code (which merges the chunks together) throws some kind of error, but I don't know Python well enough to debug it, so it's hard to tell what exactly goes wrong:
Code:
    with open(mysrt, 'r', encoding='utf-8') as f:
        srt = f.read()
        match = re.findall(r'\d+:\d+:\d+,\d+ --> \d+:\d+:\d+,\d+', srt)

    linerList = []
    liner = ""
    with open(wordtxt, "r", encoding="utf-8", errors='ignore') as wordfile:
        lines = wordfile.readlines()
        for line in lines:
            if line != '\n' and line is not lines[-1]:
                liner += line
            elif line != '\n' and len(linerList) == len(match)-1:
                liner += line
                linerList.append(liner)
                break
            else:
                linerList.append(liner)
                liner = ""

    count = 0
    with open(finalsrt, 'w', encoding='utf-8') as resfile:
        for timeline in match:
            resfile.write(f"{count+1}\n")
            resfile.write(timeline+'\n')
            resfile.write(linerList[count])
            resfile.write("\n")
            count += 1
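For comparison, here is a rough sketch of a merge step that does not depend on detecting the last line of the file. It is not the original script's code and not necessarily the fix that was eventually found: `merge_translation` and `TIMESTAMP_RE` are made-up names, and it assumes the translated text file holds exactly one block of text per subtitle, with blocks separated by blank lines (which is what the loop above appears to expect).

```python
import re

TIMESTAMP_RE = re.compile(r"\d+:\d+:\d+,\d+ --> \d+:\d+:\d+,\d+")


def merge_translation(srt_text: str, translated_text: str) -> str:
    """Pair each timestamp line of the source SRT with one translated block."""
    timelines = TIMESTAMP_RE.findall(srt_text)
    # Split on blank lines; a missing newline at the end of a chunk
    # does not matter here because the text is stripped first.
    blocks = [
        b.strip()
        for b in re.split(r"\n\s*\n", translated_text.strip())
        if b.strip()
    ]
    if len(blocks) != len(timelines):
        # Fail loudly instead of silently shifting or fusing lines.
        raise ValueError(
            f"{len(timelines)} timestamps but {len(blocks)} text blocks"
        )
    return "".join(
        f"{i}\n{timeline}\n{block}\n\n"
        for i, (timeline, block) in enumerate(zip(timelines, blocks), 1)
    )
```

Raising an error on a count mismatch is a deliberate choice in this sketch: if two blocks were fused when the chunks were joined, you find out immediately instead of getting an SRT that drifts out of sync.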

Edit3: And to get rid of various warnings and errors, you can replace:
Code:
# Start a Selenium driver
driver_path=r'C:\Program Files (x86)\chromedriver.exe'
driver = webdriver.Chrome(driver_path)
with
Code:
s = Service(r"C:\Program Files (x86)\chromedriver.exe")
chrome_options = Options()
chrome_options.add_argument("--ignore-ssl-errors")
# The following 2 are probably unnecessary but don't hurt
chrome_options.add_argument("--ignore-certificate-errors")
chrome_options.accept_insecure_certs = True
# Gets rid of USB error
chrome_options.add_experimental_option('excludeSwitches', ['enable-logging'])
# Start a Selenium driver with chosen chrome options
driver = webdriver.Chrome(service=s, options=chrome_options)
And add the following at the beginning (I did it after line 3):
Code:
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
 

porgate55555

Thanks for looking into it. That was exactly the issue. It merges the file, but combines the last 2 rows of a chunk together, so the whole thing is not usable. Thanks anyways :)

Edit: I was able to fix it after 3 hours of reverse engineering. Man, that was a pain, but it was worth the effort. Your improvements also make it better. Thanks again!
 

mei2

Well-Known Member
Dec 6, 2018
250
407
If anyone is interested, I have a large number of Whisper-generated subtitles. They are NOT cleaned, because I am only interested in understanding the story, not in having perfect subs. Duplicates are removed as well as possible and timings are slightly fixed.
Wow, thanks for the great collection!
 

amnscfnt

Active Member
Apr 28, 2008
159
127
I am having problems with Whisper on Colab. It might be my computer, which has been weird lately. I set up Whisper, upload a file, run Whisper and... nothing. I get the "executing" text at the bottom, but the clock just keeps running. After a while I get this message:

Screen Shot 2023-02-04 at 11.20.06 AM.png

What does this mean? If I change to standard runtime, I get a time out error after a while. Any ideas? Thanks in advance!!
 

soloporhoy666

I see u use movavi to balance audio and boost volume, does that make a big diff in dialogue output? are you gettin way more text?
I'm not sure if this helps; I'm not an expert. My guess is that it makes the dialogue in the audio files easier for the VAD program to detect, so the subtitles come out a little better. I've verified that turning up the volume picked up phrases I had not seen in previous tests. I used MP3 before; then I switched to M4A, increased the volume, removed noise, etc. In some movies the result is much better, but this may only be my guess, so take what you can from the information on the forum and do your own testing.
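If you don't have an audio program for this, a rough equivalent can be scripted with ffmpeg. This is only a sketch under assumptions: it uses ffmpeg's `loudnorm` filter instead of whatever Movavi does, assumes ffmpeg is installed and on your PATH, and the function and file names are made up. Whether it actually helps Whisper pick up more lines is untested.

```python
import subprocess


def build_boost_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that normalizes loudness and writes AAC audio."""
    return [
        "ffmpeg",
        "-i", src,          # input video or audio file
        "-vn",              # drop the video stream
        "-af", "loudnorm",  # loudness normalization filter
        "-c:a", "aac",      # encode to AAC, suitable for an .m4a file
        dst,
    ]


def boost_audio(src: str, dst: str) -> None:
    """Run ffmpeg (hypothetical helper); raises if ffmpeg reports failure."""
    subprocess.run(build_boost_command(src, dst), check=True)
```

For example, `boost_audio("movie.mp4", "movie.m4a")` would produce an M4A file to upload in place of the original audio.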
 

SamKook

I am having problems with Whisper on Colab. It might be my computer, which has been weird lately. I set up Whisper, upload a file, run Whisper and... nothing. I get the "executing" text at the bottom, but the clock just keeps running. After a while I get this message:

What does this mean? If I change to standard runtime, I get a time out error after a while. Any ideas? Thanks in advance!!

That warning means Whisper isn't actually running: the GPU is sitting idle, and they don't want you to hold on to a GPU that others could be using.

To find the actual problem, scroll down to the very bottom and you'll see a log of what it is currently doing, or an error if it ran into an issue.
That error, or the last portion of the log, is what we need to help you.

Also make sure you copied and pasted the input filename into the right place before running Whisper; you didn't mention whether you did.
 

soloporhoy666

I am having problems with Whisper on Colab. It might be my computer, which has been weird lately. I set up Whisper, upload a file, run Whisper and... nothing. I get the "executing" text at the bottom, but the clock just keeps running. After a while I get this message:

View attachment 3153964

What does this mean? If I change to standard runtime, I get a time out error after a while. Any ideas? Thanks in advance!!
This error occurs because Colab detects inactivity, so I recommend that you keep doing something on the page. It can happen while the file is uploading, which can take between 2 and 5 minutes depending on its size; it's easy to go look at something else while the file uploads, and then the Colab page detects inactivity. When that happens, you have to go back to the runtime menu and activate it again.
Screenshot 02-04-2023 11.31.43.png
I usually go back and forth between pages, and sometimes that message appears, which I solve as described above. The best way to keep Colab from detecting a lack of activity is not to leave the page. What I also do very often (every time I return from another page) is refresh the file folder, so that the page detects activity.
Screenshot 01-29-2023 20.04.25.png