Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,778
5,351
Hi, may I know the reasoning for why m4a is better than mp3, so that I can also use m4a in the future?
It's a more modern audio compression format than mp3 (m4a is usually AAC in an MP4 container) and it compresses better than mp3 does. But opus beats both, since it's even more modern and compresses even better, so if you're going to convert the audio, which I wouldn't do unless you're modifying it in some way, use opus instead.
Whisper will read the original audio just fine, so just demux (extract) it instead; there's zero quality loss that way.
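For anyone who wants to script the demux step, here is a minimal sketch, assuming ffmpeg is installed and on the PATH. The function names and file names are made up for illustration, and the output extension has to match the source codec (for example .m4a for AAC), which you would need to check yourself.

```python
import subprocess


def build_demux_command(video_path, audio_out_path):
    """Build an ffmpeg command that copies the audio stream without re-encoding."""
    return [
        "ffmpeg",
        "-i", video_path,   # input video file
        "-vn",              # drop the video stream
        "-c:a", "copy",     # copy the audio bitstream as-is (no quality loss)
        audio_out_path,     # container/extension must suit the source codec
    ]


def demux_audio(video_path, audio_out_path):
    """Run the demux; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_demux_command(video_path, audio_out_path), check=True)


# Hypothetical usage (file names are placeholders):
# demux_audio("movie.mp4", "movie.m4a")
```

Because the stream is copied rather than re-encoded, this typically finishes in seconds and the output is bit-for-bit the same audio that was in the video.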
 

Chuckie100

Well-Known Member
Sep 13, 2019
748
2,911
There is an existing sub for TOEN-049. Check whether it does the job:
Thanks, but I think I have corrupted copies of the movie, as this subtitle file acts the same way: about 57 minutes into the movie, the subtitles and the video go out of sync. I have had difficulty locating another copy of the movie to use but will keep trying.

Thanks again.
 

Aikayikes!

Well-Known Member
Apr 26, 2020
93
277
Can I ask which VAD thresholds you tried? It defaults to .4. Was wondering how high to raise or lower it.
I used .5, .6, and .8. The .8 run had the most lines, but its timing was off by up to 30 seconds. It wasn't a good test movie, as there was a lot of mumbling. I'm going to try something else a little clearer.
 

maelstrom9999

Well-Known Member
Apr 26, 2022
480
417
I used .5, .6, and .8. The .8 run had the most lines, but its timing was off by up to 30 seconds. It wasn't a good test movie, as there was a lot of mumbling. I'm going to try something else a little clearer.

That's odd. I just tried processing a file at .6 then .2, and the .2 gave more lines. One would think if it is a detection threshold for audio, the lower number would catch more.
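The intuition here can be shown with a toy sketch. It assumes the VAD produces a speech probability per audio frame and keeps frames at or above the threshold; whether the notebook's VAD works exactly like this is an assumption, and the probabilities below are made up.

```python
def count_speech_frames(speech_probs, threshold):
    """Count frames whose speech probability is at or above the threshold."""
    return sum(1 for p in speech_probs if p >= threshold)


# Made-up per-frame speech probabilities, purely for illustration.
probs = [0.1, 0.3, 0.45, 0.55, 0.7, 0.9]

for threshold in (0.2, 0.6, 0.8):
    kept = count_speech_frames(probs, threshold)
    print(f"threshold {threshold}: {kept} frames kept as speech")
```

Under that assumption a lower threshold can only keep the same frames or more, although more detected speech does not have to translate one-to-one into more subtitle lines.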
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,778
5,351
That's odd. I just tried processing a file at .6 then .2, and the .2 gave more lines. One would think if it is a detection threshold for audio, the lower number would catch more.

The problem with comparing settings is that whisper isn't reliable. Even using the exact same settings, you'll get a different number of lines every time (I never tested the same audio more than 4 times, but that's a lot already).

No idea how it can sometimes fail and sometimes not with the same everything but it does.
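To compare runs more objectively than by eyeballing them, one option is to count the cues in each generated file. This is a minimal sketch that assumes standard SRT timing lines (`HH:MM:SS,mmm --> HH:MM:SS,mmm`); the function name and the sample text are hypothetical.

```python
import re

# Matches the timing line that starts every SRT cue.
CUE_TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}",
    re.MULTILINE,
)


def count_cues(srt_text):
    """Return the number of subtitle cues found in SRT-formatted text."""
    return len(CUE_TIMING.findall(srt_text))


# Made-up sample standing in for one run's output.
run_a = """1
00:00:01,000 --> 00:00:03,000
Hello.

2
00:00:04,500 --> 00:00:06,000
How are you?
"""

print(count_cues(run_a), "cues in run A")
```

Running the same audio several times with identical settings and comparing the counts gives a rough idea of how much of a difference is run-to-run noise rather than the setting being tested.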
 

maelstrom9999

Well-Known Member
Apr 26, 2022
480
417
The problem with comparing settings is that whisper isn't reliable. Even using the exact same settings, you'll get a different number of lines every time (I never tested the same audio more than 4 times, but that's a lot already).

No idea how it can sometimes fail and sometimes not with the same everything but it does.
Wow, that is extremely strange. If the code remains the same, it should return the same result with the same input every time.

Beginning to wonder if I should just try re-running those that didn't turn out well and see if they randomly get better.
 

Prinsipe

Member
Aug 31, 2013
58
19
It's a more modern audio compression format than mp3 (m4a is usually AAC in an MP4 container) and it compresses better than mp3 does. But opus beats both, since it's even more modern and compresses even better, so if you're going to convert the audio, which I wouldn't do unless you're modifying it in some way, use opus instead.
Whisper will read the original audio just fine, so just demux (extract) it instead; there's zero quality loss that way.
May I know the average size of audio extracted from a 1-hour video? When I convert a 1-hour video to mp3, it averages about 120MB. I wonder how the size of extracted audio compares to mp3. Thank you for your explanation, btw. :)
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,778
5,351
May I know the average size of audio extracted from a 1-hour video? When I convert a 1-hour video to mp3, it averages about 120MB. I wonder how the size of extracted audio compares to mp3. Thank you for your explanation, btw. :)

Depends entirely on the bitrate used to encode the original audio. What matters most is the quality of the audio. At equal size, between mp3, m4a (aac) and opus, opus will sound closest to the original, followed by aac, followed by mp3.
And if you encode the original audio to any of those 3 formats, you're going to lose some quality, which could affect whisper's ability to recognize voices (in practice it doesn't seem to have much effect, although it's hard to tell given the randomness of it).
That's why I recommend keeping the audio as-is and just separating it from the video: you get the best possible quality this way and, as a bonus, the size is usually smaller than if you use quality settings for mp3s.

To answer your question more directly: these days, most HD releases have either 192Kbps aac audio, which works out to about 86MB per hour, or 128Kbps, which works out to about 58MB. Those are also the most common bitrates you'll encounter for any audio from ripped movies.
Sounds like you're using 256Kbps for your mp3s, which would be around 115MB with the same math. That's bigger than most audio I've seen; it rarely goes above 192Kbps (for ripped content, that is; DVD/Blu-ray audio is much bigger).
I use 128Kbps opus for my own encodes for the quality version (which is of similar quality to your mp3 but only half the size) and 64Kbps for the small version.
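The size math is simply bitrate times duration divided by 8. Here is a short sketch of it, assuming a constant bitrate, decimal megabytes (1 MB = 1,000,000 bytes), and ignoring container overhead; the function name is made up for illustration.

```python
def audio_size_mb(bitrate_kbps, duration_s=3600):
    """Approximate size in decimal MB of a constant-bitrate audio stream."""
    total_bits = bitrate_kbps * 1000 * duration_s  # kilobits/s -> total bits
    return total_bits / 8 / 1_000_000              # bits -> bytes -> MB


for kbps in (64, 128, 192, 256):
    print(f"{kbps} Kbps -> {audio_size_mb(kbps):.1f} MB per hour")
```

Real files will differ a little because of container overhead and variable-bitrate encoding, so treat the numbers as estimates.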
 

ericf

Well-Known Member
Jan 13, 2007
245
529
About Whisper. Does Whisper Colab really accept formats other than mp3? That's the format written on the audio_path: line, and I can't find any info in the help on which audio formats can be used.
 

porgate55555

Active Member
Jul 24, 2021
55
165
About Whisper. Does Whisper Colab really accept formats other than mp3? That's the format written on the audio_path: line, and I can't find any info in the help on which audio formats can be used.
I used mp3 in the beginning, changed to mp4 and now wav. Whatever input you give, it converts it to wav mono.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,778
5,351
About Whisper. Does Whisper Colab really accept formats other than mp3? That's the format written on the audio_path: line, and I can't find any info in the help on which audio formats can be used.
It also uses ffmpeg to load the audio, so it'll support just about anything; what's in the audio path is just an example.

Okay. Autosub didn't give good results on mono audio. Neither does Vrew.
Whisper converts it to mono internally either way, so mono audio is all whisper ever processes. As for the other software, maybe they do extra processing before whisper that doesn't like mono, or you're converting to mono in a way that badly degrades the quality, which would explain the issues. I couldn't say how you'd achieve that, though, other than maybe converting to 16k stereo first and then to 16k mono.
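If you want to hear roughly what whisper is working with, you can do a single-step conversion to 16 kHz mono WAV yourself. This is a sketch that assumes ffmpeg is on the PATH; the function name and file names are hypothetical, and it only approximates the internal conversion, it is not whisper's own code.

```python
import subprocess


def build_mono_16k_command(src_path, wav_out_path):
    """Build an ffmpeg command that downmixes to mono and resamples to 16 kHz."""
    return [
        "ffmpeg",
        "-i", src_path,        # any input ffmpeg can read
        "-vn",                 # ignore video, if present
        "-ac", "1",            # downmix to a single (mono) channel
        "-ar", "16000",        # resample to 16 kHz
        "-c:a", "pcm_s16le",   # uncompressed 16-bit PCM
        wav_out_path,
    ]


def convert_to_mono_16k(src_path, wav_out_path):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_mono_16k_command(src_path, wav_out_path), check=True)


# Hypothetical usage (file names are placeholders):
# convert_to_mono_16k("movie.m4a", "movie_16k_mono.wav")
```

Doing the downmix and the resample in one pass from the original audio avoids the double conversion that could degrade quality.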
 

Aikayikes!

Well-Known Member
Apr 26, 2020
93
277
Wow, that is extremely strange. If the code remains the same, it should return the same result with the same input every time.

Beginning to wonder if I should just try re-running those that didn't turn out well and see if they randomly get better.
It seems pretty random. It would miss a completely clear line at .4 but catch it at .8. But overall it is a pretty sweet program. Sometimes only minor language cleanup is needed. I will never complain about the quality for the cost. :)
 

amnscfnt

Active Member
Apr 28, 2008
159
127
First, I really want to thank SamKook, soloporhoy666 and others for posting info about how to use Whisper Colab. I finally got around to trying it, got a few videos done and hit the limit. I logged into a new Gmail account but still got an error, though a different one from the limit-reached error message.

Two questions. One, do you know how long the limit-reached error stays in effect? Does it reset after 24 hours or so? Two, the second error was a timeout error, but it came up almost right away. When switching to a new Gmail account, perhaps I should have restarted Chrome? Any other suggestions for managing this limit for free users?

Thank you again to all posters for helping with Whisper. I will add the srt files I generated here in case anyone else wants them. I have not cleaned them up, though. If I do, I will repost them if anyone wants them; just let me know. The titles in this archive are:

DMOW-163
MIAA-633
QRDA-151
XRLE-002

 

Attachments

  • Archive.zip
    63.9 KB · Views: 358

KingofBugs

Active Member
Aug 31, 2022
64
101
So I have been messing around with Whisper for a while now and, as a result, have produced a decent number of unedited sub files that I will probably never clean up, alongside some I do plan to clean up and post here. I was not initially planning on posting the unedited ones, but if there is any interest in them, I would be down to post them. As for the ones I plan to clean up, I still have my original project of cleaning up Real-674, which I found on SubtitleCat and for which I have been using Whisper as another reference. I have also decided to work on cleaning the Whisper file for MISM-165, since it's probably my favorite deepthroat JAV and there are barely any subs for deepthroat videos out there. Hopefully it should not be that bad since, despite its length, it's pretty light on unique dialogue.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,778
5,351

There's very little skill involved in doing it yourself, so the least you could do is try. All the common problems you could run into have a solution in this thread, as well as multiple tutorials on how to do it.

If you run into an issue, just post your detailed problem here and people will help you out.