Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
Is there a significant difference between extracting the mp3 and converting to it when feeding the audio file to whisper?
The difference is not big, likely unnoticeable in most cases, but converting instead of extracting could hurt recognition and almost certainly won't help.
hahaha no way. I'm so glad it was easy for you.
But my computer is kinda old :p I do wonder what graphics card would make it run better if I install it?
A very expensive nvidia gamer gpu (RTX 3080 in my case) is what most people would have that works well enough for it, but if it's just to do ai stuff, you'd buy a datacenter one like the 3 they mention on the colab page.
 
  • Like
Reactions: Taako

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
Is there a way to take a video file, then make an audio only file from it? I mean, if it won't accept video files, that would be the only solution, right?

Yes MkvToolNix GUI is your friend:)
From the screenshot, his video is an mp4 so I don't think mkvtoolnix works for that, at least it didn't before, only for mkv.
Multiple people mentioned some tools so just pick one. I use mp4box with a separate gui, but the new version complicates things so it's not the easiest to install.

Edit: Googling "demux mp4" will get you plenty of options, but here's a tutorial to do it with vlc which you likely already have: https://bartman88.blogspot.com/2018/04/how-to-extract-or-demux-audio-or-video.html

Edit2: Oh, you can do it with mkvtoolnix, nice. Can't put anything inside an mp4 but you can take it out. Well, not quite take it out but put it inside an mka which is a mkv with only audio so should do the trick.
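
For anyone who prefers the command line over the GUI, MKVToolNix also ships with mkvmerge, which can do the same remux into an audio-only mka. A minimal sketch, assuming mkvmerge is on the PATH; the function name and the file name are placeholders, not anything from the thread:

```python
import subprocess
from pathlib import Path


def build_mka_command(video_path: str) -> list[str]:
    """Build an mkvmerge command that remuxes only the audio into a .mka file."""
    source = Path(video_path)
    target = source.with_suffix(".mka")
    # --no-video drops the video track; the audio is copied as-is, not re-encoded.
    return ["mkvmerge", "-o", str(target), "--no-video", str(source)]


if __name__ == "__main__":
    cmd = build_mka_command("movie.mp4")  # placeholder file name
    print(" ".join(cmd))
    # Uncomment to actually run it (needs mkvmerge installed):
    # subprocess.run(cmd, check=True)
```

Since this is a remux, it should finish in seconds and the audio inside the mka stays identical to what was in the mp4.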
 
  • Like
Reactions: Taako

maelstrom9999

Well-Known Member
Apr 26, 2022
480
417
Well I decided to keep trying to make the video files work for awhile before bothering with that and holy shit, it finally worked. I can only get it to work by using the embedded "choose files" in the upload audio section, finding the file in its path, then waiting for the whole file to upload, which took roughly 2 hours in this case. At which point, with this method, you don't need to copy the filepath. It is automatically populated into the file path field under play whisper the moment the upload is complete. Then (but not before) hit the "play whisper" arrow.

Seems to have taken roughly an hour to throw an output file into my download folder. Haven't checked syncing yet, but as far as dialogue goes, I'd rate it as an above average autosub compared to, say, raw autosubs made from Chinese softsubs. It's definitely in cleanable shape.

The title is midd-847, but I won't upload until I've cleaned it at least somewhat.

Now I'm going back to check the files that failed and try doing them this way instead. If this can work on all or even most of my video files, it's going to be great. I still have 36 titles in my faves folders with no sub files. Gonna get all of them. Will post all here of course.

Thanks for all the help provided above.
 

Taako

Akiba Citizen
May 25, 2017
1,332
938
From the screenshot, his video is an mp4 so I don't think mkvtoolnix works for that, at least it didn't before, only for mkv.
Multiple people mentioned some tools so just pick one. I use mp4box with a separate gui, but the new version complicates things so it's not the easiest to install.

Edit: Googling "demux mp4" will get you plenty of options, but here's a tutorial to do it with vlc which you likely already have: https://bartman88.blogspot.com/2018/04/how-to-extract-or-demux-audio-or-video.html

Edit2: Oh, you can do it with mkvtoolnix, nice. Can't put anything inside an mp4 but you can take it out. Well, not quite take it out but put it inside an mka which is a mkv with only audio so should do the trick.
Yes. Mkvtoolnix works for all my needs. I literally don't use anything else lol :)
 

Imscully

Well-Known Member
Apr 1, 2014
359
636
Hey friends,
I need to make a quick rant.
I was just on SubtitleCat and saw that someone absolutely stole my first 100% original work, BKD-97. That was a true labor of love, as that movie has rocked my particularly pervy boat for years. It was my first baby. I'm especially proud of it as when I watched it with a fellow kinky lady friend of mine who doesn't understand a word of Japanese, she actually thought the subtitles I had created were actually "real". When I told her that I had created every word of dialog, she was floored. (Before anyone asks, she's just a very good friend who also happens to share some of my personal kinks. We don't date. Although she gave me a few blowjobs way back when, we don't play sexually anymore. She's just a friend and a damned good one at that. Although we share some similar kink interests, our friendship has gone light years beyond that to the point where we rarely even discuss kink anymore. But I digress.)
So, I'm a little bit pissed right now.
I don't get upset when sites (like New-Jav) remove my name from the subtitles (although it's not fun or polite), but this may be the first time I've seen someone take my work, eliminate my name, and put their own name as the author. What a scumbag thing to do. I've talked about this type of thing before, so I won't go further into it. I'll let it go and get back to work. I just wanted to vent to my fellow JAV subtitle fans.
Here is the link to the file that some asshole named DevilGrrl edited to make it look like he/she was the actual author. I have no recourse, so I won't waste any more thought or energy on it. Just needed to vent.
File Stolen by DevilGrrl.jpg
 

quay2

Active Member
Nov 28, 2009
96
177
The online version always times out.
I attached three Whisper subs (SDDH-001, SHYN-074 and MVSD-486). They're quite flawed; SDDH-001 is missing a ton of text.
 

Attachments

  • WhisperSubs.zip
    32 KB · Views: 235
  • Like
Reactions: Imscully and Taako

quay2

Active Member
Nov 28, 2009
96
177
You gotta be careful since extracting(aka demuxing or demultiplexing) audio isn't the same as converting(aka re-encoding) audio to a specific format. Both will give you an audio file on its own as a result(so you can technically say re-encoding it is extracting it too) so the difference might not be obvious to those unfamiliar with how these things work.

The former doesn't change the audio in any way and is basically instantaneous or takes a couple of secs: it just takes the audio part out of the file container that holds both video and audio. The latter takes that audio part and completely re-encodes it, to a different format or even the same one, to create an entirely new file, which can take a few seconds or minutes depending on your pc.

I think VLC can do both of those things though, but the first one is better since re-encoding means quality loss(unless it's to a lossless format like wav, but those files are huge).
Proper audio for Whisper (no filtering):
"C:\Program Files\ffmpeg\bin\ffmpeg.exe" -i "X:\SomeJAVMovie.mp4" -ar 16000 "X:\SomeJAVMovie.wav"
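
To make the extract vs convert distinction from the quote concrete, here is a small sketch that only builds the two kinds of ffmpeg command so they can be compared side by side. The function names and file names are made up for illustration; the ffmpeg switches themselves (-vn, -c:a copy, -ar) are standard ones.

```python
def extract_command(src: str, dst: str) -> list[str]:
    """Demux: copy the audio stream untouched (dst extension must match the codec)."""
    return ["ffmpeg", "-i", src, "-vn", "-c:a", "copy", dst]


def convert_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Re-encode: decode the audio and write it out again at a new sample rate."""
    return ["ffmpeg", "-i", src, "-vn", "-ar", str(sample_rate), dst]


if __name__ == "__main__":
    # Placeholder file names, only used to print the two commands.
    print(" ".join(extract_command("movie.mp4", "movie.aac")))
    print(" ".join(convert_command("movie.mp4", "movie.wav")))
```

The first command never touches the audio data, while the second one decodes and rewrites it, which is where the extra time and the bigger wav file come from.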
 
  • Like
Reactions: Taako

Taako

Akiba Citizen
May 25, 2017
1,332
938
Hey friends,
I need to make a quick rant.
I was just on SubtitleCat and saw that someone absolutely stole my first 100% original work, BKD-97. That was a true labor of love, as that movie has rocked my particularly pervy boat for years. It was my first baby. I'm especially proud of it as when I watched it with a fellow kinky lady friend of mine who doesn't understand a word of Japanese, she actually thought the subtitles I had created were actually "real". When I told her that I had created every word of dialog, she was floored. (Before anyone asks, she's just a very good friend who also happens to share some of my personal kinks. We don't date. Although she gave me a few blowjobs way back when, we don't play sexually anymore. She's just a friend and a damned good one at that. Although we share some similar kink interests, our friendship has gone light years beyond that to the point where we rarely even discuss kink anymore. But I digress.)
She is indeed a good friend. I have female friends like that. 2 of them are sex buds, but all of them are good. It's funny, because this afternoon I'm treating one to lunch/dinner. We always do this. One of us buys or treats :)
Anywayyyyyssss....
So, I'm a little bit pissed right now.
I don't get upset when sites (like New-Jav) remove my name from the subtitles (although it's not fun or polite), but this may be the first time I've seen someone take my work, eliminate my name, and put their own name as the author. What a scumbag thing to do. I've talked about this type of thing before, so I won't go further into it. I'll let it go and get back to work. I just wanted to vent to my fellow JAV subtitle fans.
Here is the link to the file that some asshole named DevilGrrl edited to make it look like he/she was the actual author. I have no recourse, so I won't waste any more thought or energy on it. Just needed to vent.
You have the right to be pissed. People can be ass**** no doubt.
And before someone says, "oh well, that's to be expected..." or whatever, it still doesn't make it right.
As you know, that's why I don't share my full subs anymore unless it's with people I trust.

Again, you have the right to be pissed. Subbing is hard enough, but to actually take someone's ideas and put your name on them... is an actual crime called plagiarism.
If you were writing a book/novel/thesis and someone lifted an excerpt, that person might be legally liable and could be sued.

I certainly use subs and might clean them...BUT I have NEVER taken credit and always give credit to the original author if I can.
 
  • Like
Reactions: mei2 and Imscully

Taako

Akiba Citizen
May 25, 2017
1,332
938
A very expensive nvidia gamer gpu (RTX 3080 in my case) is what most people would have that works well enough for it, but if it's just to do ai stuff, you'd buy a datacenter one like the 3 they mention on the colab page.
This is what I thought. Thank you for confirming it for me :)

If I do decide to use Whisper in the future, I'd rather have it installed.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
Proper audio for Whisper (no filtering):
"C:\Program Files\ffmpeg\bin\ffmpeg.exe" -i "X:\SomeJAVMovie.mp4" -ar 16000 "X:\SomeJAVMovie.wav"
There's no benefit to using wav instead of just copying the audio, and wav files will be 10-11 times bigger on average, so if you have to upload it, it'll take that much longer.

I'd just do something like this instead after checking what the audio type is with mediainfo to set the extension to the right thing:
Code:
ffmpeg -i BBAN-264.mp4 -vn -acodec copy BBAN-264.aac
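
If you'd rather script the "check the audio type, then pick the extension" step, something like the sketch below could work. It swaps mediainfo for ffprobe (which ships with ffmpeg), and the codec-to-extension table and function names are assumptions for illustration, not a complete list.

```python
import subprocess

# A few common audio codecs and a matching file extension (not exhaustive).
CODEC_TO_EXT = {
    "aac": ".aac",
    "mp3": ".mp3",
    "ac3": ".ac3",
    "flac": ".flac",
    "opus": ".opus",
    "vorbis": ".ogg",
}


def probe_audio_codec(path: str) -> str:
    """Ask ffprobe for the codec name of the first audio stream."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=codec_name",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def extension_for(codec: str) -> str:
    """Pick an extension for the codec, falling back to .mka for unknown ones."""
    return CODEC_TO_EXT.get(codec, ".mka")
```

The .mka fallback is a design choice: a Matroska audio container accepts most codecs, so a stream copy into it should work even when the table has no entry.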
 
  • Like
Reactions: Taako and mei2

mei2

Well-Known Member
Dec 6, 2018
246
405
Code:
ffmpeg -i BBAN-264.mp4 -vn -acodec copy BBAN-264.aac

@SamKook , I'd suggest adding the sampling rate switch too: ffmpeg -i BBAN-264.mp4 -vn -ar 16000 -acodec copy BBAN-264.aac . In the Whisper discussion forum there are a few posts about how sensitive the model is to the sampling rate.
 
  • Like
Reactions: Taako

mei2

Well-Known Member
Dec 6, 2018
246
405
New Year present for fans of subtitles: here is my entire archive of subs from iKoa:

EDIT: new link [as the link in the original post was deleted]:



Kudos goes to admin and bot developer of iKoa. I've attached iKoa's official list which was updated on Nov. 9. However, the archive includes later subs from Nov and Dec as well. There are 8,500+ subs in total.

Happy New Year everyone!

-
 

Attachments

  • 000_List_of_iKoa_subs_Nov_9_2022.zip
    456 KB · Views: 643

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
@SamKook , I'd suggest adding the sampling rate switch too: ffmpeg -i BBAN-264.mp4 -vn -ar 16000 -acodec copy BBAN-264.aac . In the Whisper discussion forum there are a few posts about how sensitive the model is to the sampling rate.
That's not going to do anything. With -acodec copy, the -ar switch just gets ignored and the original sampling rate is kept, which for the vast majority of audio is 48k or 44.1k (I don't think I've seen anything other than those 2 in a movie or music/audio file unless it's a weird file).
It's almost certainly not going to be 16k from any audio source so it wouldn't make sense for that to be what's needed, unless whisper is converting it to that internally.

I haven't done any research on this, don't really use whisper and haven't looked at its code, so I have no clue about this, but it would be a very odd choice to require 16k on the source for best results.


Edit: So, I checked the code and as I suspected, it's changing the sample rate internally to 16k during the decoding phase so the source sample rate doesn't matter.

Here's the code portion that does it. The "sr" value for ar is a variable set to 16000.
Code:
# This launches a subprocess to decode audio while down-mixing and resampling as necessary.
# Requires the ffmpeg CLI and `ffmpeg-python` package to be installed.
out, _ = (
    ffmpeg.input(file, threads=0)
    .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sr)
    .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
)
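
For readers who don't use ffmpeg-python, the call above should be roughly equivalent to a plain ffmpeg command line. The sketch below only rebuilds that command from the arguments visible in the snippet; it is not Whisper's own code, and the function name is made up.

```python
def whisper_style_decode_command(path: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command mirroring the arguments in the quoted snippet."""
    return [
        "ffmpeg", "-nostdin",
        "-threads", "0",
        "-i", path,
        "-f", "s16le",             # raw 16-bit little-endian samples, no header
        "-acodec", "pcm_s16le",
        "-ac", "1",                # down-mix to mono
        "-ar", str(sample_rate),   # resample, 16000 by default here
        "-",                       # write the decoded audio to stdout
    ]


if __name__ == "__main__":
    print(" ".join(whisper_style_decode_command("movie.aac")))  # placeholder name
```

Whatever the source sample rate is, the -ar and -ac arguments mean the decoded audio ends up as 16k mono, which is the point made in the edit above.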

Edit2: Here's the size of the output files for the test movie I used to give you an idea of why resampling it manually is just wasting space most of the time:
Original aac(192kbps): 249.35MB
Uncompressed wav: 1.92GB
Mei2/quay2 16k(stereo): 655.96MB
Whisper 16k(mono): 327.98MB

So unless you start with a lightly compressed audio file that's above 256kbps (that's what the whisper processing ends up being), you just end up with a bigger file. And since you want to avoid an extra lossy compression step that would degrade the audio quality, you'd have to use something like wav for the output if you process the original.
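
The 256kbps figure and the size ratios in the list can be checked with basic PCM arithmetic. A small sketch, assuming 16-bit samples throughout and a 48k stereo source for the uncompressed wav (the source rate is an assumption, it isn't stated in the post); the function name is just for illustration:

```python
def pcm_kbps(sample_rate: int, channels: int, bits: int = 16) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate * channels * bits / 1000


if __name__ == "__main__":
    mono_16k = pcm_kbps(16000, 1)    # what whisper works with internally
    stereo_16k = pcm_kbps(16000, 2)  # 16k resample that keeps both channels
    stereo_48k = pcm_kbps(48000, 2)  # assumed source format for the plain wav
    print(mono_16k)                  # 256.0
    print(stereo_16k / mono_16k)     # 2.0
    print(stereo_48k / mono_16k)     # 6.0
```

Those ratios line up with the list: 327.98MB x 2 = 655.96MB for the 16k stereo file, and 327.98MB x 6 is about 1.92GB for the uncompressed wav.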
 
  • Like
Reactions: Taako and mei2

arm4n

Member
Aug 20, 2009
91
88
ROE-116 Mother, Son And Nephew. Abnormal Triangle Relationship Incest Rieko Hiraoka Competing For Married Woman Rieko With Jealous Meat Stick

roe116pl.jpg


Translated from chinese version and cleaned most of all the lines. Enjoy and happy new year everyone!
 

Attachments

  • ROE-116.zip
    19 KB · Views: 403

Taako

Akiba Citizen
May 25, 2017
1,332
938
That's not going to do anything. With -acodec copy, the -ar switch just gets ignored and the original sampling rate is kept, which for the vast majority of audio is 48k or 44.1k (I don't think I've seen anything other than those 2 in a movie or music/audio file unless it's a weird file).
It's almost certainly not going to be 16k from any audio source so it wouldn't make sense for that to be what's needed, unless whisper is converting it to that internally.

I haven't done any research on this, don't really use whisper and haven't looked at its code, so I have no clue about this, but it would be a very odd choice to require 16k on the source for best results.


Edit: So, I checked the code and as I suspected, it's changing the sample rate internally to 16k during the decoding phase so the source sample rate doesn't matter.

Here's the code portion that does it. The "sr" value for ar is a variable set to 16000.
Code:
# This launches a subprocess to decode audio while down-mixing and resampling as necessary.
# Requires the ffmpeg CLI and `ffmpeg-python` package to be installed.
out, _ = (
    ffmpeg.input(file, threads=0)
    .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sr)
    .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
)

Edit2: Here's the size of the output files for the test movie I used to give you an idea of why resampling it manually is just wasting space most of the time:
Original aac(192kbps): 249.35MB
Uncompressed wav: 1.92GB
Mei2/quay2 16k(stereo): 655.96MB
Whisper 16k(mono): 327.98MB

So unless you start with a lightly compressed audio file that's above 256kbps (that's what the whisper processing ends up being), you just end up with a bigger file. And since you want to avoid an extra lossy compression step that would degrade the audio quality, you'd have to use something like wav for the output if you process the original.
And this is why the site appreciates you @SamKook.
Always going the extra mile to help us:elegan:
 

maelstrom9999

Well-Known Member
Apr 26, 2022
480
417
The web version of Whisper is severely limited in GPU utilization time. It has been about 36 hours since my initial success, and it still won't let me run it again.

Wondering if extracting a separate audio file from the video file would reduce GPU utilization. Going to try it.
 
  • Like
Reactions: Taako

quay2

Active Member
Nov 28, 2009
96
177
There's no benefit to using wav instead of just copying the audio, and wav files will be 10-11 times bigger on average, so if you have to upload it, it'll take that much longer.

I'd just do something like this instead after checking what the audio type is with mediainfo to set the extension to the right thing:
Code:
ffmpeg -i BBAN-264.mp4 -vn -acodec copy BBAN-264.aac
This is for doing the Whisper part yourself, since online doesn't work anywhere for long audio, and barely works for short audio. And Whisper only accepts 16kHz, so just copying the audio won't work.
 

quay2

Active Member
Nov 28, 2009
96
177
New Year present for fans of subtitles: here is my entire archive of subs from iKoa:


Kudos goes to admin and bot developer of iKoa. I've attached iKoa's official list which was updated on Nov. 9. However, the archive includes later subs from Nov and Dec as well. There are 8,500+ subs in total.

Happy New Year everyone!

-
Thanks. The archive seems to be missing some subs. The first thing I checked was the MIGD subs, and only one of the three was in the archive.
 
  • Wow
  • Like
Reactions: Imscully and Taako