Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

From the screenshot, his video is an mp4, so I don't think mkvtoolnix works for that; at least it didn't before, only for mkv.
Multiple people mentioned some tools, so just pick one. I use mp4box with a separate GUI, but the new version complicates things, so it's not the easiest to install.

Edit: Googling "demux mp4" will get you plenty of options, but here's a tutorial to do it with vlc which you likely already have: https://bartman88.blogspot.com/2018/04/how-to-extract-or-demux-audio-or-video.html

Edit2: Oh, you can do it with mkvtoolnix, nice. It can't put anything inside an mp4, but it can take things out. Well, not quite take them out, but put the audio inside an mka, which is an mkv with only audio, so that should do the trick.
Yes. Mkvtoolnix works for all my needs. I literally don't use anything else lol :)
 
Hey friends,
I need to make a quick rant.
I was just on SubtitleCat and saw that someone absolutely stole my first 100% original work, BKD-97. That was a true labor of love, as that movie has rocked my particularly pervy boat for years. It was my first baby. I'm especially proud of it as when I watched it with a fellow kinky lady friend of mine who doesn't understand a word of Japanese, she actually thought the subtitles I had created were actually "real". When I told her that I had created every word of dialog, she was floored. (Before anyone asks, she's just a very good friend who also happens to share some of my personal kinks. We don't date. Although she gave me a few blowjobs way back when, we don't play sexually anymore. She's just a friend and a damned good one at that. Although we share some similar kink interests, our friendship has gone light years beyond that to the point where we rarely even discuss kink anymore. But I digress.)
So, I'm a little bit pissed right now.
I don't get upset when sites (like New-Jav) remove my name from the subtitles (although it's not fun or polite), but this may be the first time I've seen someone take my work, eliminate my name, and put their own name as the author. What a scumbag thing to do. I've talked about this type of thing before, so I won't go further into it. I'll let it go and get back to work. I just wanted to vent to my fellow JAV subtitle fans.
Here is the link to the file that some asshole named DevilGrrl edited to make it look like he/she was the actual author. I have no recourse, so I won't waste any more thought or energy on it. Just needed to vent.
File Stolen by DevilGrrl.jpg
 
You have to be careful, since extracting (aka demuxing or demultiplexing) audio isn't the same as converting (aka re-encoding) audio to a specific format. Both give you a standalone audio file as a result (so you can technically say re-encoding is extracting too), so the difference might not be obvious to those unfamiliar with how these things work.

The former doesn't change the audio in any way and is basically instantaneous, or takes a couple of seconds: it just takes the audio part out of the container file that holds both video and audio. The latter takes that audio part and completely re-encodes it, to a different format or even the same one, to create an entirely new file, and it can take anywhere from a few seconds to a few minutes depending on your PC.

I think VLC can do both of those things, but the first one is better, since re-encoding means quality loss (unless it's to a lossless format like wav, but those files are huge).
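To make the difference concrete, here's a minimal sketch of the two kinds of ffmpeg call, built as argument lists in Python. The helper names `demux_cmd` and `reencode_cmd` and the file names are made up for illustration; `-vn`, `-acodec copy` and `-acodec pcm_s16le` are standard ffmpeg options.

```python
def demux_cmd(src, dst):
    """Extract (demux): copy the audio stream untouched into its own file."""
    # -vn drops the video; -acodec copy keeps the audio data exactly as it is.
    return ["ffmpeg", "-i", src, "-vn", "-acodec", "copy", dst]


def reencode_cmd(src, dst):
    """Convert (re-encode): decode the audio and write it out as 16-bit PCM."""
    # pcm_s16le is uncompressed, so nothing more is lost, but the file gets big.
    return ["ffmpeg", "-i", src, "-vn", "-acodec", "pcm_s16le", dst]


# Hypothetical file names; the .aac extension assumes the source audio is AAC.
print(" ".join(demux_cmd("movie.mp4", "movie.aac")))
print(" ".join(reencode_cmd("movie.mp4", "movie.wav")))
```

To actually run one of them you'd pass the list to `subprocess.run(..., check=True)`. The only thing that differs between the two is what follows `-acodec`.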
Proper audio for Whisper (no filtering):
"C:\Program Files\ffmpeg\bin\ffmpeg.exe" -i "X:\SomeJAVMovie.mp4" -ar 16000 "X:\SomeJAVMovie.wav"
 
Hey friends,
I need to make a quick rant.
I was just on SubtitleCat and saw that someone absolutely stole my first 100% original work, BKD-97. That was a true labor of love, as that movie has rocked my particularly pervy boat for years. It was my first baby. I'm especially proud of it as when I watched it with a fellow kinky lady friend of mine who doesn't understand a word of Japanese, she actually thought the subtitles I had created were actually "real". When I told her that I had created every word of dialog, she was floored. (Before anyone asks, she's just a very good friend who also happens to share some of my personal kinks. We don't date. Although she gave me a few blowjobs way back when, we don't play sexually anymore. She's just a friend and a damned good one at that. Although we share some similar kink interests, our friendship has gone light years beyond that to the point where we rarely even discuss kink anymore. But I digress.)
She is indeed a good friend. I have female friends like that. Two of them are sex buds, but all of them are good. It's funny, because this afternoon I'm treating one of them to lunch/dinner. We always do this: one of us buys or treats :)
Anywayyyyyssss....
So, I'm a little bit pissed right now.
I don't get upset when sites (like New-Jav) remove my name from the subtitles (although it's not fun or polite), but this may be the first time I've seen someone take my work, eliminate my name, and put their own name as the author. What a scumbag thing to do. I've talked about this type of thing before, so I won't go further into it. I'll let it go and get back to work. I just wanted to vent to my fellow JAV subtitle fans.
Here is the link to the file that some asshole named DevilGrrl edited to make it look like he/she was the actual author. I have no recourse, so I won't waste any more thought or energy on it. Just needed to vent.
You have the right to be pissed. People can be ass****, no doubt.
And before someone says "oh well, that's to be expected..." or whatever: it still doesn't make it right.
As you know, that's why I don't share my full subs anymore unless it's with people I trust.

Again, you have the right to be pissed. Subbing is hard enough, but to actually take someone's ideas and put your name on them... that's plagiarism.
If you were writing a book/novel/thesis and someone lifted an excerpt like that, that person might be legally liable and could be sued.

I certainly use other people's subs and might clean them up... BUT I have NEVER taken credit, and I always give credit to the original author if I can.
 
A very expensive Nvidia gaming GPU (RTX 3080 in my case) is what most people would have that can work well enough for it, but if it's just to do AI stuff, you'd buy a datacenter one like the 3 they mention on the colab page.
This is what I thought. Thank you for confirming it for me :)

If I do decide to use Whisper in the future, I'd rather have it installed.
 
Proper audio for Whisper (no filtering):
"C:\Program Files\ffmpeg\bin\ffmpeg.exe" -i "X:\SomeJAVMovie.mp4" -ar 16000 "X:\SomeJAVMovie.wav"
There's no benefit to using wav instead of just copying the audio, and wav files will be 10-11 times bigger on average, so if you have to upload it, it'll take that much longer.

I'd just do something like this instead, after checking what the audio type is with mediainfo so you can set the extension to the right thing:
Code:
ffmpeg -i BBAN-264.mp4 -vn -acodec copy BBAN-264.aac
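
If you don't have mediainfo handy, ffprobe (it ships with ffmpeg) can report the audio codec too. Below is a rough sketch of that idea, not a finished tool: the `EXTENSIONS` table and all the helper names are my own assumptions, the table only covers a few common codecs, and anything else falls back to .mka.

```python
import subprocess

# Assumed mapping from ffprobe codec names to a sensible file extension.
# Only a few common cases are listed; extend it as needed.
EXTENSIONS = {"aac": "aac", "mp3": "mp3", "ac3": "ac3", "flac": "flac"}


def probe_cmd(src):
    """ffprobe arguments that print only the codec name of the first audio stream."""
    return ["ffprobe", "-v", "error", "-select_streams", "a:0",
            "-show_entries", "stream=codec_name",
            "-of", "default=noprint_wrappers=1:nokey=1", src]


def extension_for(codec):
    """Pick an output extension; .mka (Matroska audio) is the catch-all."""
    return EXTENSIONS.get(codec.strip().lower(), "mka")


def copy_cmd(src, codec):
    """Stream-copy the audio into a file named after the source."""
    stem = src.rsplit(".", 1)[0]
    return ["ffmpeg", "-i", src, "-vn", "-acodec", "copy",
            f"{stem}.{extension_for(codec)}"]


def extract_audio(src):
    """Probe the codec, then copy the audio out (needs ffmpeg and ffprobe on PATH)."""
    codec = subprocess.run(probe_cmd(src), capture_output=True,
                           text=True, check=True).stdout
    subprocess.run(copy_cmd(src, codec), check=True)
```

Calling `extract_audio("BBAN-264.mp4")` on a file with AAC audio would end up running the same command as above.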
 
Code:
ffmpeg -i BBAN-264.mp4 -vn -acodec copy BBAN-264.aac

@SamKook, I'd suggest adding the sampling rate switch too: ffmpeg -i BBAN-264.mp4 -vn -ar 16000 -acodec copy BBAN-264.aac. In the Whisper discussion forum there are a few posts about how sensitive the model is to the sampling rate.
 
New Year present for fans of subtitles: here is my entire archive of subs from iKoa:

EDIT: new link [as the link in the original post was deleted]:



Kudos go to the admin and bot developer of iKoa. I've attached iKoa's official list, which was updated on Nov. 9. However, the archive includes later subs from Nov and Dec as well. There are 8,500+ subs in total.

Happy New Year everyone!

-
 

@SamKook, I'd suggest adding the sampling rate switch too: ffmpeg -i BBAN-264.mp4 -vn -ar 16000 -acodec copy BBAN-264.aac. In the Whisper discussion forum there are a few posts about how sensitive the model is to the sampling rate.
That's not going to do anything. It'll just get ignored and the original sampling rate will be kept, which for the vast majority of audio is 48k or 44.1k (I don't think I've seen anything other than those 2 in a movie or music/audio file unless it's a weird file).
It's almost certainly not going to be 16k from any audio source, so it wouldn't make sense for that to be what's needed, unless Whisper is converting it to that internally.

I haven't done any research on this, and I don't really use Whisper or know its code, so I have no clue about this, but it would be a very odd choice to require 16k on the source for best results.


Edit: So, I checked the code and as I suspected, it's changing the sample rate internally to 16k during the decoding phase so the source sample rate doesn't matter.

Here's the code portion that does it. The "sr" value for ar is a variable set to 16000.
Code:
# This launches a subprocess to decode audio while down-mixing and resampling as necessary.
# Requires the ffmpeg CLI and `ffmpeg-python` package to be installed.
out, _ = (
    ffmpeg.input(file, threads=0)
    .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sr)
    .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
)
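
For anyone who'd rather see that as a plain command line, here's roughly what the call above boils down to, written without the ffmpeg-python package. This is my own sketch, not Whisper's code: `decode_cmd` and `pcm_seconds` are made-up helper names, and 16000 is the "sr" value mentioned above.

```python
SAMPLE_RATE = 16000  # the "sr" value from the excerpt above


def decode_cmd(path, sr=SAMPLE_RATE):
    """ffmpeg arguments for raw 16-bit mono PCM at `sr` Hz, written to stdout."""
    return ["ffmpeg", "-nostdin", "-threads", "0", "-i", path,
            "-f", "s16le", "-ac", "1", "-acodec", "pcm_s16le",
            "-ar", str(sr), "-"]


def pcm_seconds(num_bytes, sr=SAMPLE_RATE):
    """Duration of mono s16le audio: 2 bytes per sample, `sr` samples per second."""
    return num_bytes / (2 * sr)
```

Running it would look something like `out = subprocess.run(decode_cmd("movie.aac"), capture_output=True, check=True).stdout`, and `pcm_seconds(len(out))` then gives the length in seconds, whatever the sample rate of the source file was.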

Edit2: Here are the output file sizes for the test movie I used, to give you an idea of why resampling it manually is just wasting space most of the time:
Original aac (192kbps): 249.35MB
Uncompressed wav: 1.92GB
Mei2/quay2 16k (stereo): 655.96MB
Whisper 16k (mono): 327.98MB

So unless you start with a lightly compressed audio file that's above 256kbps (which is what the Whisper processing ends up being), you just end up with a bigger file. You also want to avoid an extra lossy compression step so you don't degrade the audio quality, which means you have to use something like wav for the output if you process the original.
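
The 256kbps figure is just arithmetic on uncompressed PCM: sample rate times bits per sample times channels. Here's a quick sketch to check the numbers, assuming 16-bit samples (which is what the pcm_s16le in the excerpt means); `pcm_kbps` is a name I made up.

```python
def pcm_kbps(sample_rate, channels, bits=16):
    """Bitrate of uncompressed PCM in kbps (1 kbps = 1000 bits per second)."""
    return sample_rate * bits * channels / 1000


print(pcm_kbps(16000, 1))  # 256.0, Whisper's internal 16k mono
print(pcm_kbps(16000, 2))  # 512.0, a 16k stereo wav
print(pcm_kbps(48000, 2))  # 1536.0, a typical 48k stereo wav
```

16k stereo is exactly twice 16k mono, which matches the 655.96MB vs 327.98MB above, and a 192kbps aac source is already below the 256kbps that Whisper ends up working with.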
 
ROE-116 Mother, Son And Nephew. Abnormal Triangle Relationship Incest Rieko Hiraoka Competing For Married Woman Rieko With Jealous Meat Stick

roe116pl.jpg


Translated from the Chinese version, with most of the lines cleaned up. Enjoy, and happy new year everyone!
 


And this is why the site appreciates you, @SamKook.
Always going the extra mile to help us.
 
The web version of Whisper is severely limited in GPU utilization time. It's been about 36 hours since my initial success, and it still won't let me do it again.

Wondering if extracting a separate audio file from the video file would reduce GPU utilization. Going to try it.
 
There's no benefit to using wav instead of just copying the audio and wav files will be 10-11 times bigger on average so that means if you have to upload it, it'll take that much longer.

I'd just do something like this instead after checking what the audio type is with mediainfo to set the extension to the right thing:
Code:
ffmpeg -i BBAN-264.mp4 -vn -acodec copy BBAN-264.aac
This is for doing the Whisper part yourself, since online doesn't work anywhere for long audio, and barely works for short audio. And Whisper only accepts 16kHz, so just copying the audio won't work.
 
New Year present for fans of subtitles: here is my entire archive of subs from iKoa:


Kudos goes to admin and bot developer of iKoa. I've attached iKoa's official list which was updated on Nov. 9. However, the archive includes later subs from Nov and Dec as well. There are 8,500+ subs in total.

Happy New Year everyone!

-
Thanks. The archive seems to be missing some subs. The first thing I checked was the MIGD subs, and only one of the three was in the archive.
 
This is for doing the Whisper part yourself, since online doesn't work anywhere for long audio, and barely works for short audio. And Whisper only accepts 16kHz, so just copying the audio won't work.

No, it does work. I tested this on the colab with a 3 hour 48kHz audio file and it worked perfectly.

The Whisper code will convert it to 16kHz internally, and that won't change whether it's local or remote.
 
Thanks. The archive seems to be missing some subs. The first thing I checked was the MIGD subs, and only one of the three was in the archive.

If you send me the missing ones I'll try to fetch them and add them in.
 
In case there were any doubts that something other than 16kHz worked with the online version, I did a test just now with a 3h52m video:
Colab_4h_test.jpg
I also included the resulting srt sub with no editing.
 
