Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

Imscully

Well-Known Member
Apr 1, 2014
359
636
ROE-116 Mother, Son And Nephew. Abnormal Triangle Relationship Incest Rieko Hiraoka Competing For Married Woman Rieko With Jealous Meat Stick

roe116pl.jpg


Translated from the Chinese version, with most of the lines cleaned up. Enjoy, and happy New Year, everyone!
Thanks for this. Although she has tiny boobs (I'm the type of guy who normally believes a D-cup is a starter kit), Rieko's expressive eyes and seemingly submissive nature really make her hot.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
This is for doing the Whisper part yourself, since online doesn't work anywhere for long audio, and barely works for short audio. And Whisper only accepts 16kHz, so just copying the audio won't work.

No, it does work. I tested this on the Colab with a 3-hour 48kHz audio file and it worked perfectly.

The Whisper code converts the audio to 16kHz internally, and that doesn't change whether it runs locally or remotely.
Edit: I checked the code and, as I suspected, it changes the sample rate to 16kHz internally during the decoding phase, so the source sample rate doesn't matter.

Here's the portion of the code that does it. The "sr" value passed as "ar" (ffmpeg's output sample rate option) is a variable set to 16000.
Code:
# This launches a subprocess to decode audio while down-mixing and resampling as necessary.
# Requires the ffmpeg CLI and `ffmpeg-python` package to be installed.
out, _ = (
    ffmpeg.input(file, threads=0)
    .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sr)
    .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
)
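
For anyone who wants to see the same decode step without the `ffmpeg-python` package, here is a rough sketch using only the standard library. It is not Whisper's actual code: the function names are made up for illustration, and it assumes the ffmpeg CLI is on your PATH. It only mirrors the options shown in the snippet above (raw s16le output, mono, resampled to 16000).

```python
import subprocess

SAMPLE_RATE = 16000  # same value the Whisper snippet passes as "ar"


def build_decode_command(path, sample_rate=SAMPLE_RATE):
    """Return an ffmpeg command that writes mono 16-bit PCM to stdout.

    Hypothetical helper, not part of Whisper.
    """
    return [
        "ffmpeg", "-nostdin",
        "-i", path,
        "-f", "s16le",            # raw signed 16-bit little-endian samples
        "-acodec", "pcm_s16le",
        "-ac", "1",               # down-mix to mono
        "-ar", str(sample_rate),  # resample, whatever the source rate is
        "-",                      # write to stdout instead of a file
    ]


def decode_to_pcm(path, sample_rate=SAMPLE_RATE):
    """Run ffmpeg and return the raw PCM bytes (needs ffmpeg installed)."""
    result = subprocess.run(
        build_decode_command(path, sample_rate),
        capture_output=True,
        check=True,  # raise CalledProcessError if ffmpeg fails
    )
    return result.stdout
```

The point is that the resampling happens in the ffmpeg call itself, so a 44.1kHz or 48kHz source goes through the same path as a 16kHz one.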

Edit2: Here's the size of the output files for the test movie I used to give you an idea of why resampling it manually is just wasting space most of the time:
Original aac(192kbps): 249.35MB
Uncompressed wav: 1.92GB
Mei2/quay2 16k(stereo): 655.96MB
Whisper 16k(mono): 327.98MB

So unless you start with an audio file whose bitrate is above 256kbps (that's what the Whisper processing ends up being: 16kHz x 16-bit x 1 channel), resampling it yourself just gives you a bigger file. You also want to avoid an extra lossy compression step so you don't degrade the audio quality, which means you have to use something like wav for the output if you process the original.
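
The 256kbps figure and the file sizes above follow from simple PCM arithmetic. A small sketch, assuming 16-bit samples throughout and a source that is roughly three hours long (the exact duration of the test movie isn't given, so the 3-hour value below is only a stand-in):

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample=16, channels=1):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def pcm_size_bytes(duration_s, sample_rate_hz, bits_per_sample=16, channels=1):
    """Size of the raw sample data, ignoring the small WAV header."""
    return duration_s * pcm_bitrate(sample_rate_hz, bits_per_sample, channels) // 8


if __name__ == "__main__":
    duration = 3 * 60 * 60  # assumed: about 3 hours, in seconds
    for label, rate, channels in [
        ("Whisper 16k mono", 16000, 1),
        ("16k stereo", 16000, 2),
        ("48k stereo wav", 48000, 2),
    ]:
        size_mib = pcm_size_bytes(duration, rate, channels=channels) / 2**20
        print(f"{label}: {pcm_bitrate(rate, channels=channels) // 1000} kbps, "
              f"{size_mib:.0f} MiB")
```

The results land close to the sizes listed above, and the stereo 16k file is exactly double the mono one, which is why down-mixing matters.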
 
Reactions: Taako and Imscully

mei2

Well-Known Member
Dec 6, 2018
246
405
Thanks. The archive seems to be missing some subs. The first ones I checked were the MIGD subs, and only one of the three was in the archive.

If you send me the missing ones I'll try to fetch them and add them in.
 

SamKook

In case there were any doubts that something other than 16kHz worked with the online version, I did a test just now with a 3h52m video:
Colab_4h_test.jpg
I also included the resulting srt sub with no editing.
 

Attachments

  • BBAN-313.zip
    34.3 KB · Views: 180
Reactions: Taako

quay2

Active Member
Nov 28, 2009
96
177
No, it does work. I tested this on the Colab with a 3-hour 48kHz audio file and it worked perfectly.

The Whisper code converts the audio to 16kHz internally, and that doesn't change whether it runs locally or remotely.
Can you send the URL of the service where you can do long subs online?
Also, Whisper.cpp gives an error message if you give it non-16kHz audio, but maybe the Python version is different?
 

SamKook

Can you send the URL of the service where you can do long subs online?
I purposefully included the address bar in the screenshot in my post above yours so people would know, but here it is anyway. It's the WhisperWithVAD link people keep posting here: https://colab.research.google.com/github/ANonEntity/WhisperWithVAD/blob/main/WhisperWithVAD.ipynb

Also, Whisper.cpp gives an error message if you give it non-16kHz audio, but maybe the Python version is different?
And yeah, that seems to be the case, looking at the GitHub page for the cpp port someone made. It probably doesn't yet include the internal ffmpeg processing that the Python version has:
Code:
Note that the main example currently runs only with 16-bit WAV files, so make sure to convert your input before running the tool. For example, you can use ffmpeg like this:
ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
You should also make it mono (-ac 1), unlike your previous example, or else you're getting a file that's double the size for nothing.
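
If you want to check a wav before handing it to the cpp port, Python's built-in `wave` module can read the header. This is only a sketch: the function name is made up, and treating stereo as a "problem" reflects the size advice above rather than a documented requirement of the cpp port.

```python
import wave


def wav_format_problems(path, rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches; an empty list means the file matches.

    Hypothetical helper. sample_width is in bytes, so 2 means 16-bit.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate is {wav.getframerate()}, expected {rate}")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channels, expected {channels}")
        if wav.getsampwidth() != sample_width:
            problems.append(
                f"{8 * wav.getsampwidth()}-bit samples, expected {8 * sample_width}-bit"
            )
    return problems
```

Anything it reports can be fixed with the ffmpeg command quoted above.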
 
Reactions: Taako and quay2

theydonotwantto

Active Member
Aug 10, 2011
227
181
New Year present for fans of subtitles: here is my entire archive of subs from iKoa:


Kudos goes to admin and bot developer of iKoa. I've attached iKoa's official list which was updated on Nov. 9. However, the archive includes later subs from Nov and Dec as well. There are 8,500+ subs in total.

Happy New Year everyone!

-
The file has been deleted. Please upload it again.
 

Electromog

Akiba Citizen
Dec 7, 2009
4,634
2,849
I tried Whisper, but after several minutes of doing something (there was a progress bar and everything) it stopped with a "file not found" error, without saying which file it couldn't find. I can't imagine it was the audio file, because it had been working on *something* for several minutes.

I guess I will try again when at some point I get a new gaming computer as it will have a more powerful GPU with more memory.
 

kadal123

New Member
Jan 20, 2020
8
7
New Year present for fans of subtitles: here is my entire archive of subs from iKoa:


Kudos goes to admin and bot developer of iKoa. I've attached iKoa's official list which was updated on Nov. 9. However, the archive includes later subs from Nov and Dec as well. There are 8,500+ subs in total.

Happy New Year everyone!

-
Please reupload, mate. The link has expired...
 
Reactions: Imscully

SamKook

I tried Whisper, but after several minutes of doing something (there was a progress bar and everything) it stopped with a "file not found" error, without saying which file it couldn't find. I can't imagine it was the audio file, because it had been working on *something* for several minutes.

I guess I will try again when at some point I get a new gaming computer as it will have a more powerful GPU with more memory.

Can't help you with so little information.

When you first use a model, it has to download it and it's a pretty big file so that's probably what it was working on for those several minutes you saw.
 

Aikayikes!

Well-Known Member
Apr 26, 2020
93
277
Can't help you with so little information.

When you first use a model, it has to download it and it's a pretty big file so that's probably what it was working on for those several minutes you saw.
Well I finally figured out what my problem was for the sub to stop partway through the audio file. I feel really dumb, but will tell everyone anyway - maybe someone else won't do the same thing.

I simply hadn't waited long enough for the file to upload. I thought that as soon as I saw it up top it was good. Nope!! I just completed BBAN-132 and got 2700 lines from it. Whisper is an amazing tool! Thanks everyone for your patience trying to help me.
 
Reactions: Prinsipe and mei2

maelstrom9999

Well-Known Member
Apr 26, 2022
480
417
Well I finally figured out what my problem was for the sub to stop partway through the audio file. I feel really dumb, but will tell everyone anyway - maybe someone else won't do the same thing.

I simply hadn't waited long enough for the file to upload. I thought that as soon as I saw it up top it was good. Nope!! I just completed BBAN-132 and got 2700 lines from it. Whisper is an amazing tool! Thanks everyone for your patience trying to help me.

Yes, you must wait for it to fully upload to 100%. Did you use the full video file or did you extract and save a separate audio file?
 

Aikayikes!

Yes, you must wait for it to fully upload to 100%. Did you use the full video file or did you extract and save a separate audio file?
Converted the video to audio. I'm not sure which would be better - WAV or MP3. I haven't reviewed it all the way, but I don't see a lot of double-ups in the srt after a quick look-through.
 

SamKook

It would be best to extract/demux the original audio rather than convert it. There are many tools and tutorials about it on the net; search for the term "demux" plus the extension of your video file.

It has also been discussed many times in the last few pages here.
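
As one example of demuxing, ffmpeg can copy the audio stream out of the container without re-encoding it. A sketch, assuming ffmpeg is installed; the function and file names are made up, and the output extension has to suit the codec already in the video (for instance .m4a for AAC audio).

```python
import subprocess


def build_demux_command(video_path, audio_path):
    """Return an ffmpeg command that copies the audio stream untouched.

    Hypothetical helper: "-vn" drops the video and "-c:a copy" keeps the
    original audio bytes, so no extra lossy step is introduced.
    """
    return [
        "ffmpeg", "-nostdin",
        "-i", video_path,
        "-vn",
        "-c:a", "copy",
        audio_path,
    ]


def demux_audio(video_path, audio_path):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_demux_command(video_path, audio_path), check=True)


if __name__ == "__main__":
    # Placeholder file names, printed only so you can see the command line.
    print(" ".join(build_demux_command("movie.mp4", "movie.m4a")))
```

Because nothing is re-encoded, this finishes quickly and the resulting file is about the size of the original audio track.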
 
Reactions: Taako

maelstrom9999

Converted the video to audio. I'm not sure which would be better - WAV or MP3. I haven't reviewed it all the way, but I don't see a lot of double-ups in the srt after a quick look-through.

Yeah, I discovered that since it counts download time toward your "GPU utilization limit", it times me out and then makes me wait 24 hours almost every time I try to use the full video file. Now that I'm using a separate audio file, I think I can do two per day. I just use VLC to extract it.
 

Chuckie100

Well-Known Member
Sep 13, 2019
710
2,768
Happy New Year all!

I'm still struggling with Whisper. I did get the Colab version to run (I would prefer a standalone version, but I'm not a coder), but the on-screen transcription was in Japanese (I think), not English. About 30 minutes into the movie's audio file, the transcription kept repeating (I think I should have specified Large instead of Medium), and I couldn't figure out how to download the output. However, there were no error messages. I'm hoping some of you Whisper gurus might have an answer to these problems. My apologies for annoying the coders out there. Anyway, any help would be appreciated.
 
Reactions: Taako

quay2

Happy New Year all!

I'm still struggling with Whisper. I did get the Colab version to run (I would prefer a standalone version, but I'm not a coder), but the on-screen transcription was in Japanese (I think), not English. About 30 minutes into the movie's audio file, the transcription kept repeating (I think I should have specified Large instead of Medium), and I couldn't figure out how to download the output. However, there were no error messages. I'm hoping some of you Whisper gurus might have an answer to these problems. My apologies for annoying the coders out there. Anyway, any help would be appreciated.
The repetition problem is well known. Apparently using the VAD helps, but that makes things more complicated. One way to circumvent it would be to use Subtitle Edit and tell it to split up the audio where it thinks sentences begin, but that gave me different problems.
To get the more-or-less standalone version, you can download it via Subtitle Edit. I recommend replacing the files with the beta version: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.10/SubtitleEditBeta.zip

This file constantly changes to include the latest bug fixes, btw; the URL stays the same. You can also replace the main.exe with the latest AVX2 version here: https://github.com/SubtitleEdit/support-files/tree/master/whisper
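
If you already have an srt where the same line repeats over and over, one crude workaround is to clean it up afterwards. The sketch below is not a Whisper or Subtitle Edit feature, just an illustration with made-up function names: it keeps at most `max_run` consecutive cues with identical text and renumbers the rest. It won't recover the dialogue that was lost during the repetition.

```python
import re


def parse_srt(text):
    """Split SRT text into (timing_line, caption_text) pairs."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) >= 3:  # index, timing line, at least one caption line
            cues.append((lines[1], "\n".join(lines[2:])))
    return cues


def drop_repeats(cues, max_run=2):
    """Keep at most `max_run` consecutive cues that share the same text."""
    kept, run = [], 0
    for timing, caption in cues:
        # Count how long the current streak of identical captions is.
        run = run + 1 if kept and caption == kept[-1][1] else 1
        if run <= max_run:
            kept.append((timing, caption))
    return kept


def format_srt(cues):
    """Write the cues back out with fresh index numbers."""
    blocks = (
        f"{number}\n{timing}\n{caption}"
        for number, (timing, caption) in enumerate(cues, 1)
    )
    return "\n\n".join(blocks) + "\n"
```

Typical use would be `format_srt(drop_repeats(parse_srt(text)))` on the contents of the file, after checking by eye that the repeated lines really are junk.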
 
Reactions: mei2 and Chuckie100