Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

hury

New Member
Aug 5, 2015
9
15
I've had no luck specifying --suppress_tokens as just comma-separated words; whisper just refuses to run. I think the words have to be "tokenized" via some other process and turned into integers. If anyone has better luck getting that working, I would love to know how to do it. That said, I did have some success using --initial_prompt: my transcripts are ever so slightly naughtier.

Code:
--initial_prompt 'cock pussy dick "deep throat" throbbing throb shaking cum come slut slick coming throat wet good very load balls lick ikku kimochi clitoris clit orgasm suck tits boobs breasts huge masturbate "fuck buddy" zubozubo hard binbin cunt fuck fucking ass crack lips' \
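On the --suppress_tokens question: as far as I know, the option expects a comma-separated list of integer token IDs, not words. Below is a minimal sketch of turning phrases into that string. `suppress_tokens_value` and `whisper_encode` are hypothetical helper names, and `whisper_encode` assumes the openai-whisper Python package is installed and that its tokenizer exposes `get_tokenizer(...).encode(...)`.

```python
from typing import Callable, Iterable, List


def suppress_tokens_value(phrases: Iterable[str],
                          encode: Callable[[str], List[int]]) -> str:
    """Build a comma-separated string of token IDs for the given phrases.

    `encode` is any function mapping text to a list of integer token IDs.
    """
    ids = set()
    for phrase in phrases:
        # BPE-style vocabularies usually assign different IDs to "word"
        # and " word", so both variants are encoded.
        for variant in (phrase, " " + phrase):
            ids.update(encode(variant))
    return ",".join(str(i) for i in sorted(ids))


def whisper_encode(text: str) -> List[int]:
    """Assumption: openai-whisper is installed; imported lazily on purpose."""
    from whisper.tokenizer import get_tokenizer
    return get_tokenizer(multilingual=True).encode(text)


# Example (needs openai-whisper):
# print(suppress_tokens_value(["(music)", "(applause)"], whisper_encode))
```

One caveat: a phrase is often split into several sub-word tokens, and suppressing all of them can also block unrelated words that share those pieces, so it is worth inspecting the IDs before passing them on.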
Have you also tested with Japanese text, or with Japanese only (like なめて ちんぽ やりまん ぬれて)? I wonder if that would work better since it's transcribing, or is this going over my head and English is the way to go?

I too am using a program to translate with DeepL (deepl-srt), and it recently broke, likely due to the same changes to the DeepL site. I tried to make the suggested changes to the program but couldn't find the lines; perhaps it's a different program?
Either way, would you mind posting the corrected version of your program so we can all use it?

Thanks.
Not OP. I worked on it a lot with ChatGPT, and after a lot of issues I got it to work perfectly once; then on the next subtitle the result broke, so "my" version is on hold. ChatGPT + Bing will probably solve it, since it can use the Internet and pull in other things from GitHub that might help.

In the meantime, I'm using this script that I got ChatGPT to write.

It extracts the dialogue from the subtitle file (it only really supports .srt because of the way it extracts it) into its own text file, line for line. Then I just copy/paste the dialogue into DeepL and paste the translation into its own document. (I also use another script that replaces certain words and saves the result into another text file.) Then I tell the script which file to use as the translation, and it creates a new .srt file with the original timings and the translated dialogue.

It might be a bit cumbersome with the number of extra files, but it gives you complete control. And it's an easy script, so it should be easy to change it to work how you want (or get ChatGPT to do it :)).
I use Notepad++, so it's simple to see the length (or the line number where you have to continue if you went over 5000 characters), and it has the "Compare" plugin, which I highly recommend.
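For anyone who doesn't want to download the zip, here is a minimal sketch of the same extract/reinsert idea. It is not the attached script; the function names are made up for illustration, and it assumes a well-formed .srt with one blank line between cues.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def parse_srt(text):
    """Split SRT text into (index, timing, dialogue_lines) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) >= 2 and TIMING.match(lines[1].strip()):
            cues.append((lines[0].strip(), lines[1].strip(), lines[2:]))
    return cues


def extract_dialogue(srt_text):
    """Return one line of dialogue per cue (multi-line cues are joined)."""
    return "\n".join(" ".join(dialogue) for _, _, dialogue in parse_srt(srt_text))


def insert_translation(srt_text, translated_text):
    """Rebuild the SRT with the original timings and translated lines."""
    cues = parse_srt(srt_text)
    translated = translated_text.splitlines()
    if len(translated) != len(cues):
        raise ValueError(
            f"{len(cues)} cues but {len(translated)} translated lines"
        )
    blocks = [
        f"{index}\n{timing}\n{line}\n"
        for (index, timing, _), line in zip(cues, translated)
    ]
    return "\n".join(blocks)
```

Joining multi-line cues into one line keeps the "one line per cue" count stable through the copy/paste step, at the cost of losing the original line breaks. The line-count check is there because a translator that merges or drops a line would otherwise shift every following subtitle.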
 

Attachments

  • dialogue_extract_insert.zip
    682 bytes · Views: 96
  • Capture.jpg
    Capture.jpg
    69.4 KB · Views: 58

Zephlol

Member
Oct 2, 2022
20
36
It would take more time to check and fix auto-generated subtitles than it would to create them from scratch.
Again, I’m only interested in producing high quality fansubs. I don’t need AI to do so, but I do need human volunteers to divide the work.
I thought you were only looking for volunteers to do subtitle timing? That's what those methods are for. AI sub is shit, but their timing is perfect
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,755
5,154
AI sub is shit, but their timing is perfect

Their timing is shit too, or at least Whisper's is; I haven't looked at too many others.

Usually you want just a little bit of lead-in, you respect scene changes as much as possible, and you don't display the subs for too long or too short a time. What Whisper does is pretty much display all subs for the same amount of time, mostly starting at the same point within the second, and hold that pattern (it does stray a bit over time).
It might seem fine if you're not picky, but after timing anime for years I personally can't stand it; I'll notice even if it's just one frame off.
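To make the lead-in and duration rules concrete, here is a rough sketch of a per-line adjustment. The function name and the default numbers (100 ms lead-in, 1 to 7 seconds on screen) are placeholders chosen for illustration, not recommended values, and scene changes are not handled because that needs the video itself.

```python
def adjust_cue(start_ms, end_ms, next_start_ms=None,
               lead_in_ms=100, min_ms=1000, max_ms=7000):
    """Return (start, end) in milliseconds after applying simple timing rules.

    All default values are illustrative assumptions.
    """
    # Show the line slightly before the speech starts, never before 0.
    start = max(0, start_ms - lead_in_ms)
    # Keep the on-screen time between the minimum and maximum.
    duration = min(max(end_ms - start, min_ms), max_ms)
    end = start + duration
    # Do not run into the next line if its start time is known.
    if next_start_ms is not None and end > next_start_ms:
        end = max(next_start_ms, start)
    return start, end
```

For example, a 300 ms line starting at 5000 ms gets pulled forward to 4900 ms and stretched to a full second, unless the next line starts earlier than that.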
 

panop857

Active Member
Sep 11, 2011
171
242
For some reason many subs generated via Whisper will have a bunch of subtitles start exactly at 30 seconds, and totally rushed through and offset. Most of the time the timing is great, but I've seen this initial rush a few times and I can't tell how/why it is happening like this.

For the most part, the timings are pretty good compared to most JAV subtitles I've seen. They aren't anime quality, but for stuff that is automatically generated they seem pretty good.

I think I'm in a spot where I can write a Whisper intro thread, but I think I need to understand logprob_threshold. I understand the no_speech_threshold mechanic just fine, but logprob_threshold I do not have a good sense for. Also, compression_ratio_threshold is confusing as well and I don't know when to adjust it higher or lower than the default 2.4.

For condition_on_previous_text I think I have settled on just setting it to False and keeping it there. You'll get scattered lines from totally out of left field, but they can be deleted or replaced in the edit. The upside is that it gets stuck in loops way less often and is more willing to resort to descriptive emotes such as:
(moaning)
(crying)
(Heavy breathing)
(gagging)
*panting*

When you condition on previous text, it will fit to a specific style of transcription. With it set to False, if it sounds like a woman is gagging on something, it will draw on some subtitle in those 680k hours of training data where a woman sounds like she's gagging and use the corresponding subtitle, which noted (gagging).
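On the two thresholds: as I understand the reference implementation, a decoded segment is treated as failed (and retried at a higher temperature) when its text compresses too well, which is the looping case, or when its average token log probability is too low. The sketch below mirrors that check in simplified form; `looks_failed` is a hypothetical name, and the defaults of 2.4 and -1.0 are the usual CLI defaults.

```python
import zlib


def compression_ratio(text: str) -> float:
    """UTF-8 size of the text divided by its zlib-compressed size."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def looks_failed(text, avg_logprob,
                 compression_ratio_threshold=2.4, logprob_threshold=-1.0):
    """Simplified version of the fallback check (an assumption, not the real code)."""
    too_repetitive = compression_ratio(text) > compression_ratio_threshold
    too_uncertain = avg_logprob < logprob_threshold
    return too_repetitive or too_uncertain


looped = "Yes, yes, yes. " * 40
varied = "Usually you want a little lead-in, and you respect scene changes where you can."
print(compression_ratio(looped), compression_ratio(varied))
```

Read this way, lowering compression_ratio_threshold rejects repetitive output more aggressively, and raising logprob_threshold toward 0 rejects low-confidence output more aggressively. Both changes mean more retries and more dropped segments.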
 
Zephlol

Member
Oct 2, 2022
20
36
Their timing is shit too, or at least Whisper's is; I haven't looked at too many others.

Usually you want just a little bit of lead-in, you respect scene changes as much as possible, and you don't display the subs for too long or too short a time. What Whisper does is pretty much display all subs for the same amount of time, mostly starting at the same point within the second, and hold that pattern (it does stray a bit over time).
It might seem fine if you're not picky, but after timing anime for years I personally can't stand it; I'll notice even if it's just one frame off.
Haven't used Whisper yet, but I've made 50+ .srt files using Adobe Premiere / YouTube as the starting template for timings, and it has been bang on every time, with the occasional line or two staying on for too long. Adobe has been awesome at capturing dialogue while ignoring white noise. YouTube not so much, as it produces a lot of subtitles like (music) or (applause), but they can be easily deleted.
 

Prinsipe

Member
Aug 31, 2013
58
19
Haven't used Whisper yet, but I've made 50+ .srt files using Adobe Premiere / YouTube as the starting template for timings, and it has been bang on every time, with the occasional line or two staying on for too long. Adobe has been awesome at capturing dialogue while ignoring white noise. YouTube not so much, as it produces a lot of subtitles like (music) or (applause), but they can be easily deleted.
Can you post an example of an Adobe Premiere transcription of Japanese audio that has already been translated into English? I think you should use Whisper too, to compare the results, because Whisper is by far the best at machine transcription, in my opinion.

If Adobe Premiere produces transcriptions comparable to Whisper's, then it will be another application that helps JAV subbers make their subtitles more accurate.

Thank you very much. :)
 

Zephlol

Member
Oct 2, 2022
20
36
Can you post an example of an Adobe Premiere transcription of Japanese audio that has already been translated into English? I think you should use Whisper too, to compare the results, because Whisper is by far the best at machine transcription, in my opinion.

If Adobe Premiere produces transcriptions comparable to Whisper's, then it will be another application that helps JAV subbers make their subtitles more accurate.

Thank you very much. :)
This is the .srt for RKI-606. It's my most recent completed raw .srt. I ran the video through Premiere with Japanese detection and translated it to English using Subtitle Edit. Zero touch-up. Can someone with Whisper run the same video and post it here? I'm curious how they compare.
Link to the video
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,755
5,154
Can someone with Whisper run the same video and post it here? I'm curious how they compare.
Link to the video
Did you use the 4.8GB download or the 1GB or so stream to make the subs?

It could make a difference and I'm unable to get the downloaded version since they don't provide free links. I am downloading the stream though.
 

Zephlol

Member
Oct 2, 2022
20
36
Did you use the 4.8GB download or the 1GB or so stream to make the subs?

It could make a difference and I'm unable to get the downloaded version since they don't provide free links. I am downloading the stream though.
The 4.8 GB one. Are you going to run it through Whisper? If so, wait a bit until I get home and I can upload the 4.8 GB version to Google Drive for a better comparison.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,755
5,154
The 4.8 GB one. Are you going to run it through Whisper? If so, wait a bit until I get home and I can upload the 4.8 GB version to Google Drive for a better comparison.
It will probably take me roughly 24h until I can do that since I'm going to sleep soon and then work, but if nobody else does it, I will. I found a torrent for the 4.8GB version and started it so I might be good to get it, shows at least one seed so good sign.


On a slightly different note, I found the old timing tutorial I made to teach new members back when I was doing anime fansubbing so if anyone is interested, I made a post in the tutorial section with it unedited so some stuff might be irrelevant or outdated but the principle is still good: https://www.akiba-online.com/thread...ike-an-anime-fansubber-using-aegisub.2114315/
 

Zephlol

Member
Oct 2, 2022
20
36
It will probably take me roughly 24h until I can do that since I'm going to sleep soon and then work, but if nobody else does it, I will. I found a torrent for the 4.8GB version and started it so I might be good to get it, shows at least one seed so good sign.


On a slightly different note, I found the old timing tutorial I made to teach new members back when I was doing anime fansubbing so if anyone is interested, I made a post in the tutorial section with it unedited so some stuff might be irrelevant or outdated but the principle is still good: https://www.akiba-online.com/thread...ike-an-anime-fansubber-using-aegisub.2114315/
I'm in no hurry. I'll upload it to this post when I get home. It will be a lot faster than the torrent, I reckon.

edit: Google Drive for RKI-606. I'll keep the link up for a few days only.
 

mei2

Well-Known Member
Dec 6, 2018
246
406
It is that time of the month: Queen Minami time :)



IPX-998 Teacher...Can I Stay The Night? During The Training Period, I, Who Lives In A Hotel, Was Forced By A Student To Share A Room With Me. Minami Aizawa





Like many of you here, I have been experimenting with various Whisper workflows and parameters, and I think I have come up with something that creates decent subs. If anyone here speaks Japanese, please take a look at this one and review it; any spot check of accuracy and quality will be helpful. I appreciate it. I plan to write up my workflow once I am more sure of the quality.

Thanks in advance.

PS. How does one link Javlibrary here with large screenshots?
 

Attachments

  • IPX-998.en Minami Aizawa.srt.zip
    26.7 KB · Views: 279
  • ipx998pl.jpg
    ipx998pl.jpg
    93 KB · Views: 72

Zephlol

Member
Oct 2, 2022
20
36
ipx998pl.jpg


Don't make it an attachment. Copy and paste the cover as is.

On a side note, I don't know how you can stand the choosy-beggar requests at ScanLover; they bug me a lot.
 

mei2

Well-Known Member
Dec 6, 2018
246
406
On a side note, I don't know how you can stand the choosy-beggar requests at ScanLover; they bug me a lot.


Yeah, it gets annoying sometimes. There are quite a few of them there, aren't there? But there are also a few genuine contributors.
 

superman4207

JAV Perv Enthusiast
Jul 4, 2022
49
88
I'm in no hurry. I'll upload it to this post when I get home. It will be a lot faster than the torrent, I reckon.

edit: Google Drive for RKI-606. I'll keep the link up for a few days only.
@Zephlol @SamKook Hey guys, I went ahead and ran RKI-606 (the version from Gdrive) through Whisper. Here's the Whisper version of the sub so you can compare and contrast, Zephlol.

Anything for SCIENCE (and JAV)!
 

Attachments

  • [RKI-606].zip
    10 KB · Views: 186

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,755
5,154
Don't make it an attachment. Copy and paste the cover as is.

Don't hotlink pictures; do add them as attachments on the forum instead. Otherwise we end up attracting unwanted attention, since hotlinking steals bandwidth from those other websites, or the images end up as dead links down the road if that website dies or blocks hotlinking.

When you hover over your attached pictures, there are 2 options for inserting them into your post. You can simply choose full size (or something like that, I'm not 100% sure of the term from memory) and it'll add the picture full size where the cursor is in your post.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,755
5,154
For some reason many subs generated via Whisper will have a bunch of subtitles start exactly at 30 seconds, and totally rushed through and offset. Most of the time the timing is great, but I've seen this initial rush a few times and I can't tell how/why it is happening like this.

For the most part, the timings are pretty good compared to most JAV subtitles I've seen. They aren't anime quality, but for stuff that is automatically generated they seem pretty good.

@Zephlol @SamKook Hey guys, I went ahead and ran RKI-606 (the version from Gdrive) through Whisper. Here's the Whisper version of the sub so you can compare and contrast, Zephlol.

Anything for SCIENCE (and JAV)!

This is the .srt for RKI-606. It's my most recent completed raw .srt. I ran the video through Premiere with Japanese detection and translated it to English using Subtitle Edit. Zero touch-up. Can someone with Whisper run the same video and post it here? I'm curious how they compare.
Link to the video

If you look at the subs Whisper generated for RKI-606, you'll see that they almost always last for exactly a whole number of seconds (the milliseconds at the beginning and end of a line are almost always the same), and the starting millisecond stays the same across many lines until it changes, and then that repeats.

There's no way that's even remotely accurate to what's being said in any video. Having that happen just once would be a rarity, but it's happening constantly all over the file; that's why I call the timing shit. It's in the right general spot, but it's not very good at all.

If you look at Zephlol's .srt made with Premiere, you can see the milliseconds are pretty much never the same for the beginning and end of a line, and the next line pretty much never starts at the same millisecond either, which is what you'd expect from normal subtitles.
I haven't looked closely enough at it yet to say whether it's good or not, but at first glance it should be much better than Whisper's at least.
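If anyone wants to check their own files for this pattern, here is a small sketch that counts how often a line starts and ends on the same millisecond value and which starting millisecond is most common. The function name and the returned keys are made up for illustration; it only looks at the timing lines of an .srt.

```python
import re
from collections import Counter

# Captures the millisecond fields of the start and end timestamps.
TIMING = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.](\d{3})\s*-->\s*\d{2}:\d{2}:\d{2}[,.](\d{3})"
)


def millisecond_stats(srt_text):
    """Summarise how 'grid-like' the timings in an SRT file are."""
    start_counts = Counter()
    total = 0
    whole_second = 0
    for match in TIMING.finditer(srt_text):
        start_ms, end_ms = match.group(1), match.group(2)
        total += 1
        start_counts[start_ms] += 1
        # Same millisecond field at both ends means the line lasts
        # an exact whole number of seconds.
        if start_ms == end_ms:
            whole_second += 1
    if total == 0:
        return {"cues": 0, "whole_second_share": 0.0, "top_start_ms": None}
    return {
        "cues": total,
        "whole_second_share": whole_second / total,
        "top_start_ms": start_counts.most_common(1)[0],
    }
```

A hand-timed file should show a low whole-second share and no dominant starting millisecond; a file with the pattern described above should show the opposite.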
 