Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
You can test it yourself by running whisper twice on the same file, once with translation and once without.
 

porgate55555

Active Member
Jul 24, 2021
51
163
You can test it yourself by running whisper twice on the same file, once with translation and once without.
This will not work, as Whisper generates different subtitles every time you run it, so there is no one-to-one comparison unless you get the Japanese version with no translation and the translated one from the same run.
 

mei2

Well-Known Member
Dec 6, 2018
246
405
I know how it works; the question is whether you get better results using Whisper or DeepL.


The good things with Whisper end-to-end translation are:
(a) It uses context for translation. It tries to build a context, for example guessing gender (he, she) and punctuation, for the translation task.
(b) It makes the entire Whisper run faster. The translate task is faster than the transcribe task. It is funny, but their main software engineer was saying that, the way the algorithm is written, the end-to-end translation task is performed faster than just the transcribe task :)
The good thing with DeepL is that it is just a better translator. Full stop. One bad thing with DeepL is that it often mixes up he/she, it/they, sir/ma'am.

For me, I decided to just stick with DeepL. I did some comparisons during the early days of Whisper (v1). I haven't done any comparison with v2, but I understand that the translation capability did not change from v1 to v2. To me, DeepL translations came out better. But then again, I don't speak Japanese, so my read might be quite wrong.

In terms of being able to compare the outputs as @SamKook suggested, one can make Whisper more deterministic by setting both temperature and beam to zero. That makes the output close to deterministic. But the pitfall is that it produces more hallucination and repeating lines in the output.
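
A minimal sketch of what such a paired run could look like, assuming the plain `openai-whisper` Python package rather than the Colab used in this thread. It reads "beam to zero" as switching beam search off (`beam_size=None`, i.e. greedy decoding), which is an interpretation and not something stated above. The helper names `decode_options` and `run_whisper` are made up for illustration.

```python
from typing import Any, Dict


def decode_options(task: str) -> Dict[str, Any]:
    """Build identical decoding settings for both runs; only `task` differs."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    return {
        "task": task,          # "transcribe" keeps Japanese, "translate" outputs English
        "language": "ja",      # skip language auto-detection
        "temperature": 0.0,    # a single float also disables the temperature fallback
        "beam_size": None,     # assumption: "beam to zero" means no beam search
    }


def run_whisper(audio_path: str, task: str, model_name: str = "large") -> Dict[str, Any]:
    """Run one pass; the import is deferred so the options can be inspected alone."""
    import whisper  # requires the openai-whisper package

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path, **decode_options(task))
```

Calling `run_whisper(path, "transcribe")` and then `run_whisper(path, "translate")` gives two outputs produced with the same settings, which is the closest thing to a like-for-like comparison; the hallucination and repetition caveat above still applies.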
 

porgate55555

Active Member
Jul 24, 2021
51
163
The good things with Whisper end-to-end translation are:
(a) It uses context for translation. It tries to build a context, for example guessing gender (he, she) and punctuation, for the translation task.
(b) It makes the entire Whisper run faster. The translate task is faster than the transcribe task. It is funny, but their main software engineer was saying that, the way the algorithm is written, the end-to-end translation task is performed faster than just the transcribe task :)
The good thing with DeepL is that it is just a better translator. Full stop. One bad thing with DeepL is that it often mixes up he/she, it/they, sir/ma'am.

For me, I decided to just stick with DeepL. I did some comparisons during the early days of Whisper (v1). I haven't done any comparison with v2, but I understand that the translation capability did not change from v1 to v2. To me, DeepL translations came out better. But then again, I don't speak Japanese, so my read might be quite wrong.

In terms of being able to compare the outputs as @SamKook suggested, one can make Whisper more deterministic by setting both temperature and beam to zero. That makes the output close to deterministic. But the pitfall is that it produces more hallucination and repeating lines in the output.
I did one file today with only transcribe and it took nearly 1h 30min instead of approx. 30min, then ran it through DeepL and I wasn't impressed with the result. It did not seem better than just letting Whisper do the whole job.
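
For anyone wanting to repeat the transcribe-then-translate route, here is a rough sketch of the second step. It assumes a well-formed SRT file and takes the translator as a plain function, so the name `translate_srt` and its signature are hypothetical. With the official `deepl` Python package the function could wrap `deepl.Translator(auth_key).translate_text(texts, source_lang="JA", target_lang="EN-US")`, but that wiring is left out here and is an assumption, not something tested in this thread.

```python
import re
from typing import Callable, List


def translate_srt(srt_text: str, translate: Callable[[List[str]], List[str]]) -> str:
    """Translate only the dialogue of an SRT file, keeping numbering and timings."""
    # Each cue is a block separated by a blank line:
    # line 0 = cue number, line 1 = timing, the rest = dialogue.
    blocks = [b.splitlines() for b in re.split(r"\n\s*\n", srt_text.strip())]
    texts = [" ".join(b[2:]) for b in blocks]  # join multi-line cues into one string

    translated = translate(texts)  # one batched call for the whole file
    if len(translated) != len(texts):
        raise ValueError("translator must return one string per cue")

    out = ["\n".join(b[:2] + [t]) for b, t in zip(blocks, translated)]
    return "\n\n".join(out) + "\n"
```

Keeping the timing lines untouched means the translated file stays in sync with the Japanese one, so the two can be compared cue by cue.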
 

panop857

Active Member
Sep 11, 2011
168
237
I know how it works; the question is whether you get better results using Whisper or DeepL.
DeepL is theoretically better, but there's probably some value in doing direct-to-English with the same deep learning model rather than taking the transcribed output and feeding it into a second deep learning model that isn't specifically designed to interact with the first. There's just an additional loss of information during that intermediate step.

It also depends on whether you are using Medium or Large Whisper, and how tuned your parameters are. Some things like an increased Beam Size are going to produce better translations of proper nouns, and Large is just generally better if you can pull it off.
 

porgate55555

Active Member
Jul 24, 2021
51
163
DeepL is theoretically better, but there's probably some value in doing direct-to-English with the same deep learning model rather than taking the transcribed output and feeding it into a second deep learning model that isn't specifically designed to interact with the first. There's just an additional loss of information during that intermediate step.

It also depends on whether you are using Medium or Large Whisper, and how tuned your parameters are. Some things like an increased Beam Size are going to produce better translations of proper nouns, and Large is just generally better if you can pull it off.
I'm using standard settings and the large model from the Colab posted here very early on (VAD threshold 0.4 and chunk_threshold 0.3).
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
This will not work, as Whisper generates different subtitles every time you run it, so there is no one-to-one comparison unless you get the Japanese version with no translation and the translated one from the same run.

Sure, it won't be perfect for the whole video, but it can work for part of it. For example, on the SSIS-381 test I ran 4 times, the first 4 min 27 sec never changed; it was always identical. So you can run it a few times, find a range that doesn't change by comparing the files with kdiff3 or something similar, and then get a transcription of that range.
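
If you'd rather script that comparison than eyeball it in a diff tool, something along these lines could work. It is only a sketch: it assumes well-formed SRT files and only looks for the unchanged stretch at the start of the file (like the 4 min 27 sec example), not stable ranges later on. The function names `parse_srt` and `stable_prefix` are made up for this example.

```python
import re
from typing import List, Tuple

Cue = Tuple[str, str]  # (timing line, dialogue text)


def parse_srt(srt_text: str) -> List[Cue]:
    """Split an SRT file into (timing, text) pairs, ignoring the cue numbers."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) >= 2:
            text = "\n".join(line.strip() for line in lines[2:])
            cues.append((lines[1].strip(), text))
    return cues


def stable_prefix(runs: List[List[Cue]]) -> List[Cue]:
    """Return the leading cues that are identical in every run."""
    prefix = []
    for cues in zip(*runs):  # walk all runs cue by cue
        if any(c != cues[0] for c in cues[1:]):
            break  # first disagreement ends the stable range
        prefix.append(cues[0])
    return prefix
```

The end timestamp in the timing line of the last returned cue tells you how far into the video the runs agree.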
 

Chuckie100

Well-Known Member
Sep 13, 2019
710
2,768
I love Whisper colab. It is rare that I find a JAV video after 2020 that I truly feel compelled to fork over the $$$ to have professionally subtitled. But Whisper is allowing me to subtitle (with cleaning and some re-interpretation) many of the oldies (not subtitled) in my collection that I feel are far more erotic/superior to the recent entries. Yeah, Whisper is not perfect but it allows a better understanding of the storyline which is mainly what I am after. However my Multi-Terabyte hard drives are being stressed! lol
 

panop857

Active Member
Sep 11, 2011
168
237
Do some parameters get better timings? There's the known issue with not getting good timing within the first 30-second window, but there are some pretty bad mismatches all through some transcripts while others are great.
 

Zephlol

Member
Oct 2, 2022
20
36

For whatever it's worth, I use DeepL as a translation tool myself, mostly to quickly parse complex sentences and brainstorm translation choices and sentence structures. I wouldn't recommend anybody use it for translation tasks that actually matter, though, unless they have a solid grasp of the target language. DeepL is very good at hiding what it in fact cannot understand, as it prioritizes natural-sounding language above accuracy. And it will straight up ignore details or change the basic meaning to do so.
That's my gripe with using DeepL. It will convert the grammar of the original language to English grammar, but it will ignore words/phrases that it doesn't understand. Google API will translate as much as it can, sacrificing grammar for more accurate vocabulary. That said, for AI translation, it is a lot easier to follow subs made with DeepL than with Google.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,737
5,131
Anyone got a good SRT of SSNI-800?
You do realize that 5 of your 6 total posts are you requesting subs in a thread that has "★NOT A SUB REQUEST THREAD★" in its title?
 

Prinsipe

Member
Aug 31, 2013
58
19
Through torrent. You'd have to get the multi-language bundle to be able to detect Japanese.
Who is the source of your Adobe software? Monkrus? I am searching for virus-free Adobe software, but all of the sources that I know have negative reviews on Reddit.
 

maload

Active Member
Jul 1, 2008
695
144
Anyone got a good SRT of SSNI-800?

Just ask the member named "darksider".

Just PM him, not here.

And good things can't be free.
 

noph2

New Member
Aug 11, 2008
2
2
Hi all, I'm having an issue with the subtitle for JUX-579. So when I added it to the video, it came out like this:

Screen Shot 2023-02-17 at 3.56.02 PM.png

Can someone help me with this? What should I do to fix it?

Thank you!!
 

Attachments

  • jux00579pl.jpg (170.2 KB)
  • JUX-579.srt.zip (5.6 KB)