I've had no luck specifying --suppress_tokens as just comma-separated words; whisper simply refuses to run. I think the words have to be "tokenized" via some other process and turned into integers. If anyone has better luck getting that working, I'd love to know how to do it. That said, I did have some success using --initial_prompt: my transcripts are ever so slightly naughtier.
Code:
--initial_prompt 'cock pussy dick "deep throat" throbbing throb shaking cum come slut slick coming throat wet good very load balls lick ikku kimochi clitoris clit orgasm suck tits boobs breasts huge masturbate "fuck buddy" zubozubo hard binbin cunt fuck fucking ass crack lips' \
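On the --suppress_tokens question above: the whisper CLI does expect a comma-separated list of integer token ids there, not words. Below is a minimal sketch of how a word list could be turned into that argument. The helper name build_suppress_tokens_arg is made up for illustration, and the encoder is passed in as a plain function so the sketch runs without whisper installed. It keeps only words that map to a single token, on the assumption that suppressing the pieces of a multi-token word would also block those pieces inside unrelated words.

```python
from typing import Callable, Iterable, List


def build_suppress_tokens_arg(
    words: Iterable[str],
    encode: Callable[[str], List[int]],
) -> str:
    """Return a comma-separated string of token ids for --suppress_tokens.

    `encode` is any function that maps text to a list of token ids.
    Hypothetical helper: not part of whisper itself.
    """
    ids: List[int] = []
    seen = set()
    for word in words:
        # BPE vocabularies usually treat "word" and " word" as different
        # tokens, so both spellings are tried.
        for variant in (word, " " + word):
            token_ids = encode(variant)
            # Skip anything that splits into several tokens (see note above).
            if len(token_ids) != 1:
                continue
            if token_ids[0] not in seen:
                seen.add(token_ids[0])
                ids.append(token_ids[0])
    return ",".join(str(i) for i in ids)


if __name__ == "__main__":
    # Toy encoder with invented ids, only so the sketch runs on its own.
    # With openai-whisper installed, the real encoder would be (assumption
    # about your install, not tested here):
    #   from whisper.tokenizer import get_tokenizer
    #   encode = get_tokenizer(multilingual=True).encode
    toy_vocab = {"hello": [10], " hello": [11], "world": [20, 21], " world": [22]}
    print(build_suppress_tokens_arg(["hello", "world"], toy_vocab.__getitem__))
```

The printed string is what would go after --suppress_tokens on the command line. Whether suppressing tokens actually improves a given transcript is a separate question; this only covers the format of the argument.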
Have you tested with Japanese characters as well, or with Japanese only (like なめて ちんぽ やりまん ぬれて)? I wonder if that would work better since it's transcribing, or is this flying over my head, and English is the way to go?
I too am using a program to translate with deepl (deepl-srt), and it recently broke, likely due to the same changes to the deepl site. I tried to make the suggested changes to the program but couldn't find the lines. Perhaps it's a different program?
Either way, would you mind posting the corrected version of your program so we can all use it?
Thanks.
Not OP. I worked a lot with ChatGPT on it, and after a lot of issues I got it to work perfectly once. Then on the next subtitle the result broke, so "my" version is on hold. ChatGPT + Bing will probably solve it, since it can use the Internet and pull in other things from github that might help.
In the meantime, I'm using this script that I got ChatGPT to write.
It extracts the dialogue of the subtitle (it only really supports .srt because of the way it extracts it) into its own text file, line for line. Then I just copy/paste the dialogue into deepl. Then I paste the translation into its own document. (I also use another script that replaces certain words and saves the result into another text file.) Then I tell the script which file to use as the translation, and it will create a new .srt file with the original timings and the translated dialogue.
It might be a bit cumbersome with the amount of extra files, but it gives complete control to you. And it's an easy script, so it should be easy to change it how you want it (or get ChatGPT to do it).
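For anyone who wants to roll their own, here is a rough sketch of the extract-then-rebuild workflow described above. It is not the actual script from this post; the function names (parse_srt, extract_dialogue, rebuild_srt) are made up, and it assumes a plain .srt where each cue is an index line, a timing line, and one or more text lines separated by blank lines. Multi-line cues are joined into one line so the translation stays line for line with the cues.

```python
import re


def parse_srt(text):
    """Split SRT text into (index, timing, dialogue_lines) tuples.

    Blocks without a timing line or without any dialogue are skipped.
    """
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) >= 3 and "-->" in lines[1]:
            cues.append((lines[0].strip(), lines[1].strip(), lines[2:]))
    return cues


def extract_dialogue(cues):
    """Return one line of dialogue per cue, ready to paste into a translator."""
    return "\n".join(" ".join(part.strip() for part in text) for _, _, text in cues)


def rebuild_srt(cues, translated_text):
    """Combine the original indexes and timings with the translated lines."""
    translated = translated_text.splitlines()
    if len(translated) != len(cues):
        # A mismatch usually means a line was lost or split while pasting.
        raise ValueError(
            f"expected {len(cues)} translated lines, got {len(translated)}"
        )
    blocks = [
        f"{index}\n{timing}\n{line}"
        for (index, timing, _), line in zip(cues, translated)
    ]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:02,500\nこんにちは\n\n"
        "2\n00:00:03,000 --> 00:00:04,000\nありがとう\nございます\n"
    )
    cues = parse_srt(sample)
    print(extract_dialogue(cues))  # paste this into the translator
    print(rebuild_srt(cues, "Hello\nThank you very much"))
```

With real files you would read the .srt with open(path, encoding="utf-8-sig") so a byte-order mark at the start doesn't end up in the first index line, and write the extracted dialogue and the rebuilt .srt to separate files. The line-count check is there because the whole approach depends on the translation having exactly one line per cue.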
I use Notepad++, so it's simple to see length (or line number where you have to continue if you went over 5000), and it has the "Compare" plugin which I highly recommend,