Hey, I'm also using Whisper. It's a very promising speech-to-text model so far! I also tried it on Japanese YouTube content. But with AVs, I often see repeated sequences in the transcription (mostly during scenes where the performers are having sex).
I wonder if you've encountered this problem and if you...