akiba resident JAV subtitlers & subtitle talk★NOT A SUB REQUEST THREAD★

SUNBO

Active Member
Nov 19, 2007
115
87
So OpenAI just released Whisper, their speech-to-text AI, and the transcription seems pretty decent from what I tested using Google Colab.

View attachment 3045996

It's also relatively simple, since it only takes one command to do a Japanese -> English transcription.

Here's the GitHub repo for it: https://github.com/openai/whisper

Here's the Colab notebook for it; just replace the file_location variable with a link to the audio file you want to transcribe: https://colab.research.google.com/drive/1j3-_EF43nUCeIkrzpk_jpamtFZmURYrU?usp=sharing
This Whisper thing is actually AMAZING, much better than Vosk or pyTranscriber.

Noob question: how do I run it offline without uploading an mp3 every time?

EDIT: Omg!!! I watched it with the Whisper sub, it's amazing!!!! About 90% accuracy, compared to about 50% for Vosk/pyTranscriber.
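On running it offline: one option is to install the openai-whisper package from the repo above (plus ffmpeg) and point the same command-line tool at a local file. The sketch below only builds and prints that command. It assumes the `whisper` command is on your PATH; the file name and the `medium` model are placeholders, not something taken from the Colab.

```python
import shlex
import subprocess


def build_whisper_command(audio_path, model="medium"):
    """Build the CLI call for a Japanese -> English run on a local file."""
    return [
        "whisper", audio_path,
        "--model", model,          # assumption: pick a size your hardware can handle
        "--language", "Japanese",  # skip auto-detection of the spoken language
        "--task", "translate",     # output English instead of a Japanese transcript
    ]


def run_locally(audio_path, model="medium"):
    """Run Whisper on a local file (needs whisper and ffmpeg installed)."""
    subprocess.run(build_whisper_command(audio_path, model), check=True)


if __name__ == "__main__":
    # Hypothetical local file; nothing is uploaded anywhere.
    print(shlex.join(build_whisper_command("movie_audio.mp3")))
```

Calling `run_locally("movie_audio.mp3")` would start the actual run; the first run downloads the model weights, after which it works without a connection.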
 

avatarthe

Well-Known Member
Feb 1, 2008
184
282
I watched the intro with both subtitles and could see the difference. Vosk breaks up sentences but detects more words (not as obvious with this video, but it is with others). pyTranscriber produces longer sentences. Once both are machine translated, it's a close tie: some words were translated better with Vosk, some with pyTranscriber. For example, pyTranscriber caught the second son's name as Shinji, and Vosk did not. Vosk translated the second son's interests as "anime" and "mecha", while pyTranscriber gave "Animated?". pyTranscriber got the part about his wife leaving him for a young man 8 years ago; Vosk did not translate that part properly. Vosk said something about washing machines, which pyTranscriber missed completely. pyTranscriber also missed "I'm Taku, the father". I still prefer Vosk, though, mainly because it doesn't drop words and is easier to use.

Also, I have an issue with pyTranscriber; not sure if any of you have it. When I transcribe anything longer than 15 minutes or so, it times out somewhere between 40% and 85% and gets stuck there. I had to split the video in order to transcribe it.

Okay, I'll give away the dirty little secret of how to get pyTranscriber to work flawlessly without crashing. I haven't wanted this to get out, because if too many people start using it, it could stop working...


So the big secret is... use a VPN with a Japanese address when you run pyTranscriber. That's it! It will work flawlessly!
 

SUNBO

Active Member
Nov 19, 2007
115
87
Okay, I'll give away the dirty little secret of how to get pyTranscriber to work flawlessly without crashing. I haven't wanted this to get out, because if too many people start using it, it could stop working...


So the big secret is... use a VPN with a Japanese address when you run pyTranscriber. That's it! It will work flawlessly!
HAHA ok. Anyway, look at my post above. I think you won't want to use pyTranscriber anymore.
 

mei2

Well-Known Member
Dec 6, 2018
247
407
This Whisper thing is actually AMAZING, much better than Vosk or pyTranscriber.

Noob question: how do I run it offline without uploading an mp3 every time?

EDIT: Omg!!! I watched it with the Whisper sub, it's amazing!!!! About 90% accuracy, compared to about 50% for Vosk/pyTranscriber.

@SUNBO, would you share the subs from Whisper for MIAA-698?
@avatarthe, would you share the subs from pyTranscriber and Vosk for MIAA-698?
This is a very good test case.
Thanks, both!
 

Non_Entity

New Member
Sep 26, 2022
6
13
A big hurdle towards fine-tuning Whisper (or any other model) is a lack of Japanese training data. OpenAI's dataset has 15,914 hours of JP audio (7054 hours with Japanese transcripts, 8860 with English ones), and even that dwarfs the publicly available ones I'm aware of.

Is anyone interested in collaborating on a dataset? My idea is to take tons of subbed Japanese media, do source separation on the audio to isolate the voices, and segment it using the subtitles. Crunchyroll has subbed 14,000+ hours of anime, so it should be possible to build the world's biggest Japanese -> English dataset with anime alone.
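The "segment it using the subtitles" step could start from something like the sketch below, which turns SRT text into (start, end, text) tuples that an audio cutter could then consume. It is only an illustration under simple assumptions (well-formed SRT timestamps, blocks separated by blank lines); the function names are made up here, and source separation and the actual audio cutting are not shown.

```python
import re

# SRT timestamps look like 00:01:21,450 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def to_seconds(stamp):
    """Convert an SRT timestamp string to seconds as a float."""
    hours, minutes, seconds, millis = map(int, TIMESTAMP.fullmatch(stamp.strip()).groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, text) tuples."""
    segments = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip incomplete blocks rather than guessing
        start, end = (to_seconds(part) for part in lines[1].split("-->"))
        segments.append((start, end, " ".join(lines[2:])))
    return segments
```

Each tuple gives the time window to cut from the isolated voice track and the text to pair with it.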
 

SUNBO

Active Member
Nov 19, 2007
115
87
@SUNBO, would you share the subs from Whisper for MIAA-698?
@avatarthe, would you share the subs from pyTranscriber and Vosk for MIAA-698?
This is a very good test case.
Thanks, both!
Hey, I actually tested it on another JAV, not MIAA-698. I have attached it here. The pyTranscriber sub covers only the first 10 minutes. The Whisper sub covers only the first 20 minutes, and in the last few minutes it mucks up, repeats itself, and doesn't get translated. The Vosk sub covers the full movie.

Unfortunately, I can't get the Google Colab for Whisper to work anymore; I get an error and I'm not sure why. I'll have to figure out how to run it offline.
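The repeated lines can at least be cleaned up after the fact. A rough sketch, assuming the subtitles have already been parsed into (start, end, text) tuples in time order; `collapse_repeats` is a made-up helper, and it does nothing about the untranslated part.

```python
def collapse_repeats(cues):
    """Merge consecutive cues whose text is identical.

    cues: list of (start_seconds, end_seconds, text) tuples in time order.
    """
    merged = []
    for start, end, text in cues:
        if merged and merged[-1][2].strip() == text.strip():
            prev_start, _, prev_text = merged[-1]
            merged[-1] = (prev_start, end, prev_text)  # stretch the earlier cue
        else:
            merged.append((start, end, text))
    return merged
```

This only catches exact back-to-back repeats, so a line that loops with small variations would still get through.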
 

Attachments

  • DVDMS-184 compare.zip
    49.1 KB · Views: 101

Taako

Akiba Citizen
May 25, 2017
1,363
952
A big hurdle towards fine-tuning Whisper (or any other model) is a lack of Japanese training data. OpenAI's dataset has 15,914 hours of JP audio (7054 hours with Japanese transcripts, 8860 with English ones), and even that dwarfs the publicly available ones I'm aware of.

Is anyone interested in collaborating on a dataset? My idea is to take tons of subbed Japanese media, do source separation on the audio to isolate the voices, and segment it using the subtitles. Crunchyroll has subbed 14,000+ hours of anime, so it should be possible to build the world's biggest Japanese -> English dataset with anime alone.
This makes sense in the grand scheme of AI in general: models can only work and learn when you feed them.

It's also the reason why I write down Japanese phrases and dialog that are used frequently in the movies. Almost 90% of the phrases are always the same in JAV.

It's time-consuming to do while subbing, but it does help a lot.

But anime will always have the advantage: it's a different medium, accessible internationally, more popular, watched by young and old, and it makes lots of money if done properly.
JAV, no matter how well acted or produced, is porn :D
 

Electromog

Akiba Citizen
Dec 7, 2009
4,666
2,868
The big advantage of anime here is that the voices are recorded on a sound stage under perfect conditions, whereas JAV is recorded wherever they are acting it out, so there are no microphones right in front of their mouths like on a sound stage. So anime should be much easier to auto-translate than JAV.
 

mei2

Well-Known Member
Dec 6, 2018
247
407
A big hurdle towards fine-tuning Whisper (or any other model) is a lack of Japanese training data. OpenAI's dataset has 15,914 hours of JP audio (7054 hours with Japanese transcripts, 8860 with English ones), and even that dwarfs the publicly available ones I'm aware of.

Is anyone interested in collaborating on a dataset? My idea is to take tons of subbed Japanese media, do source separation on the audio to isolate the voices, and segment it using the subtitles. Crunchyroll has subbed 14,000+ hours of anime, so it should be possible to build the world's biggest Japanese -> English dataset with anime alone.
I'm interested. I had similar thinking, and I started wondering whether one can use the Japanese captions of Japanese movies to create a phonetics data set, i.e. not a translation data set, but two steps before that. I started looking at Netflix's The Naked Director as a source, since Japanese captions are available. Just a thought.
 

SUNBO

Active Member
Nov 19, 2007
115
87
So OpenAI just released Whisper, their speech-to-text AI, and the transcription seems pretty decent from what I tested using Google Colab.

View attachment 3045996

It's also relatively simple, since it only takes one command to do a Japanese -> English transcription.

Here's the GitHub repo for it: https://github.com/openai/whisper

Here's the Colab notebook for it; just replace the file_location variable with a link to the audio file you want to transcribe: https://colab.research.google.com/drive/1j3-_EF43nUCeIkrzpk_jpamtFZmURYrU?usp=sharing
Did you write the code for that Colab? Do you know why I sometimes get this error?
Code:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/whisper/audio.py", line 44, in load_audio
    .run(cmd="ffmpeg", capture_stdout=True, capture_stderr=True)
  File "/usr/local/lib/python3.7/dist-packages/ffmpeg/_run.py", line 325, in run
    raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/whisper", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.7/dist-packages/whisper/transcribe.py", line 300, in cli
    result = transcribe(model, audio_path, temperature=temperature, **args)
  File "/usr/local/lib/python3.7/dist-packages/whisper/transcribe.py", line 84, in transcribe
    mel = log_mel_spectrogram(audio)
  File "/usr/local/lib/python3.7/dist-packages/whisper/audio.py", line 111, in log_mel_spectrogram
    audio = load_audio(audio)
  File "/usr/local/lib/python3.7/dist-packages/whisper/audio.py", line 47, in load_audio
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 3.4.11-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
https://files.catbox.moe/ofg7ii.mp3: Connection timed out
 

Non_Entity

New Member
Sep 26, 2022
6
13
I've written a notebook that combines Whisper with a separate VAD (voice activity detector). It works much better than Whisper alone on long-form inputs, and also runs about 2-4x faster.

It's still far from perfect, though. There's a tendency to translate silence as "Thank you for watching!", "Please subscribe to my channel!" and so on, probably because it was trained on YouTube captions. I could remove some of those with a word filter, but without fine-tuning Whisper, it's impossible to get rid of all of them. The notebook also supports DeepL, with Whisper being used as speech-to-text.
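A word filter of the kind mentioned above could look roughly like this. It is a sketch, not the notebook's actual filter: the phrase list contains only the two examples from this post, and the cue format of (start, end, text) tuples is an assumption.

```python
# Phrases that tend to show up over silence; extend the tuple as needed.
BOILERPLATE = (
    "thank you for watching",
    "please subscribe to my channel",
)


def normalize(text):
    """Lowercase and strip punctuation so 'Thank you for watching!' matches."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()


def drop_boilerplate(cues, phrases=BOILERPLATE):
    """Remove cues whose text contains any of the known filler phrases."""
    return [cue for cue in cues if not any(p in normalize(cue[2]) for p in phrases)]
```

A fixed list only removes phrases somebody has already spotted, which is why it can't catch everything without fine-tuning.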
 

SUNBO

Active Member
Nov 19, 2007
115
87
I've written a notebook that combines Whisper with a separate VAD (voice activity detector). It works much better than Whisper alone on long-form inputs, and also runs about 2-4x faster.

It's still far from perfect, though. There's a tendency to translate silence as "Thank you for watching!", "Please subscribe to my channel!" and so on, probably because it was trained on YouTube captions. I could remove some of those with a word filter, but without fine-tuning Whisper, it's impossible to get rid of all of them. The notebook also supports DeepL, with Whisper being used as speech-to-text.
Hey, this is indeed better than plain Whisper! It produces the best translation so far! It's almost human-like, almost perfect. There is one problem, though: it stops translating after 12 minutes. It says "Please subscribe to my channel" and then nothing after that. Are you able to fix that issue?

Also, it gives an error when using a URL for the audio, but works fine if I upload the file to the Colab.
 

tangerinefeline

New Member
Sep 22, 2022
3
8
Did you make the code for that colab? Do you know why sometimes I get this error?
Code:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/whisper/audio.py", line 44, in load_audio
    .run(cmd="ffmpeg", capture_stdout=True, capture_stderr=True)
  File "/usr/local/lib/python3.7/dist-packages/ffmpeg/_run.py", line 325, in run
    raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/whisper", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.7/dist-packages/whisper/transcribe.py", line 300, in cli
    result = transcribe(model, audio_path, temperature=temperature, **args)
  File "/usr/local/lib/python3.7/dist-packages/whisper/transcribe.py", line 84, in transcribe
    mel = log_mel_spectrogram(audio)
  File "/usr/local/lib/python3.7/dist-packages/whisper/audio.py", line 111, in log_mel_spectrogram
    audio = load_audio(audio)
  File "/usr/local/lib/python3.7/dist-packages/whisper/audio.py", line 47, in load_audio
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 3.4.11-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
https://files.catbox.moe/ofg7ii.mp3: Connection timed out

Hmm, in my own experience Google Colab is sometimes unable to connect to catbox, and that seems to be the issue here. You might want to try another file host such as https://pomf.lain.la/

I see that @Non_Entity has published a new notebook that seems to be much better, so that would probably be the direction to go!
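Another way around a "Connection timed out" from the file host is to download the audio into the Colab session first and pass Whisper the local path, so ffmpeg never has to open the URL itself. A minimal sketch using only the standard library; `fetch_audio` and its defaults are made up for illustration and are not part of either notebook.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fetch_audio(url, dest_dir=None, timeout=30.0):
    """Download url to a local file and return its path.

    dest_dir defaults to a fresh temporary directory.
    """
    dest_dir = Path(dest_dir) if dest_dir else Path(tempfile.mkdtemp())
    name = Path(urlparse(url).path).name or "audio.bin"
    dest = dest_dir / name
    # A failure here surfaces as a plain download error instead of an ffmpeg one.
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

If the download itself times out, at least the error points at the host rather than at Whisper, and you can retry or switch hosts before starting the transcription.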
 

Non_Entity

New Member
Sep 26, 2022
6
13
I've updated the notebook. It should be much less likely to output things like "Please subscribe" now.

There is one problem though, it stops translating after 12min
For the rest of the video, or just one scene? You could try lowering the chunk_threshold to 2.0 or 1.0 if that happens.
 

SUNBO

Active Member
Nov 19, 2007
115
87
I've updated the notebook. It should be much less likely to output things like "Please subscribe" now.


For the rest of the video, or just one scene? You could try lowering the chunk_threshold to 2.0 or 1.0 if that happens.
The rest of the video. I checked the subtitle file, and there's just nothing after that.

What is chunk_threshold?
 

Non_Entity

New Member
Sep 26, 2022
6
13
The rest of the video. I checked the subtitle file, and there's just nothing after that.

What is chunk_threshold?
It's how many seconds the VAD waits before splitting the audio. Each chunk goes through Whisper separately, which prevents it from getting stuck on one line for the whole video.
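To make that concrete, here is a rough illustration of the idea, not the notebook's actual code: the VAD output is assumed to be a list of (start, end) speech segments in seconds, and the default of 3.0 is a made-up value.

```python
def group_into_chunks(speech_segments, chunk_threshold=3.0):
    """Group (start, end) speech segments into chunks.

    A new chunk starts whenever the silence since the previous segment
    lasts at least chunk_threshold seconds.
    """
    chunks = []
    for start, end in sorted(speech_segments):
        if chunks and start - chunks[-1][-1][1] < chunk_threshold:
            chunks[-1].append((start, end))  # short pause: same chunk
        else:
            chunks.append([(start, end)])    # long pause: start a new chunk
    return chunks
```

Lowering the threshold produces more, shorter chunks, so one bad stretch affects less of the video.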
 

avatarthe

Well-Known Member
Feb 1, 2008
184
282
View attachment 3049242

I decided to test VOSK vs Pytranscriber on the opening of MIAA-698 -[Single Mom Reserve Army] The Hurdles To SEX Are Too Low, Can't Stand My Sweaty Eldest Daughter Naked Every Day! Lima Arai (2022) as the opening is a simple monologue from the father with no music... plus Lima Arai is Sooo hot in this film as the daughter with hyperhydrosis (Over active sweat gland) who is always so hot and sweaty that she never wear cloths (well she wears sock and the tie for her school uniform) that it real deserves subtitles.

I think pyTranscriber did a better job. What has other people's experience been?



fanart.jpg


LET'S TRY THIS AGAIN....

I decided to test VOSK vs Pytranscriber vs Whisper (with VAD) on the opening of MIAA-698 -[Single Mom Reserve Army] The Hurdles To SEX Are Too Low, Can't Stand My Sweaty Eldest Daughter Naked Every Day! Lima Arai (2022) as the opening is a simple monologue from the father with no music... plus Lima Arai is Sooo hot in this film as the daughter with hyperhydrosis (Over active sweat gland) who is always so hot and sweaty that she never wear cloths (well she wears sock and the tie for her school uniform) that it real deserves subtitles.

I think pyTranscriber and Whisper did the best jobs. What has other people's experience been?









Columns, in order: Whisper w/ VAD & DeepL | pyTranscriber | Vosk

[Whisper w/ VAD & DeepL]
1
00:00:00,826 --> 00:00:05,946
Hello, there. An old man came out of nowhere.

2
00:00:05,946 --> 00:00:13,186
I know some of you may be surprised to see an old man out of nowhere, but please bear with me for a few minutes.

3
00:00:13,186 --> 00:00:17,394
This time, I'd like to share with you...

4
00:00:18,194 --> 00:00:23,034
I'm sorry to say it myself, but...

5
00:00:23,034 --> 00:00:26,874
My family is a little bit different.

6
00:00:26,874 --> 00:00:32,890
He introduced himself to me.

7
00:00:32,890 --> 00:00:38,730
I'm the mainstay of Shin-ike. I'm my father's son.

8
00:00:38,730 --> 00:00:44,650
As you can see, I'm not a magazine model.

9
00:00:44,650 --> 00:00:52,010
I'm a regular, ungrateful businessman, and I don't break my teeth in Shinpike.

[pyTranscriber]

1
00:00:00,256 --> 00:00:06,400
Hello. I'm surprised to see an old man out of nowhere.

2
00:00:06,656 --> 00:00:12,800
Please take a few minutes of your time.

3
00:00:13,056 --> 00:00:19,200
This time, I'd like to share with you a little bit about myself.

4
00:00:19,456 --> 00:00:25,600
I'd like to share with you my story.

5
00:00:25,856 --> 00:00:32,000
I'm sorry I'm late to introduce myself.

6
00:00:32,256 --> 00:00:38,400
I'm a little rough around the edges, but I'm the mainstay of my family. I'm right after my father.

7
00:00:38,656 --> 00:00:44,800
As you can see, I'm not a magazine model.

8
00:00:45,056 --> 00:00:51,200
I'm a regular office worker. There's no Araike here.

[Vosk]
1
00:00:01,200 --> 00:00:02,400
Hi there.

2
00:00:04,110 --> 00:00:05,520
I'm sure some of you are surprised to see your uncle out of nowhere.

3
00:00:06,180 --> 00:00:08,088
I know some of you were surprised.

4
00:00:09,180 --> 00:00:10,180
It's just for a few hours.

5
00:00:10,680 --> 00:00:11,680
Please bear with me.

6
00:00:13,410 --> 00:00:14,410
This time

7
00:00:14,614 --> 00:00:15,614
I'd like to send you

8
00:00:18,090 --> 00:00:19,680
I don't know if I should say this myself, but...

9
00:00:20,401 --> 00:00:21,401
I'm a little...

10
00:00:23,250 --> 00:00:24,270
I've changed a lot.

11
00:00:25,470 --> 00:00:26,470
My family

12
00:00:27,090 --> 00:00:28,620
It's a story about a coarse house.

13
00:00:30,660 --> 00:00:31,950
I'm late to introduce myself.

14
00:00:33,090 --> 00:00:34,090
I am

15
00:00:34,140 --> 00:00:35,504
The mainstay of our coarse family.

16
00:00:36,240 --> 00:00:37,800
I'm Taku, my father.

17
00:00:38,910 --> 00:00:40,110
As you can see...

18
00:00:40,560 --> 00:00:41,730
I'm not a magazine model

19
00:00:42,270 --> 00:00:43,270
I'm not

20
00:00:45,360 --> 00:00:46,830
I'm just an ordinary office worker

21
00:00:46,830 --> 00:00:47,830
I'm an ordinary businessman.

22
00:00:48,840 --> 00:00:49,840
Washing machine

23
00:00:50,268 --> 00:00:51,268
I don't have a washing machine

24
00:00:52,321 --> 00:00:53,321
I'm not a washer.

[Whisper w/ VAD & DeepL]
10
00:00:52,010 --> 00:00:55,610
Because eight years ago, my wife left me for a young man she worked with part-time.

11
00:00:55,610 --> 00:01:04,090
She left me for a young man she was working with.

12
00:01:04,090 --> 00:01:08,850
Since then, I've been the mainstay of our family.

13
00:01:08,850 --> 00:01:14,770
I've raised three children.

14
00:01:14,770 --> 00:01:21,450
I'd like to introduce you to my lovely children. Let's start with my son.

15
00:01:21,450 --> 00:01:28,410
Daisuke, my eldest son, is a college student.

16
00:01:28,410 --> 00:01:33,690
He's more solid than I am.

17
00:01:33,690 --> 00:01:37,970
He's into muscle training. He's been working out a lot lately.

18
00:01:37,970 --> 00:01:43,338
Maybe he has too much power.

19
00:01:43,338 --> 00:01:46,418
Maybe he has too much power.

[pyTranscriber]
9
00:00:51,456 --> 00:00:57,600
Because eight years ago, my wife left me for a young man she worked with part-time.

10
00:00:57,856 --> 00:01:04,000
She got tired of him and left him.

11
00:01:04,256 --> 00:01:05,536
Since then, I've been the mainstay of Araike for three years.

12
00:01:05,792 --> 00:01:11,680
I've been the mainstay of Araike's household, raising our three children.

13
00:01:11,936 --> 00:01:14,496
Now then...

14
00:01:15,008 --> 00:01:17,824
I would like to introduce you to my lovely children.

15
00:01:18,336 --> 00:01:24,480
First of all, my son, my eldest, is my favorite, a college student.

16
00:01:24,736 --> 00:01:30,880
He is a serious boy, and I guess he is more solid than I am.

17
00:01:31,136 --> 00:01:33,184
Lately, he's been...

18
00:01:33,440 --> 00:01:35,232
Muscle training, Yamanote...

19
00:01:35,744 --> 00:01:37,536
A typhoon is coming.

20
00:01:37,792 --> 00:01:43,936
Maybe he's got too much power.

[Vosk]

25
00:00:53,970 --> 00:00:54,180
8 years ago

26
00:00:54,210 --> 00:00:55,290
years ago, my wife

27
00:00:55,800 --> 00:00:56,070
part time job

28
00:00:56,070 --> 00:00:58,193
I've been working with a lot of young men.

29
00:00:59,670 --> 00:01:00,300
She got tired of me.

30
00:01:00,661 --> 00:01:01,661
She got tired of me.

31
00:01:01,860 --> 00:01:02,860
She left me.

32
00:01:04,140 --> 00:01:05,250
And then I went back to

33
00:01:06,030 --> 00:01:06,750
As the mainstay

34
00:01:06,990 --> 00:01:08,100
As the mainstay

35
00:01:09,120 --> 00:01:09,210
Three...

36
00:01:09,229 --> 00:01:10,229
I've raised three children.

37
00:01:10,410 --> 00:01:11,410
I've raised three children.

38
00:01:13,530 --> 00:01:14,530
And now...

39
00:01:15,000 --> 00:01:17,730
I would like to introduce you to my lovely children.

40
00:01:18,570 --> 00:01:19,920
Let's start from there.

41
00:01:21,630 --> 00:01:22,830
My eldest son Daisuke.

42
00:01:23,340 --> 00:01:24,340
He's a college student, isn't he?

43
00:01:26,100 --> 00:01:27,630
He's very serious.

44
00:01:28,650 --> 00:01:31,200
I guess he's more solid than I am.

45
00:01:31,950 --> 00:01:33,060
What is he doing these days?

46
00:01:33,870 --> 00:01:34,348
muscle training

47
00:01:34,348 --> 00:01:35,348
She's into muscle training.

48
00:01:35,730 --> 00:01:37,440
She seems to be working out a lot.

49
00:01:38,220 --> 00:01:38,591
Maybe he has too much power

50
00:01:38,591 --> 00:01:39,990
Maybe he has too much power.

51
00:01:43,444 --> 00:01:44,640
It might be amazing.

[Whisper w/ VAD & DeepL]

20
00:01:46,418 --> 00:01:52,738
Next is Shin-ike, my second and youngest son.

21
00:01:52,738 --> 00:01:59,634
He doesn't go to school.

22
00:02:02,162 --> 00:02:04,962
He's always doing anime and mechanical engineering.

23
00:02:04,962 --> 00:02:08,362
He's a recluse.

24
00:02:08,362 --> 00:02:13,042
I'm a recluse.

25
00:02:13,042 --> 00:02:17,602
I don't get to see him much.

26
00:02:17,602 --> 00:02:23,642
I don't get to see him much, so sometimes I don't know what he's thinking.

[pyTranscriber]

21
00:01:44,960 --> 00:01:51,104
Next is Shinji, my second and youngest son.

22
00:01:53,920 --> 00:02:00,064
He doesn't go to school.

23
00:02:00,320 --> 00:02:06,464
Animated? He's always doing that.

24
00:02:06,720 --> 00:02:12,864
A recluse. Me too.

25
00:02:13,120 --> 00:02:17,472
I don't get to see him much.

26
00:02:17,728 --> 00:02:19,520
I don't get to see him much, so I don't know what he's thinking.

27
00:02:19,776 --> 00:02:24,640
I don't know what he's thinking.

[Vosk]
52
00:01:47,370 --> 00:01:48,370
Next...

53
00:01:49,230 --> 00:01:50,230
Second son

54
00:01:50,640 --> 00:01:50,910
Check

55
00:01:50,910 --> 00:01:51,000
of

56
00:01:51,327 --> 00:01:52,327
Check it out!

57
00:01:53,880 --> 00:01:54,960
This boy

58
00:01:56,520 --> 00:01:57,870
He doesn't go to school.

59
00:02:02,400 --> 00:02:03,400
Anime

60
00:02:03,750 --> 00:02:03,990
Mecha

61
00:02:04,230 --> 00:02:04,500
or...

62
00:02:05,130 --> 00:02:06,540
That's all she does.

63
00:02:09,085 --> 00:02:10,085
That's nice.

64
00:02:11,550 --> 00:02:12,550
Me too.

65
00:02:13,230 --> 00:02:15,420
I don't know what he's thinking.

66
00:02:17,790 --> 00:02:19,530
I don't know what he's thinking.

67
00:02:20,220 --> 00:02:21,420
I don't know what he's thinking.

[Whisper w/ VAD & DeepL]

27
00:02:23,642 --> 00:02:28,722
Finally, there's Noriyuki, my eldest daughter.

28
00:02:28,722 --> 00:02:34,458
Norimo, my eldest daughter between my oldest and second son.

29
00:02:34,458 --> 00:02:38,178
She's the problem child of the new pond.

30
00:02:38,178 --> 00:02:44,138
She never wears clothes. She spends most of her time at home completely naked.

31
00:02:44,138 --> 00:02:49,298
She also can't study.

32
00:02:49,298 --> 00:02:54,394
She's also not good at studying, and she's not in school.

33
00:02:54,394 --> 00:02:58,514
What do you call a "gal"?

34
00:02:58,514 --> 00:03:02,554
I've been an impatient kid since childhood.

35
00:03:02,554 --> 00:03:06,146
Maybe I'm sentimental.

36
00:03:06,146 --> 00:03:09,506
This time, this Rima...

37
00:03:09,506 --> 00:03:15,706
She's causing all kinds of problems.

38
00:03:15,706 --> 00:03:19,706
I was covered in sweat with my eldest daughter.

39
00:03:19,706 --> 00:03:23,866
A slightly unusual memory of a summer in Shin-ike

40
00:03:23,866 --> 00:03:30,054
Please take a look at 2022

[pyTranscriber]
28
00:02:25,152 --> 00:02:31,296
I'm Rima, the eldest between two sons.

29
00:02:31,552 --> 00:02:37,696
She's the problem child of Araike.

30
00:02:37,952 --> 00:02:44,096
She never wears clothes and spends most of her time at home completely naked.

31
00:02:44,352 --> 00:02:50,496
She can't even study... and she's not in school.

32
00:02:54,592 --> 00:03:00,736
She's a gal? Also, for some reason, she's always sweaty since she was a kid.

33
00:03:02,528 --> 00:03:05,344
Is she expensive?

34
00:03:05,856 --> 00:03:12,000
This time, Lima is causing all kinds of trouble.

35
00:03:12,256 --> 00:03:18,400
Let's see... naked and covered in sweat with my eldest daughter.

36
00:03:18,656 --> 00:03:20,960
A little strange.

37
00:03:21,216 --> 00:03:23,520
Memories of Summer in Araike

38
00:03:23,776 --> 00:03:29,920
Please take a look at 2022.

[Vosk]

68
00:02:23,880 --> 00:02:24,880
Finally.

69
00:02:25,800 --> 00:02:26,800
The first son

70
00:02:27,180 --> 00:02:28,180
The second son

71
00:02:28,890 --> 00:02:29,890
The eldest daughter.

72
00:02:29,945 --> 00:02:30,945
Now.

73
00:02:32,790 --> 00:02:34,020
This is Koga.

74
00:02:34,650 --> 00:02:36,210
She's the problem child of the family.

75
00:02:36,990 --> 00:02:37,990
First of all...

76
00:02:38,220 --> 00:02:38,640
Clothes.

77
00:02:38,790 --> 00:02:39,790
No clothes.

78
00:02:39,960 --> 00:02:42,480
She spends most of her time at home completely naked.

79
00:02:44,400 --> 00:02:45,400
Also...

80
00:02:45,810 --> 00:02:46,810
I don't know how to say it.

81
00:02:47,430 --> 00:02:48,690
I can't even study.

82
00:02:49,500 --> 00:02:51,390
I didn't finish school.

83
00:02:53,283 --> 00:02:54,283
I'm what's called

84
00:02:54,660 --> 00:02:54,960
Gyaru

85
00:02:54,960 --> 00:02:55,960
I guess you could say

86
00:02:56,790 --> 00:02:57,840
I don't know why.

87
00:02:58,710 --> 00:03:01,200
I've come all this way since I was a kid.

88
00:03:02,550 --> 00:03:04,290
I wonder if she has hyperhidrosis.

89
00:03:06,120 --> 00:03:07,120
This time

90
00:03:07,530 --> 00:03:07,830
This

91
00:03:08,040 --> 00:03:09,040
is

92
00:03:10,800 --> 00:03:12,030
It's going to be a problem.

93
00:03:14,010 --> 00:03:15,207
Well then...

94
00:03:15,750 --> 00:03:16,890
Naked eldest daughter and

95
00:03:18,089 --> 00:03:19,089
covered in

96
00:03:19,800 --> 00:03:20,940
A little strange.

97
00:03:21,570 --> 00:03:23,190
Memories of summer in a rough house

98
00:03:23,990 --> 00:03:24,990
round
 

SUNBO

Active Member
Nov 19, 2007
115
87
View attachment 3052930


LET'S TRY THIS AGAIN....

I decided to test VOSK vs Pytranscriber vs Whisper (with VAD) on the opening of MIAA-698 -[Single Mom Reserve Army] The Hurdles To SEX Are Too Low, Can't Stand My Sweaty Eldest Daughter Naked Every Day! Lima Arai (2022) as the opening is a simple monologue from the father with no music... plus Lima Arai is Sooo hot in this film as the daughter with hyperhydrosis (Over active sweat gland) who is always so hot and sweaty that she never wear cloths (well she wears sock and the tie for her school uniform) that it real deserves subtitles.

I think pyTranscriber and Whisper did the best jobs. What has other people's experience been?









Columns, in order: Whisper w/ VAD & DeepL | pyTranscriber | Vosk
1
00:00:00,826 --> 00:00:05,946
Hello, there. An old man came out of nowhere.

2
00:00:05,946 --> 00:00:13,186
I know some of you may be surprised to see an old man out of nowhere, but please bear with me for a few minutes.

3
00:00:13,186 --> 00:00:17,394
This time, I'd like to share with you...

4
00:00:18,194 --> 00:00:23,034
I'm sorry to say it myself, but...

5
00:00:23,034 --> 00:00:26,874
My family is a little bit different.

6
00:00:26,874 --> 00:00:32,890
He introduced himself to me.

7
00:00:32,890 --> 00:00:38,730
I'm the mainstay of Shin-ike. I'm my father's son.

8
00:00:38,730 --> 00:00:44,650
As you can see, I'm not a magazine model.

9
00:00:44,650 --> 00:00:52,010
I'm a regular, ungrateful businessman, and I don't break my teeth in Shinpike.
1
00:00:00,256 --> 00:00:06,400
Hello. I'm surprised to see an old man out of nowhere.

2
00:00:06,656 --> 00:00:12,800
Please take a few minutes of your time.

3
00:00:13,056 --> 00:00:19,200
This time, I'd like to share with you a little bit about myself.

4
00:00:19,456 --> 00:00:25,600
I'd like to share with you my story.

5
00:00:25,856 --> 00:00:32,000
I'm sorry I'm late to introduce myself.

6
00:00:32,256 --> 00:00:38,400
I'm a little rough around the edges, but I'm the mainstay of my family. I'm right after my father.

7
00:00:38,656 --> 00:00:44,800
As you can see, I'm not a magazine model.

8
00:00:45,056 --> 00:00:51,200
I'm a regular office worker. There's no Araike here.
1
00:00:01,200 --> 00:00:02,400
Hi there.

2
00:00:04,110 --> 00:00:05,520
I'm sure some of you are surprised to see your uncle out of nowhere.

3
00:00:06,180 --> 00:00:08,088
I know some of you were surprised.

4
00:00:09,180 --> 00:00:10,180
It's just for a few hours.

5
00:00:10,680 --> 00:00:11,680
Please bear with me.

6
00:00:13,410 --> 00:00:14,410
This time

7
00:00:14,614 --> 00:00:15,614
I'd like to send you

8
00:00:18,090 --> 00:00:19,680
I don't know if I should say this myself, but...

9
00:00:20,401 --> 00:00:21,401
I'm a little...

10
00:00:23,250 --> 00:00:24,270
I've changed a lot.

11
00:00:25,470 --> 00:00:26,470
My family

12
00:00:27,090 --> 00:00:28,620
It's a story about a coarse house.

13
00:00:30,660 --> 00:00:31,950
I'm late to introduce myself.

14
00:00:33,090 --> 00:00:34,090
I am

15
00:00:34,140 --> 00:00:35,504
The mainstay of our coarse family.

16
00:00:36,240 --> 00:00:37,800
I'm Taku, my father.

17
00:00:38,910 --> 00:00:40,110
As you can see...

18
00:00:40,560 --> 00:00:41,730
I'm not a magazine model

19
00:00:42,270 --> 00:00:43,270
I'm not

20
00:00:45,360 --> 00:00:46,830
I'm just an ordinary office worker

21
00:00:46,830 --> 00:00:47,830
I'm an ordinary businessman.

22
00:00:48,840 --> 00:00:49,840
Washing machine

23
00:00:50,268 --> 00:00:51,268
I don't have a washing machine

24
00:00:52,321 --> 00:00:53,321
I'm not a washer.

Whisper w/ VAD & DeepL (continued):
10
00:00:52,010 --> 00:00:55,610
Because eight years ago, my wife left me for a young man she worked with part-time.

11
00:00:55,610 --> 00:01:04,090
She left me for a young man she was working with.

12
00:01:04,090 --> 00:01:08,850
Since then, I've been the mainstay of our family.

13
00:01:08,850 --> 00:01:14,770
I've raised three children.

14
00:01:14,770 --> 00:01:21,450
I'd like to introduce you to my lovely children. Let's start with my son.

15
00:01:21,450 --> 00:01:28,410
Daisuke, my eldest son, is a college student.

16
00:01:28,410 --> 00:01:33,690
He's more solid than I am.

17
00:01:33,690 --> 00:01:37,970
He's into muscle training. He's been working out a lot lately.

18
00:01:37,970 --> 00:01:43,338
Maybe he has too much power.

19
00:01:43,338 --> 00:01:46,418
Maybe he has too much power.

Pytranscriber (continued):
9
00:00:51,456 --> 00:00:57,600
Because eight years ago, my wife left me for a young man she worked with part-time.

10
00:00:57,856 --> 00:01:04,000
She got tired of him and left him.

11
00:01:04,256 --> 00:01:05,536
Since then, I've been the mainstay of Araike for three years.

12
00:01:05,792 --> 00:01:11,680
I've been the mainstay of Araike's household, raising our three children.

13
00:01:11,936 --> 00:01:14,496
Now then...

14
00:01:15,008 --> 00:01:17,824
I would like to introduce you to my lovely children.

15
00:01:18,336 --> 00:01:24,480
First of all, my son, my eldest, is my favorite, a college student.

16
00:01:24,736 --> 00:01:30,880
He is a serious boy, and I guess he is more solid than I am.

17
00:01:31,136 --> 00:01:33,184
Lately, he's been...

18
00:01:33,440 --> 00:01:35,232
Muscle training, Yamanote...

19
00:01:35,744 --> 00:01:37,536
A typhoon is coming.

20
00:01:37,792 --> 00:01:43,936
Maybe he's got too much power.

VOSK (continued):
25
00:00:53,970 --> 00:00:54,180
8 years ago

26
00:00:54,210 --> 00:00:55,290
years ago, my wife

27
00:00:55,800 --> 00:00:56,070
part time job

28
00:00:56,070 --> 00:00:58,193
I've been working with a lot of young men.

29
00:00:59,670 --> 00:01:00,300
She got tired of me.

30
00:01:00,661 --> 00:01:01,661
She got tired of me.

31
00:01:01,860 --> 00:01:02,860
She left me.

32
00:01:04,140 --> 00:01:05,250
And then I went back to

33
00:01:06,030 --> 00:01:06,750
As the mainstay

34
00:01:06,990 --> 00:01:08,100
As the mainstay

35
00:01:09,120 --> 00:01:09,210
Three...

36
00:01:09,229 --> 00:01:10,229
I've raised three children.

37
00:01:10,410 --> 00:01:11,410
I've raised three children.

38
00:01:13,530 --> 00:01:14,530
And now...

39
00:01:15,000 --> 00:01:17,730
I would like to introduce you to my lovely children.

40
00:01:18,570 --> 00:01:19,920
Let's start from there.

41
00:01:21,630 --> 00:01:22,830
My eldest son Daisuke.

42
00:01:23,340 --> 00:01:24,340
He's a college student, isn't he?

43
00:01:26,100 --> 00:01:27,630
He's very serious.

44
00:01:28,650 --> 00:01:31,200
I guess he's more solid than I am.

45
00:01:31,950 --> 00:01:33,060
What is he doing these days?

46
00:01:33,870 --> 00:01:34,348
muscle training

47
00:01:34,348 --> 00:01:35,348
She's into muscle training.

48
00:01:35,730 --> 00:01:37,440
She seems to be working out a lot.

49
00:01:38,220 --> 00:01:38,591
Maybe he has too much power

50
00:01:38,591 --> 00:01:39,990
Maybe he has too much power.

51
00:01:43,444 --> 00:01:44,640
It might be amazing.

Whisper w/ VAD & DeepL (continued):
20
00:01:46,418 --> 00:01:52,738
Next is Shin-ike, my second and youngest son.

21
00:01:52,738 --> 00:01:59,634
He doesn't go to school.

22
00:02:02,162 --> 00:02:04,962
He's always doing anime and mechanical engineering.

23
00:02:04,962 --> 00:02:08,362
He's a recluse.

24
00:02:08,362 --> 00:02:13,042
I'm a recluse.

25
00:02:13,042 --> 00:02:17,602
I don't get to see him much.

26
00:02:17,602 --> 00:02:23,642
I don't get to see him much, so sometimes I don't know what he's thinking.

Pytranscriber (continued):
21
00:01:44,960 --> 00:01:51,104
Next is Shinji, my second and youngest son.

22
00:01:53,920 --> 00:02:00,064
He doesn't go to school.

23
00:02:00,320 --> 00:02:06,464
Animated? He's always doing that.

24
00:02:06,720 --> 00:02:12,864
A recluse. Me too.

25
00:02:13,120 --> 00:02:17,472
I don't get to see him much.

26
00:02:17,728 --> 00:02:19,520
I don't get to see him much, so I don't know what he's thinking.

27
00:02:19,776 --> 00:02:24,640
I don't know what he's thinking.

VOSK (continued):
52
00:01:47,370 --> 00:01:48,370
Next...

53
00:01:49,230 --> 00:01:50,230
Second son

54
00:01:50,640 --> 00:01:50,910
Check

55
00:01:50,910 --> 00:01:51,000
of

56
00:01:51,327 --> 00:01:52,327
Check it out!

57
00:01:53,880 --> 00:01:54,960
This boy

58
00:01:56,520 --> 00:01:57,870
He doesn't go to school.

59
00:02:02,400 --> 00:02:03,400
Anime

60
00:02:03,750 --> 00:02:03,990
Mecha

61
00:02:04,230 --> 00:02:04,500
or...

62
00:02:05,130 --> 00:02:06,540
That's all she does.

63
00:02:09,085 --> 00:02:10,085
That's nice.

64
00:02:11,550 --> 00:02:12,550
Me too.

65
00:02:13,230 --> 00:02:15,420
I don't know what he's thinking.

66
00:02:17,790 --> 00:02:19,530
I don't know what he's thinking.

67
00:02:20,220 --> 00:02:21,420
I don't know what he's thinking.

Whisper w/ VAD & DeepL (continued):
27
00:02:23,642 --> 00:02:28,722
Finally, there's Noriyuki, my eldest daughter.

28
00:02:28,722 --> 00:02:34,458
Norimo, my eldest daughter between my oldest and second son.

29
00:02:34,458 --> 00:02:38,178
She's the problem child of the new pond.

30
00:02:38,178 --> 00:02:44,138
She never wears clothes. She spends most of her time at home completely naked.

31
00:02:44,138 --> 00:02:49,298
She also can't study.

32
00:02:49,298 --> 00:02:54,394
She's also not good at studying, and she's not in school.

33
00:02:54,394 --> 00:02:58,514
What do you call a "gal"?

34
00:02:58,514 --> 00:03:02,554
I've been an impatient kid since childhood.

35
00:03:02,554 --> 00:03:06,146
Maybe I'm sentimental.

36
00:03:06,146 --> 00:03:09,506
This time, this Rima...

37
00:03:09,506 --> 00:03:15,706
She's causing all kinds of problems.

38
00:03:15,706 --> 00:03:19,706
I was covered in sweat with my eldest daughter.

39
00:03:19,706 --> 00:03:23,866
A slightly unusual memory of a summer in Shin-ike

40
00:03:23,866 --> 00:03:30,054
Please take a look at 2022

Pytranscriber (continued):
28
00:02:25,152 --> 00:02:31,296
I'm Rima, the eldest between two sons.

29
00:02:31,552 --> 00:02:37,696
She's the problem child of Araike.

30
00:02:37,952 --> 00:02:44,096
She never wears clothes and spends most of her time at home completely naked.

31
00:02:44,352 --> 00:02:50,496
She can't even study... and she's not in school.

32
00:02:54,592 --> 00:03:00,736
She's a gal? Also, for some reason, she's always sweaty since she was a kid.

33
00:03:02,528 --> 00:03:05,344
Is she expensive?

34
00:03:05,856 --> 00:03:12,000
This time, Lima is causing all kinds of trouble.

35
00:03:12,256 --> 00:03:18,400
Let's see... naked and covered in sweat with my eldest daughter.

36
00:03:18,656 --> 00:03:20,960
A little strange.

37
00:03:21,216 --> 00:03:23,520
Memories of Summer in Araike

38
00:03:23,776 --> 00:03:29,920
Please take a look at 2022.

VOSK (continued):
68
00:02:23,880 --> 00:02:24,880
Finally.

69
00:02:25,800 --> 00:02:26,800
The first son

70
00:02:27,180 --> 00:02:28,180
The second son

71
00:02:28,890 --> 00:02:29,890
The eldest daughter.

72
00:02:29,945 --> 00:02:30,945
Now.

73
00:02:32,790 --> 00:02:34,020
This is Koga.

74
00:02:34,650 --> 00:02:36,210
She's the problem child of the family.

75
00:02:36,990 --> 00:02:37,990
First of all...

76
00:02:38,220 --> 00:02:38,640
Clothes.

77
00:02:38,790 --> 00:02:39,790
No clothes.

78
00:02:39,960 --> 00:02:42,480
She spends most of her time at home completely naked.

79
00:02:44,400 --> 00:02:45,400
Also...

80
00:02:45,810 --> 00:02:46,810
I don't know how to say it.

81
00:02:47,430 --> 00:02:48,690
I can't even study.

82
00:02:49,500 --> 00:02:51,390
I didn't finish school.

83
00:02:53,283 --> 00:02:54,283
I'm what's called

84
00:02:54,660 --> 00:02:54,960
Gyaru

85
00:02:54,960 --> 00:02:55,960
I guess you could say

86
00:02:56,790 --> 00:02:57,840
I don't know why.

87
00:02:58,710 --> 00:03:01,200
I've come all this way since I was a kid.

88
00:03:02,550 --> 00:03:04,290
I wonder if she has hyperhidrosis.

89
00:03:06,120 --> 00:03:07,120
This time

90
00:03:07,530 --> 00:03:07,830
This

91
00:03:08,040 --> 00:03:09,040
is

92
00:03:10,800 --> 00:03:12,030
It's going to be a problem.

93
00:03:14,010 --> 00:03:15,207
Well then...

94
00:03:15,750 --> 00:03:16,890
Naked eldest daughter and

95
00:03:18,089 --> 00:03:19,089
covered in

96
00:03:19,800 --> 00:03:20,940
A little strange.

97
00:03:21,570 --> 00:03:23,190
Memories of summer in a rough house

98
00:03:23,990 --> 00:03:24,990
round

You're right, Whisper VAD and PyTranscriber are better than VOSK, and the two are pretty close to each other. The best approach, although the most tedious, is to use all 3, cross-reference them, and replace the bad translations in one with the good ones from the others.
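
For anyone who wants to cross-reference the three outputs without flipping between files, here is a rough Python sketch that lines the cues up by time. It is only an illustration: `parse_srt`, `side_by_side`, and the 10-second `window` are names and values I made up, and it assumes plain, well-formed SRT text.

```python
import re
from collections import defaultdict

TIME = re.compile(r"(\d+):(\d+):(\d+),(\d+)")

def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:04,090 to seconds."""
    h, m, s, ms = map(int, TIME.match(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000

def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, text) cues."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        # A cue needs an index line, a timing line, and at least one text line.
        if len(lines) < 3 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->")
        cues.append((to_seconds(start), to_seconds(end), " ".join(lines[2:])))
    return cues

def side_by_side(tracks, window=10.0):
    """Group cues from several tools into fixed time windows.

    tracks maps a tool name to its parsed cues. The result is a list of
    (window_start_seconds, {tool_name: joined_text}) rows in time order.
    """
    rows = defaultdict(lambda: defaultdict(list))
    for name, cues in tracks.items():
        for start, _end, text in cues:
            rows[int(start // window)][name].append(text)
    return [
        (bucket * window, {n: " / ".join(t) for n, t in by_tool.items()})
        for bucket, by_tool in sorted(rows.items())
    ]

if __name__ == "__main__":
    whisper = parse_srt("1\n00:00:00,826 --> 00:00:05,946\nHello, there.\n")
    vosk = parse_srt("1\n00:00:01,200 --> 00:00:02,400\nHi there.\n")
    for start, by_tool in side_by_side({"whisper": whisper, "vosk": vosk}):
        print(start, by_tool)
```

Fixed windows are a crude way to align, since the tools split sentences differently, but they are enough to read the three versions of each passage next to each other and pick the best line by hand.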

I did some tests too with Whisper VAD, using the first 20 min of DVDMS-184. I used chunk3, chunk2, and chunk1, and ran each of them 3 times. Chunk 3 and 2 seem to be slightly better at translation.

Issues:
1. In some parts of the video where the voice isn't detected properly, the AI makes up the translation, and it can be VERY DIFFERENT on each run.
2. I still get the "Thank you for watching!" line sometimes.
3. Sometimes a random single Chinese/Korean character appears inside an English word, but that's fairly rare.
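
The issues above can at least be flagged automatically so you know where to look. This is a minimal Python sketch, not anything Whisper itself provides: `flag_cues`, the phrase list, and the character ranges are my own assumptions, and it only marks lines for manual review rather than fixing them.

```python
import re

# Phrases reported in this thread as made-up output; extend as needed (assumption).
KNOWN_HALLUCINATIONS = ("thank you for watching",)

# Hiragana, katakana, CJK ideographs, and Hangul syllables.
CJK = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]")

def flag_cues(texts):
    """Return (index, reason) pairs for subtitle lines worth checking by hand."""
    flags = []
    previous = None
    for i, text in enumerate(texts):
        lowered = text.lower().strip()
        if any(phrase in lowered for phrase in KNOWN_HALLUCINATIONS):
            flags.append((i, "known hallucination"))
        if CJK.search(text):
            flags.append((i, "stray CJK character"))
        # The same line twice in a row is often a repeat produced by the model.
        if previous is not None and lowered == previous:
            flags.append((i, "repeats previous line"))
        previous = lowered
    return flags

if __name__ == "__main__":
    sample = [
        "Maybe he has too much power.",
        "Maybe he has too much power.",
        "Thank you for watching!",
    ]
    for index, reason in flag_cues(sample):
        print(index, reason)
```

It won't catch a made-up translation that reads like a normal sentence (issue 1), so comparing several runs of the same passage is still needed for those.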


If I could understand Japanese, I could really assess the translation better. But I don't understand any Japanese besides yamete, hazukashi, iku iku.

I have an idea: I'm gonna try this later with a JAV that is already properly translated, so I can compare which chunk size is better for Whisper VAD. I don't have a Japanese VPN for pyTranscriber.
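
The comparison against a known-good translation could be scored roughly like this. It is a sketch under assumptions: `similarity` and `rank_settings` are hypothetical helper names, the setting labels are just whatever you call your runs, and character-level similarity from Python's `difflib` is only a crude proxy for translation quality.

```python
from difflib import SequenceMatcher

def normalise(text):
    """Lowercase and collapse whitespace so formatting does not affect the score."""
    return " ".join(text.lower().split())

def similarity(candidate, reference):
    """Ratio in [0, 1]; 1.0 means the texts match after normalisation."""
    return SequenceMatcher(None, normalise(candidate), normalise(reference)).ratio()

def rank_settings(outputs, reference):
    """Sort setting names by similarity to the reference text, best first.

    outputs maps a setting label (for example "chunk3") to the full
    translated text produced with that setting.
    """
    scores = {name: similarity(text, reference) for name, text in outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    reference = "She got tired of me and left me."
    outputs = {
        "chunk1": "She got tired of him and left him.",
        "chunk3": "She got tired of me and left me.",
    }
    for name, score in rank_settings(outputs, reference):
        print(name, round(score, 3))
```

Since the output changes from run to run, it would make sense to score each of the 3 runs per chunk size and look at the spread, not just a single number.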
 

KingofBugs

Active Member
Aug 31, 2022
64
101
Question for some more experienced subbers. I want to clean up some subtitles I have collected over the years and then post them here. Is there a program y'all use to streamline this, or do you just open the srt file in Notepad and edit it there?
 