
Thanks, it works nicely on my i7 with a 4060 Ti GPU. I ended up going with large-v2. It takes anywhere between 4 and 8 minutes. Here's an example:

Edit: on the latest version, the standalone .7z file comes with a One Click Transcribe.bat file. You can now drag and drop a file, multiple files, or a folder onto this shortcut for automatic one-click transcription, so you can skip steps 3, 4, and 5 below.
If you're on Windows, here is a standalone version:
- Go to Faster Whisper XXL Releases and download the latest Windows version.
- Extract the .7z file and navigate to the folder: Faster-Whisper-XXL.
- Place your movie file inside this folder.
- In the folder, type cmd in the address bar and press Enter.
- In the command prompt, type the following command:
faster-whisper-xxl.exe "C:\yourmediafileslocation\abc-123.mp4" --language ja --model tiny --task translate --output_format srt
- This will start the process using your CPU.

For NVIDIA GPU Acceleration:
- Download the required CUDA files from CUDA Dependencies for Faster Whisper and extract all contents into the same Faster-Whisper-XXL folder.
The files are:
- cublas64_11
- cublasLt64_11
- cudnn_cnn_infer64_8
- cudnn_ops_infer64_8
- zlibwapi
- Run the same command as above and the process will start with CUDA. If it doesn't, add --device cuda to the command:
faster-whisper-xxl.exe "C:\yourmediafileslocation\abc-123.mp4" --language ja --model tiny --task translate --output_format srt --device cuda

Additional Information:
For more details about arguments and options, type the following in the command prompt:
faster-whisper-xxl.exe -h

This should get you started. For more accurate results, you might want to try the medium or large model with a capable GPU.
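The One Click Transcribe.bat mentioned earlier essentially runs this same command for every file dropped onto it. Here's a minimal Python sketch of that idea; the exe name and flags are taken from the commands above, but the folder-scanning behavior (and limiting it to .mp4 files) is my own assumption, not how the actual .bat works:

```python
import subprocess
import sys
from pathlib import Path

def build_command(media_file, model="tiny", language="ja"):
    """Build the faster-whisper-xxl command line for one media file."""
    return [
        "faster-whisper-xxl.exe",
        str(media_file),
        "--language", language,
        "--model", model,
        "--task", "translate",
        "--output_format", "srt",
        # Forces CUDA; per the steps above, only needed if it
        # doesn't pick up the GPU automatically.
        "--device", "cuda",
    ]

if __name__ == "__main__":
    # Each argument is a file or folder dropped onto the script.
    for arg in sys.argv[1:]:
        path = Path(arg)
        # Illustrative: only scans folders for .mp4 files.
        files = path.rglob("*.mp4") if path.is_dir() else [path]
        for f in files:
            subprocess.run(build_command(f), check=True)
```

Dropping a folder onto a shortcut to this script would transcribe every .mp4 inside it, one after another.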
Which colab uses this?

The colab uses a separate VAD system to handle speech detection better than Whisper alone, and it also splits the audio into many small parts, which helps with hallucination.
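The splitting described above can be illustrated with a small helper that cuts a recording's duration into short overlapping windows. The window and overlap lengths here are made-up illustration values, not the colab's actual settings:

```python
def chunk_spans(total_seconds, chunk_len=30.0, overlap=2.0):
    """Return (start, end) spans covering total_seconds.

    Short chunks give Whisper less room to hallucinate, and a small
    overlap avoids cutting a word exactly at a boundary.
    """
    spans = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_len, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            break
        start = end - overlap  # back up so chunks overlap slightly
    return spans
```

Each span would then be transcribed independently, and the resulting subtitles merged back with their original timestamps.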
For Faster Whisper XXL, you can specify the VAD method to use with the --vad_method option.

I have a good GPU (6 GB VRAM), but I'm using Whisper on my computer via the command prompt.
e.g. whisper --model large --language ja --task translate "C:\Users\user\Downloads\Video\1ha.mp3"
I want to use the VAD system instead; do you have a script or a tutorial for that?
It seems so much more accurate. I have a favour to ask: is there any way you can teach me how to set it up, or the commands?
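For context, a VAD just marks which stretches of audio contain speech, so Whisper is only fed speech segments. The colab uses a trained model for this, but the core idea can be sketched with a naive energy threshold; the frame size and threshold below are arbitrary illustration values, not something a real VAD would use:

```python
import math

def energy_vad(samples, frame_size=400, threshold=0.02):
    """Naive VAD: flag a frame as speech when its RMS energy
    exceeds a fixed threshold.

    Real VADs (e.g. Silero, WebRTC) use trained models instead of a
    fixed threshold, but the output shape is the same: one
    speech/non-speech flag per frame.
    """
    flags = []
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        flags.append(rms > threshold)
    return flags
```

Consecutive True frames would then be merged into speech segments and passed to Whisper, with the silent gaps dropped.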
This is great. It's a bit off in the highlighted text. To be fair, this part is always problematic. It's slightly better on XXL, but even there, it struggles with getting "he/she" correct. I tested WhisperJav as it is, without any additional settings or using DeepL. Do you mind sharing your settings? Thanks for the site!
These are two vastly different Whisper implementations. WhisperJav 0.7 has a lot more clean-up built in, which makes the immediate results seem better, but the overall precision is lower due to the use of faster-whisper. Yes, it is faster and comes out of the colab more cleaned up, but quality-wise, the slower and slightly older WhisperPro is more precise. In the end, it's just your preference.
Is the WhisperPro you refer to this: "...quality wise, the slower and slightly older WhisperPro is more precise"?