I don't see VAD (Voice Activity Detection) in that version. Installing
https://colab.research.google.com/github/ANonEntity/WhisperWithVAD/blob/main/WhisperWithVAD.ipynb locally was a hassle, but it was worth it. There was a lot to install: CUDA and cuDNN (I think these two are already installed if you have used javplayer before), PyTorch from
https://pytorch.org, and
these Python packages:
pip install deepl srt ffmpeg-python
pip install git+https://github.com/openai/whisper.git
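If you want to confirm the installs went through before starting a long run, here is a rough sketch of a check. It is not part of the notebook or my script. The import names are my assumption about how the packages above are imported (ffmpeg-python imports as ffmpeg, the whisper repo as whisper), and missing_modules and cuda_available are made-up helper names.

```python
import importlib.util

# Assumed import names for the packages installed above.
# Note: the pip package "ffmpeg-python" is imported as "ffmpeg".
REQUIRED = ["torch", "whisper", "deepl", "srt", "ffmpeg"]


def missing_modules(names=REQUIRED):
    """Return the module names that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def cuda_available():
    """Return True only if PyTorch is installed and can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


if __name__ == "__main__":
    print("Missing modules:", missing_modules() or "none")
    print("CUDA usable by PyTorch:", cuda_available())
```

If CUDA shows as not usable, Whisper should still run on the CPU, just much slower.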
I have Python 3.9.13 and no compatibility issues. I can't say whether Python 3.10 and newer have issues with packages like numpy, numba, etc. On my local PC, I changed the code so that it can queue up multiple audio paths. I can leave my PC on when I'm not home and Whisper moves on to the next file. Here is the modified script:
https://drive.google.com/file/d/1z1YX-YVgTZ2LoGDxmcK3BSYnZrWpSLYg/view?usp=share_link. I will explain the changes. All the VAD settings are now global variables, except audio_path, which is a parameter. I made a function called use_whisper and indented all the previous code into it. To queue up multiple audio files, just copy and paste use_whisper("your_path_to_audio_file") at the end (there is commented-out code as an example). I also added more words to the garbage list, so sounds like crackling and crying, and hallucinations such as "It's in my belly" and "thank you for dinner", get deleted. To run, type python whisper-vad-function.py
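For anyone who would rather loop over a list than copy and paste the call, here is a minimal sketch of the queue idea. It is not the code in the linked script: the use_whisper body below is only a stand-in, run_queue is a made-up helper, and the file names are placeholders. The try/except is my addition so that one bad file does not stop the rest of the queue while you are away.

```python
import traceback


def use_whisper(audio_path):
    # Stand-in for the real function in whisper-vad-function.py,
    # which runs VAD and Whisper and writes the subtitles.
    print(f"Processing {audio_path}")


def run_queue(audio_paths, worker=use_whisper):
    """Process each path in order; log any failure and keep going."""
    failed = []
    for path in audio_paths:
        try:
            worker(path)
        except Exception:
            # Print the error but move on to the next file.
            traceback.print_exc()
            failed.append(path)
    return failed


if __name__ == "__main__":
    # Placeholder paths; replace them with your own audio files.
    leftovers = run_queue(["first_audio.mp3", "second_audio.mp3"])
    if leftovers:
        print("Failed:", leftovers)
```

The list of failed paths is returned so you can see which files need another try when you get back.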