Whisper and its many forms

Saqnin · Jan 16, 2025

SamKook said:
I went back to the original whisperwithVAD that didn't force a specific version, and it seems that after installing a few extra packages, it works fine.

I manually added a new line of code to the setup whisper block before running it to install the packages it was saying were missing:

Code:

!pip install ffmpeg-python srt deepl

It should also work to copy the setup code from that original, add the line I mentioned, and use it to replace the setup code in the maintenance version (or whichever other version you use), since those versions have a few quality-of-life additions and may be preferred.

Where exactly do I add this? Please, could someone help me or fix it?

SamKook · Jan 16, 2025

Just press "Show code" for the "Setup Whisper" part and enter it in there:

... looking at that screenshot, they were already installed in theory. I guess the whisper install messes them up, so the original line that installs them could just be moved to the end instead.

Edit: It seems that spleeter is causing the problem (and preventing the rest of the packages on that line from being installed). Simply deleting spleeter seems to work fine, which makes the code look like this:

Not sure why or if it's important, but I get an srt with just deleting that.

Saqnin · Jan 16, 2025

SamKook said:
Just press "Show code" for the "Setup Whisper" part and enter it in there:
View attachment 3613296

... looking at that screenshot, they were already installed in theory. I guess the whisper install messes them up, so the original line that installs them could just be moved to the end instead.

Edit: It seems that spleeter is causing the problem (and preventing the rest of the packages on that line from being installed). Simply deleting spleeter seems to work fine, which makes the code look like this:
View attachment 3613331
Not sure why or if it's important, but I get an srt with just deleting that.

Okay, it worked! Thanks! This is the old version, so the result is not as good as with the latest changes (synchronisation problems and so on), but it's still something until it's fixed.

mei2 · Jan 16, 2025

I have updated WhisperWithVAD_PRO. It should work now.
Let me know if you get any errors.

DocNic · Jan 16, 2025

SamKook said:
Seems the version of torch that was forced(1.12.1) isn't available anymore.

Edit: See this post for a temporary fix: https://www.akiba-online.com/threads/whisper-and-its-many-forms.2142559/post-4870014

Hi Sam, I tried deleting the spleeter and that didn't work, neither did adding the line you provided.

I could be doing something wrong, but I've tried deleting the word "spleeter", then deleting the whole line that "spleeter" is on, and then adding the line you quoted. None of that worked.

SamKook · Jan 16, 2025

mei2 fixed the VADpro one so just use that instead.

If you don't want to, I'd need to know what error you're getting.

DocNic · Jan 16, 2025

Hello mei2 and SamKook, I've used mei2's updated VADpro, I'm getting the following error:

"The error message NameError: name 'source_separation' is not defined indicates that the variable source_separation is being used in your code before it has been assigned a value. Looking at your code, it seems likely that the source_separation variable was intended to be defined in the "Whisper Transcription Parameters" section, but it is missing. This causes the error to occur when the code reaches the line if source_separation: as it doesn't know what source_separation refers to."

I'm uploading the individual audio files and copying them into the audio path.

I really appreciate all the help!

SamKook · Jan 16, 2025

That means it doesn't see the variables declared in the first block "Whisper Transcription Parameters" so just make sure you execute it. Try executing it again if you did but still got the error.

If that fails, you can read from here how to get around that particular problem by copying that code block elsewhere: https://www.akiba-online.com/thread...not-a-sub-request-thread.1466451/post-4860978
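
For anyone hitting the same NameError, here is a minimal sketch of a guard that could be pasted at the top of the "Run Whisper" block to get a clearer message. Only `source_separation` comes from the error above; the helper name `missing_parameters` is made up for illustration and is not part of the colab.

```python
def missing_parameters(required, namespace):
    """Return the names in `required` that are not defined in `namespace`."""
    return [name for name in required if name not in namespace]


# In a notebook, globals() holds every variable defined by the cells run so far.
# "source_separation" is the variable named in the error message above.
missing = missing_parameters(["source_separation"], globals())
if missing:
    print('Run the "Whisper Transcription Parameters" block first; undefined:', missing)
```

If the list comes back empty, the parameters block was executed in the current session and the NameError has another cause.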

Saqnin · Jan 16, 2025

I'm finding it difficult to use VADPro; something is not working. Sam, could you please help me, in detail and with screenshots if possible? Otherwise I'll have to keep using the old version of VAD...

SamKook · Jan 16, 2025

The only difference is the first block, which you need to execute to save the options you choose (you can also just leave everything on default), so press the play icon on it as you do for all the other code blocks.

The rest works the same as the original. If you still have trouble that my answer to DocNic above doesn't fix, then just provide the error message here and we'll see how to fix it.

DocNic · Jan 16, 2025

SamKook said:
That means it doesn't see the variables declared in the first block "Whisper Transcription Parameters" so just make sure you execute it. Try executing it again if you did but still got the error.

If that fails, you can read from here how to get around that particular problem by copying that code block elsewhere: https://www.akiba-online.com/thread...not-a-sub-request-thread.1466451/post-4860978

SamKook you are a God! Thank you so much for the help, it's working now.

Saqnin · Jan 16, 2025

SamKook said:
The only difference is the first block, which you need to execute to save the options you choose (you can also just leave everything on default), so press the play icon on it as you do for all the other code blocks.

The rest works the same as the original. If you still have trouble that my answer to DocNic above doesn't fix, then just provide the error message here and we'll see how to fix it.

My problem is uploading the audio file via Google Drive. Somehow it doesn't work for me, even though I made a folder for that field. Maybe I'm doing something wrong somewhere, but it's not working. Where exactly does this code block go?

SamKook · Jan 16, 2025

Where does it differ when following the tutorial on how to upload using google drive in the first post of this thread? I've only ever used that option to make the tutorial so maybe it changed since or you're running into a problem I didn't.

You don't need to put the code block anywhere; it's already there at the top of the pro colab and you just have to make sure to execute it (which I've been guilty of forgetting to do).

Unless you mean on the original colab, then just look at the screenshot on this post: https://www.akiba-online.com/threads/whisper-and-its-many-forms.2142559/post-4870044
That was just some quick debugging late last night and I was posting as I was testing.
What mei2 did in the pro version is the correct fix: putting spleeter on its own line, so that it doesn't get in the way if it fails, and if it works, great.

Saqnin · Jan 17, 2025

It works. But why is the process so slow for a 30-minute, 28 MB file?

SamKook · Jan 17, 2025

It has different default settings than the other colab.

If you look at the notes on the bottom:

Code:

Settings for large-v2 higher quality output

(Note 1: it increases the execution time)

So better default for quality but the tradeoff is it's slower.

The test I did took 1h53m for a 3 hour movie leaving it all on default.
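
As a rough back-of-the-envelope estimate from that one test, assuming processing time scales linearly with audio length (that is an assumption; the GPU you get assigned and how much speech is in the file will change it):

```python
def estimate_processing_minutes(audio_minutes, ref_audio_minutes=180, ref_processing_minutes=113):
    """Scale the single reference run (3 h movie processed in 1h53m) to another length."""
    return audio_minutes * ref_processing_minutes / ref_audio_minutes


# Estimate for a 30-minute file using the default settings of the pro colab.
print(round(estimate_processing_minutes(30), 1))  # 18.8
```

So a 30-minute file would land somewhere around 19 minutes with these defaults, if the scaling holds.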

Chuckie100 · Jan 17, 2025

SamKook said:
I went back to the original whisperwithVAD that didn't force a specific version, and it seems that after installing a few extra packages, it works fine.

I manually added a new line of code to the setup whisper block before running it to install the packages it was saying were missing:

Code:

!pip install ffmpeg-python srt deepl

It should also work to copy the setup code from that original, add the line I mentioned, and use it to replace the setup code in the maintenance version (or whichever other version you use), since those versions have a few quality-of-life additions and may be preferred.

Edit: It seems that simply deleting "spleeter" from the code fixes things instead of adding the line since it prevents those 3 from being installed.

I tried deleting "spleeter" from the code before I ran it, but still no joy. Do you have the original setup code?

SamKook · Jan 17, 2025

To be clear, the "delete spleeter" fix is for the original WhisperWithVAD by ANonEntity, not the maintenance release by mei2 that people have been using for a long time now. That maintenance release has the additional issue of forcing a torch version that is too old, so it needs further modifications.

Since mei2 has fixed the pro version, you can simply use that, or copy the "Setup Whisper" code from it and paste it over the "Setup Whisper" code of any other version to fix that one.

Don't forget to press the play icon after you make a modification to run the code again, or it will have no effect.
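
After re-running "Setup Whisper", a quick check like the one below can show whether the modules are importable before you start a long transcription. It is only a sketch using the standard library; the module list is taken from the errors reported in this thread (`ffmpeg` is the import name that the ffmpeg-python package provides), and the helper name `missing_modules` is made up for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules this thread reports as missing when running the last block.
print(missing_modules(["ffmpeg", "srt", "deepl", "torch"]))
```

An empty list means all four can be imported; anything printed still needs to be installed.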

If you don't want to copy the code, you can also fix them directly these ways:

You will see the errors when executing the "Run Whisper" code block but all modifications need to happen in the "Setup Whisper" code block.

To fix a missing "ffmpeg", "srt" or "deepl" module error when executing the last block, put spleeter on its own line (applies to both the original and the maintenance release).
That means changing this line of the "Setup Whisper" code block

Code:

!pip install deepl srt ffmpeg-python spleeter

into

Code:

!pip install deepl srt ffmpeg-python
!pip install spleeter
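
The same idea can be sketched in plain Python: install each package with its own pip call so that one failure (spleeter, in this thread) does not stop the others. This is an illustration, not what the colab does; the function name `install_each` and the `run` parameter are made up here, and `run` only exists so the logic can be tried without installing anything.

```python
import subprocess
import sys


def install_each(packages, run=subprocess.run):
    """Install packages one at a time and return the ones that failed."""
    failed = []
    for package in packages:
        # Equivalent to "pip install <package>" using the current interpreter's pip.
        result = run([sys.executable, "-m", "pip", "install", package])
        if result.returncode != 0:
            failed.append(package)
    return failed


# Example call (commented out so nothing gets installed by accident):
# print(install_each(["deepl", "srt", "ffmpeg-python", "spleeter"]))
```

The returned list tells you which packages failed, so a broken spleeter install shows up without hiding the other three.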

To fix a missing "torch" module error when executing the last block, remove the specific versions for torch, triton and tiktoken (applies to the maintenance release).
That means changing this line

Code:

!pip install --no-cache-dir  torch==1.12.1 torchvision torchaudio torchtext torchdata triton==2.0.0 tiktoken==0.3.3

into this line

Code:

!pip install --no-cache-dir  torch torchvision torchaudio torchtext torchdata triton tiktoken
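
If you would rather not edit the line by hand, the change is small enough to sketch in Python. The helper name `unpin` is made up for illustration; it just strips the `==version` part from the packages you name and leaves everything else on the line alone.

```python
import re


def unpin(line, names):
    """Remove "==version" from the given package names in a pip install line."""
    pattern = r"(?<![\w-])(" + "|".join(re.escape(n) for n in names) + r")==\S+"
    return re.sub(pattern, r"\1", line)


old = "!pip install --no-cache-dir  torch==1.12.1 torchvision torchaudio torchtext torchdata triton==2.0.0 tiktoken==0.3.3"
print(unpin(old, ["torch", "triton", "tiktoken"]))
```

Keep in mind that an unpinned install pulls whatever versions are current at the time, so behaviour may differ from when the notebook was written.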

Electromog · Jan 18, 2025

Regular whisper takes about 15 minutes per hour of video on my computer. How much faster would other versions be and how much worse are the subs they make? I'd like to speed things up a bit but not if it means a big loss of quality.

Edit: This is with the large V2 model, not sure whether some models work better with some versions of whisper or not.

Chuckie100 · Jan 18, 2025

SamKook said:
I went back to the original whisperwithVAD that didn't force a specific version, and it seems that after installing a few extra packages, it works fine.

I manually added a new line of code to the setup whisper block before running it to install the packages it was saying were missing:

Code:

!pip install ffmpeg-python srt deepl

It should also work to copy the setup code from that original, add the line I mentioned, and use it to replace the setup code in the maintenance version (or whichever other version you use), since those versions have a few quality-of-life additions and may be preferred.

Edit: It seems that simply deleting "spleeter" from the code fixes things instead of adding the line since it prevents those 3 from being installed.

I finally got it to work. Thanks Sam.

SamKook · Jan 18, 2025

Electromog said:
Regular whisper takes about 15 minutes per hour of video on my computer. How much faster would other versions be and how much worse are the subs they make? I'd like to speed things up a bit but not if it means a big loss of quality.

Edit: This is with the large V2 model, not sure whether some models work better with some versions of whisper or not.

To speed things up, I'd look into whispercpp, it should be faster without any quality loss.

With that said, I've never used it so I don't know for sure and I've always found comparing quality to be very difficult with whisper since it never gives the same result twice, even with the same settings.
