Whisper and its many forms

Is anyone else having a problem with the runtime constantly dropping? It usually takes over an hour (it used to be 30 min) to do a sub. Sometimes it's so bad that I can't even upload a file, since it keeps losing connection. I'm using VADPro.
 
If you're comparing VADPro with the other VADs, then yes, it's normal since it has different defaults which should provide a better result, at the price of taking longer.

If you're comparing to VADPro from some time ago, I don't use it often enough to tell if anything changed.

It's also possible you got a different GPU type, since there are a few that can be assigned to you and they have different performance.
 
The only way I know to get a different one is to disconnect, reconnect to a new session, and pray. But every time I've used the Colab (as a free user, paid may be different), I've only ever got a T4, so I don't know how likely it is that you get a different one.
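
If you want to see which GPU a session actually gave you before starting a long run, something like this in a notebook cell should work. It's only a sketch: the function name `assigned_gpu_names` is made up for this example, and it assumes the `nvidia-smi` tool is on the PATH, which is normally the case on a GPU runtime but not on a CPU-only one.

```python
import shutil
import subprocess
from typing import List


def assigned_gpu_names() -> List[str]:
    """Return the names of the NVIDIA GPUs visible to this session (empty if none)."""
    # nvidia-smi is missing on CPU-only runtimes, so check for it first.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # One GPU name per line of output.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


names = assigned_gpu_names()
print(names if names else "No NVIDIA GPU visible in this session")
```

If the printed name is not what you hoped for, disconnecting and reconnecting is still the only way I know to try for a different one.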
 
Hello all,

FYI, an updated version of DeepSeek V3 has been released on the website and API.

I used the new version with my go-to test subtitle, DDB-271 transcribed using WhisperWithVAD Pro, and I think the updated version does better translation.

I've attached a zip with the transcribed SRT and two translated SRTs: one translated using the original DeepSeek V3 and one using the updated V3-0324. Neither translated version has had any manual editing or cleanup.

Here is the changelog:
 


Hello all, I'm using WhisperWithVAD_PRO and getting the following error:
"ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject"

It feels like the same issue as when they changed the version of torch.
 

As a workaround for now, you can comment out the spleeter installation line (add a # at the beginning of the line to disable it) in the "setup whisper" section (click "show code" to edit). It should look like this:
Disable_spleeter.jpg

You need to run the code (press the play button) after you've done this. If you've already run the code, you'll need to run it again.

Spleeter is used to separate the voice from the background noise.
It's not used by default, so unless you set "source_separation" to True (meaning you check the checkbox next to it), disabling the install won't change anything. If you do enable it after disabling the spleeter install, it will give an error.
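
To avoid hitting that error by accident, you could check whether spleeter is importable before relying on the checkbox. This is a rough sketch and not code from the notebook: the helper `can_use_source_separation` is hypothetical, and the `source_separation` variable below just stands in for the notebook's checkbox value.

```python
import importlib.util


def can_use_source_separation(source_separation: bool,
                              module_name: str = "spleeter") -> bool:
    """Return True only if the option is checked AND the module is installed."""
    if not source_separation:
        # Option is off, so spleeter is never needed.
        return False
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


source_separation = True  # stand-in for the notebook checkbox (assumption)
if source_separation and not can_use_source_separation(source_separation):
    print("spleeter is not installed: uncheck source_separation "
          "or re-enable the spleeter install line and rerun setup.")
```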
 

Upgrading numpy, pandas and tensorflow also seems to resolve the error. Not sure if it's actually necessary to upgrade all of them.

Like so:
1743779437715.png
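
If you want to see what you actually have installed before and after upgrading, a small check like the one below can help narrow down which package is the mismatched one. It's a sketch under assumptions: `installed_versions` is a made-up helper name, and the package list is just the ones mentioned in this thread. It only reads package metadata, so it doesn't import numpy and won't trigger the dtype error itself.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if not installed."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages mentioned in this thread; adjust the list as needed.
for name, version in installed_versions(
        ["numpy", "pandas", "tensorflow", "spleeter", "torch"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Run it once before the upgrade and once after (restart the runtime in between if Colab asks you to) and compare the output.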