Yes, MkvToolNix GUI is your friend.
Is there a way to take a video file, then make an audio-only file from it? I mean, if it won't accept video files, that would be the only solution, right?
The difference is not big and is likely unnoticeable in most cases, but converting instead of extracting could hurt recognition and almost certainly won't help.
Is there a significant difference in extracting the mp3 rather than converting when running the audio file to whisper?
A very expensive Nvidia gaming GPU (an RTX 3080 in my case) is what most people would have that works well enough for it, but if you were buying one just to do AI stuff, you'd get a datacenter one like the 3 they mention on the colab page.
hahaha no way. I'm so glad it was easy for you.
But my computer is kinda old, so I do wonder what graphics card will make it run better on install?
Is there a way to take a video file, then make an audio only file from it? I mean, if it won't accept video files, that would be the only solution, right?
From the screenshot, his video is an mp4, so I don't think mkvtoolnix works for that. At least it didn't before, only for mkv.
Yes, MkvToolNix GUI is your friend.
Yes. Mkvtoolnix works for all my needs. I literally don't use anything else lol
From the screenshot, his video is an mp4, so I don't think mkvtoolnix works for that. At least it didn't before, only for mkv.
Multiple people mentioned some, so just pick one. I use mp4box with a separate GUI, but the new version complicates things, so it's not the easiest to install.
Edit: Googling "demux mp4" will get you plenty of options, but here's a tutorial to do it with vlc which you likely already have: https://bartman88.blogspot.com/2018/04/how-to-extract-or-demux-audio-or-video.html
Edit2: Oh, you can do it with mkvtoolnix, nice. Can't put anything inside an mp4 but you can take it out. Well, not quite take it out but put it inside an mka which is a mkv with only audio so should do the trick.
Proper audio for Whisper (no filtering):
You gotta be careful since extracting (aka demuxing or demultiplexing) audio isn't the same as converting (aka re-encoding) audio to a specific format. Both give you a standalone audio file as a result (so you can technically say re-encoding it is extracting it too), so the difference might not be obvious to those unfamiliar with how these things work.
The former doesn't change the audio in any way and is basically instantaneous or takes a couple of seconds: it just takes the audio part out of the container file that holds both video and audio. The latter takes that audio part and completely re-encodes it, to a different format or even the same one, to create an entirely new file, and it can take a few seconds or minutes depending on your PC.
I think VLC can do both of those things, but the first one is better since re-encoding means quality loss (unless it's to a lossless format like wav, but those files are huge).
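To make the extract vs. convert difference concrete, here's a rough Python sketch that only builds the two ffmpeg command lines, using the same flags that show up elsewhere in this thread. The function names and the movie.mp4 file names are made up for illustration, and it assumes ffmpeg is on your PATH if you actually run the commands.

```python
import shlex
from typing import List


def extract_cmd(src: str, dst: str) -> List[str]:
    """Demux: drop the video (-vn) and copy the audio stream untouched."""
    return ["ffmpeg", "-i", src, "-vn", "-acodec", "copy", dst]


def convert_cmd(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Re-encode: decode the audio and write a new file at the given sample rate.

    Without "-acodec copy", ffmpeg picks an encoder from the output extension,
    so the audio is decoded and encoded again.
    """
    return ["ffmpeg", "-i", src, "-vn", "-ar", str(sample_rate), dst]


if __name__ == "__main__":
    # Hypothetical file names, printed as copy-pasteable shell commands.
    print(shlex.join(extract_cmd("movie.mp4", "movie.aac")))
    print(shlex.join(convert_cmd("movie.mp4", "movie.wav")))
```

The only real difference is whether "-acodec copy" is there: with it the audio bytes come out exactly as they were stored, without it you get a brand new encode.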
She is indeed a good friend. I have female friends like that. 2 of them are sex buds, but all of them are good. It's funny, because this afternoon I'm treating one to lunch/dinner. We do this always. One buy or treats
Hey friends,
I need to make a quick rant.
I was just on SubtitleCat and saw that someone absolutely stole my first 100% original work, BKD-97. That was a true labor of love, as that movie has rocked my particularly pervy boat for years. It was my first baby. I'm especially proud of it as when I watched it with a fellow kinky lady friend of mine who doesn't understand a word of Japanese, she actually thought the subtitles I had created were actually "real". When I told her that I had created every word of dialog, she was floored. (Before anyone asks, she's just a very good friend who also happens to share some of my personal kinks. We don't date. Although she gave me a few blowjobs way back when, we don't play sexually anymore. She's just a friend and a damned good one at that. Although we share some similar kink interests, our friendship has gone light years beyond that to the point where we rarely even discuss kink anymore. But I digress.)
You have the right to be pissed. People can be ass**** no doubt.
So, I'm a little bit pissed right now.
I don't get upset when sites (like New-Jav) remove my name from the subtitles (although it's not fun or polite), but this may be the first time I've seen someone take my work, eliminate my name, and put their own name as the author. What a scumbag thing to do. I've talked about this type of thing before, so I won't go further into it. I'll let it go and get back to work. I just wanted to vent to my fellow JAV subtitle fans.
Here is the link to the file that some asshole named DevilGrrl edited to make it look like he/she was the actual author. I have no recourse, so I won't waste any more thought or energy on it. Just needed to vent.
This is what I thought. Thank you for confirming it for me.
A very expensive Nvidia gaming GPU (an RTX 3080 in my case) is what most people would have that works well enough for it, but if you were buying one just to do AI stuff, you'd get a datacenter one like the 3 they mention on the colab page.
There's no benefit to using wav instead of just copying the audio, and wav files will be 10-11 times bigger on average, so if you have to upload it, it'll take that much longer.
Proper audio for Whisper (no filtering):
"C:\Program Files\ffmpeg\bin\ffmpeg.exe" -i "X:\SomeJAVMovie.mp4" -ar 16000 "X:\SomeJAVMovie.wav"
ffmpeg -i BBAN-264.mp4 -vn -acodec copy BBAN-264.aac
That's not going to do anything: it'll just get ignored and keep the original sampling rate, which for the vast majority of audio is 48k or 44.1k (I don't think I've seen a different one than those 2 in a movie or music/audio file, unless it's a weird file).
@SamKook, I'd suggest adding the sampling rate switch too: ffmpeg -i BBAN-264.mp4 -vn -ar 16000 -acodec copy BBAN-264.aac. In the Whisper discussion forum there are a few posts about how sensitive the model is to the sampling rate.
# This launches a subprocess to decode audio while down-mixing and resampling as necessary.
# Requires the ffmpeg CLI and `ffmpeg-python` package to be installed.
out, _ = (
    ffmpeg.input(file, threads=0)
    .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sr)
    .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
)
And this is why the site appreciates you @SamKook.
That's not going to do anything: it'll just get ignored and keep the original sampling rate, which for the vast majority of audio is 48k or 44.1k (I don't think I've seen a different one than those 2 in a movie or music/audio file, unless it's a weird file).
It's almost certainly not going to be 16k from any audio source so it wouldn't make sense for that to be what's needed, unless whisper is converting it to that internally.
I haven't done any research on this, don't really use whisper and haven't looked at its code, so I have no clue about this, but it would be a very odd choice to require 16k on the source for best results.
Edit: So, I checked the code and as I suspected, it's changing the sample rate internally to 16k during the decoding phase so the source sample rate doesn't matter.
Here's the code portion that does it. The "sr" value for ar is a variable set to 16000.
Code:
# This launches a subprocess to decode audio while down-mixing and resampling as necessary.
# Requires the ffmpeg CLI and `ffmpeg-python` package to be installed.
out, _ = (
    ffmpeg.input(file, threads=0)
    .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sr)
    .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
)
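For anyone wondering what that call hands back: with format="s16le", ac=1 and ar set to 16000, the output should be raw signed 16-bit little-endian mono samples, 2 bytes each, 16000 of them per second. Below is a small sketch of decoding such a buffer with only the standard library. The function names are mine, not Whisper's, and this is just an illustration of the data layout, not the code Whisper runs.

```python
import struct
from typing import List

SAMPLE_RATE = 16000      # the "sr" value mentioned above
BYTES_PER_SAMPLE = 2     # s16le = signed 16-bit little-endian


def decode_s16le(raw: bytes) -> List[float]:
    """Turn raw s16le mono bytes into floats in the range [-1.0, 1.0)."""
    count = len(raw) // BYTES_PER_SAMPLE
    samples = struct.unpack(f"<{count}h", raw[: count * BYTES_PER_SAMPLE])
    return [s / 32768.0 for s in samples]


def duration_seconds(raw: bytes) -> float:
    """Length of the audio the buffer represents, at 16 kHz mono."""
    return len(raw) / (BYTES_PER_SAMPLE * SAMPLE_RATE)
```

Whatever sample rate or channel count the source file had, this is the shape of the data after that decode step, which is why resampling the source yourself beforehand shouldn't change anything.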
Edit2: Here's the size of the output files for the test movie I used to give you an idea of why resampling it manually is just wasting space most of the time:
Original aac(192kbps): 249.35MB
Uncompressed wav: 1.92GB
Mei2/quay2 16k(stereo): 655.96MB
Whisper 16k(mono): 327.98MB
So unless you start with a lightly compressed audio file that's above 256kbps (that's what the whisper processing ends up being), you just end up with a bigger file. And since you want to avoid an extra lossy compression step that would degrade the audio quality, you have to use something like wav for the output if you process the original.
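The 256kbps figure is just raw PCM arithmetic: sample rate x channels x bits per sample. Here's a quick sketch of it, with some assumptions of mine: 16-bit samples (the pcm_s16le in the code above), "MB" in the list meaning MiB, a test movie of roughly 10747 seconds (back-calculated from the 327.98MB figure, not measured), and an uncompressed wav at 48 kHz stereo.

```python
def pcm_kbps(sample_rate_hz: int, channels: int, bits: int = 16) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * channels * bits / 1000


def pcm_size_mib(seconds: float, sample_rate_hz: int, channels: int, bits: int = 16) -> float:
    """Size of uncompressed PCM audio in MiB, ignoring the small wav header."""
    return seconds * sample_rate_hz * channels * bits / 8 / (1024 * 1024)


if __name__ == "__main__":
    duration = 10747  # seconds; an assumption back-calculated from the sizes above
    for label, rate, channels in [
        ("48k stereo wav", 48000, 2),
        ("16k stereo", 16000, 2),
        ("16k mono", 16000, 1),
    ]:
        kbps = pcm_kbps(rate, channels)
        size = pcm_size_mib(duration, rate, channels)
        print(f"{label}: {kbps:.0f} kbps, {size:.2f} MiB")
```

Under those assumptions the numbers land close to the sizes listed above, and the 16k stereo file is exactly twice the 16k mono one, since the only difference is the channel count.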
This is for doing the Whisper part yourself, since online doesn't work anywhere for long audio, and barely works for short audio. And Whisper only accepts 16kHz, so just copying the audio won't work.
There's no benefit to using wav instead of just copying the audio, and wav files will be 10-11 times bigger on average, so if you have to upload it, it'll take that much longer.
I'd just do something like this instead after checking what the audio type is with mediainfo to set the extension to the right thing:
Code:
ffmpeg -i BBAN-264.mp4 -vn -acodec copy BBAN-264.aac
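If you'd rather script the "check the audio type, then pick the extension" step, here's a rough Python sketch. Note that it swaps mediainfo for ffprobe (which ships with ffmpeg) to read the codec name. The codec-to-extension table and the function names are my own assumptions and only cover a few common codecs; anything else falls back to .mka.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumed mapping from ffprobe codec names to a sensible file extension.
EXTENSIONS = {
    "aac": ".aac",
    "mp3": ".mp3",
    "ac3": ".ac3",
    "flac": ".flac",
    "opus": ".opus",
    "vorbis": ".ogg",
}


def probe_audio_codec(src: str) -> str:
    """Ask ffprobe for the codec name of the first audio stream."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "a:0",
            "-show_entries", "stream=codec_name",
            "-of", "default=noprint_wrappers=1:nokey=1",
            src,
        ],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def copy_cmd(src: str, codec: str) -> List[str]:
    """Build the stream-copy command, naming the output after the codec."""
    ext = EXTENSIONS.get(codec, ".mka")  # unknown codec: use a Matroska audio container
    dst = str(Path(src).with_suffix(ext))
    return ["ffmpeg", "-i", src, "-vn", "-acodec", "copy", dst]
```

Calling copy_cmd("BBAN-264.mp4", probe_audio_codec("BBAN-264.mp4")) on a file with aac audio should give the same command as above, which you can then pass to subprocess.run.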
Thanks. The archive seems to be missing some subs. The first I checked was the MIGD subs, and only one of the three was in the archive.
New Year present for fans of subtitles: here is my entire archive of subs from iKoa:
File on MEGA
mega.nz
Kudos goes to admin and bot developer of iKoa. I've attached iKoa's official list which was updated on Nov. 9. However, the archive includes later subs from Nov and Dec as well. There are 8,500+ subs in total.
Happy New Year everyone!