Could somebody advise the best way to rip audio from vids? I think it was mentioned further back in the thread, but I'm having trouble finding it.
Demux it. Google how for your specific extension, or use mkvtoolnix to put just the audio in an mka.
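For the mkvtoolnix route, here is a minimal sketch of how the command could be put together from Python. The function name is made up for illustration, and it assumes mkvmerge (part of mkvtoolnix) is installed and on your PATH. It only builds the command; the audio is copied into the .mka as-is, not re-encoded.

```python
import subprocess
from pathlib import Path


def mkvmerge_audio_only_cmd(video_path, out_path=None):
    """Build an mkvmerge command that keeps only the audio tracks.

    Hypothetical helper: the output defaults to the input name with an
    .mka extension.
    """
    video = Path(video_path)
    out = Path(out_path) if out_path else video.with_suffix(".mka")
    # --no-video and --no-subtitles drop those tracks; audio is copied untouched.
    return ["mkvmerge", "-o", str(out), "--no-video", "--no-subtitles", str(video)]


# To actually run it (needs mkvmerge installed):
# subprocess.run(mkvmerge_audio_only_cmd("movie.mkv"), check=True)
```

Returning the command as a list keeps paths with spaces safe when it is passed to subprocess.run.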
Is anyone using a script to auto-translate srt files with DeepL (free version)? Mine broke and I can't seem to fix it, so I am looking for an alternative or someone more technical than me who knows how to fix it.
Post your script and I can see if it's easy to fix or not.
Thanks a lot! Everything seems to be working, but DeepL changed the input CSS selector. I fixed that, but then for some reason it messed with the merge of the chunks into one file. All srt files that only need one chunk work, but as soon as more than one is required, it fails.
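The thread doesn't show how the script splits a file into chunks, so the following is only a sketch of one way it could be done. It assumes the subtitle text is a series of blocks separated by blank lines, and the 1500 character limit is a placeholder, not a known DeepL value. The function name is hypothetical.

```python
import re


def split_into_chunks(text, max_chars=1500):
    """Group blank-line-separated blocks into chunks of at most max_chars.

    A block is never split, so a single block longer than max_chars
    ends up as its own oversized chunk.
    """
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    chunks, current = [], ""
    for block in blocks:
        candidate = block if not current else current + "\n\n" + block
        if current and len(candidate) > max_chars:
            # Adding this block would overflow: close the chunk, start a new one.
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Because chunks always break on a blank line, joining the translated chunks back with a blank line between them should give the same number of blocks as the original file, which makes the later merge easier to check.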
Use the virtual Whisper+VAD version (it doesn't need your own computer's resources). Use MP3 files, though I recommend M4A. If you have an audio enhancement program, use it to turn up and adjust the volume (I use MOVAVI). From page 218 onwards you can find some discussions of using Whisper (tutorial); see my posts, where I use images to guide you. Good luck.
Some (complete) movies take about 15 minutes to 1 hour, but the result is much better than the other options that already exist (that's my opinion, I'm sure someone else will contradict me). You can use the program up to 6 times a day, sometimes a little less (the free version has a limit).
Thanks for explaining how to do it. I tried to translate one. Despite Whisper's translation being much better than other auto translators, there were some obviously wrong translations. I tried to clean it up as much as I can. There were a few 15-20 second intervals which didn't have any translation; maybe the sound wasn't loud enough for Whisper to pick it up. Do you have any suggestions on that? Anyway, I tried to fill those intervals as best I can. Here is my first attempt at translating a JAV. I would appreciate feedback.
If anyone is interested, I have a large amount of Whisper-generated subtitles. They are NOT cleaned, because I am only interested in understanding the story, not having perfect subs. Duplicates are removed as well as possible and timings slightly fixed.
Wow, that is indeed a large number of files. Thanks!
This usually happens. When I'm really interested in knowing what happens in a movie scene, I run the movie through Whisper+VAD again with modified VAD values (the default is 0.4; I've set it to 0.3, 0.5 and 0.6). The text lines are usually the same, but it detects some lines that were not detected before. Then I load the .srt file into Subtitle Edit and add those lines to the srt file that I consider the most complete (I only do this with movies that are really worth it, that I like a lot).
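The post above does this by hand in Subtitle Edit. As a rough automated alternative (not what the poster uses), here is a minimal sketch that takes the cues from a second run and adds only those that don't overlap anything in the base file. The function names are made up, and it assumes plain SRT text with "\n" line endings.

```python
import re

# One cue: timing line, then text up to the next blank line or end of file.
CUE = re.compile(
    r"(\d+):(\d+):(\d+),(\d+) --> (\d+):(\d+):(\d+),(\d+)\n(.*?)(?=\n\s*\n|\Z)",
    re.S,
)


def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) tuples from SRT content."""
    cues = []
    for m in CUE.finditer(text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(x) for x in m.groups()[:8])
        start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
        end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
        cues.append((start, end, m.group(9).strip()))
    return cues


def add_missing_cues(base, other):
    """Add cues from `other` that do not overlap any cue already kept."""
    merged = list(base)
    for start, end, text in other:
        if all(end <= b_start or start >= b_end for b_start, b_end, _ in merged):
            merged.append((start, end, text))
    return sorted(merged)
```

Lines that overlap an existing cue are skipped rather than compared, so you would still want to eyeball the result in Subtitle Edit afterwards.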
RCT-925 Studio ROCKET The Cum Swallowing Game Try To Guess Which Shot Of Cum Belongs To Your Boyfriend!
The other option I use is to increase the volume of the file and balance the audio (there are programs that do this automatically).
I see you use Movavi to balance audio and boost volume. Does that make a big difference in dialogue output? Are you getting way more text?
Be aware that Whisper is random. In my experience it won't give you the same number of lines if you rerun the exact same unmodified audio, so comparing like this is mostly meaningless.
I had to change
Code:
    "css selector"
with
Code:
    By.CSS_SELECTOR
in 3 places for it to work for me.
What do you mean exactly by it fails?
I've tried it with a 5-chunk one and it did produce a combined SRT, but it does seem to skip the last line of each chunk (hard to say for sure though), unless something else caused those in the 1 long srt I tested. So I'm wondering if your issue is the same and I should look further into it, or if it was something else.
Edit: I see that it does throw an error when combining, but it still creates the file mostly (minus a few lines). I don't know how it acted before.
Edit2: If you also get a similar result to mine, basically the following portion of code (that merges the chunks together) throws some kind of error, but I don't know Python enough to debug it, so it's hard to tell what exactly it does:
Code:
    with open(mysrt, 'r', encoding='utf-8') as f:
        srt = f.read()
    match = re.findall(r'\d+:\d+:\d+,\d+ --> \d+:\d+:\d+,\d+', srt)
    linerList = []
    liner = ""
    with open(wordtxt, "r", encoding="utf-8", errors='ignore') as wordfile:
        lines = wordfile.readlines()
        for line in lines:
            if line != '\n' and line is not lines[-1]:
                liner += line
            elif line != '\n' and len(linerList) == len(match)-1:
                liner += line
                linerList.append(liner)
                break
            else:
                linerList.append(liner)
                liner = ""
    count = 0
    with open(finalsrt, 'w', encoding='utf-8') as resfile:
        for timeline in match:
            resfile.write(f"{count+1}\n")
            resfile.write(timeline+'\n')
            resfile.write(linerList[count])
            resfile.write("\n")
            count += 1
Edit3: And to get rid of various warnings and errors, you can replace:
Code:
    # Start a Selenium driver
    driver_path=r'C:\Program Files (x86)\chromedriver.exe'
    driver = webdriver.Chrome(driver_path)
with
Code:
    s = Service(r"C:\Program Files (x86)\chromedriver.exe")
    chrome_options = Options()
    chrome_options.add_argument("--ignore-ssl-errors")
    # The following 2 are probably unnecessary but don't hurt
    chrome_options.add_argument("--ignore-certificate-errors")
    chrome_options.AcceptInsecureCertificates = True
    # Gets rid of USB error
    chrome_options.add_experimental_option('excludeSwitches', ['enable-logging'])
    # Start a Selenium driver with chosen chrome options
    driver = webdriver.Chrome(service=s, options=chrome_options)
And add the following at the beginning (I did after line 3):
Code:
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
Thanks for looking into it. That was exactly the issue. It merges the file, but combines the last 2 rows of a chunk together, so the whole thing is not usable. Thanks anyway.
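On the merge step from Edit2, a guess at what goes wrong: the loop uses `line is not lines[-1]` plus a count check to spot the last block, so the list of text blocks can come out shorter than, or shifted against, the list of timestamps. Here is a minimal sketch of a more defensive merge. It assumes the translated text file holds one block per subtitle, separated by blank lines; that layout, and the function name, are assumptions and not taken from the original script.

```python
import re

TIMING = re.compile(r"\d+:\d+:\d+,\d+ --> \d+:\d+:\d+,\d+")


def merge_translation(srt_text, translated_text):
    """Pair each timing line of the source SRT with one translated block."""
    timings = TIMING.findall(srt_text)
    # Split on one or more blank lines and drop empty leftovers.
    blocks = [
        b.strip()
        for b in re.split(r"\n\s*\n", translated_text.strip())
        if b.strip()
    ]
    if len(blocks) != len(timings):
        # Fail loudly instead of writing a shifted, unusable file.
        raise ValueError(
            f"{len(timings)} timings but {len(blocks)} text blocks"
        )
    cues = [
        f"{i}\n{timing}\n{block}\n"
        for i, (timing, block) in enumerate(zip(timings, blocks), start=1)
    ]
    return "\n".join(cues)
```

If the counts don't match, the error tells you so before anything is written, which should make it easier to see whether a chunk boundary lost or glued together a block.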
Wow, thanks for the great collection!
I'm not sure if this helps; I'm not an expert. I'm just mentioning that this possibly makes the dialogue in the audio files easier for the VAD program to detect, so it will pick up the subtitles a little better. I've verified that turning up the volume detected phrases that I had not seen in previous tests. I used MP3 before, then I switched to M4A and started increasing the volume, eliminating noise, etc. In some movies the result is much better, but this may only be my guess, so take all you can from the information on the forum and do your own testing.
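The post above uses Movavi for this. If you'd rather script it, here is a minimal sketch that builds an ffmpeg command instead (a different tool, swapped in for illustration). It assumes ffmpeg is installed; the function name and the 6 dB default are made up. It drops the video, applies either ffmpeg's loudnorm filter or a fixed volume gain, and writes AAC audio, so the output name should end in .m4a.

```python
def ffmpeg_boost_cmd(src, dst, use_loudnorm=True, gain_db=6.0):
    """Build an ffmpeg command that extracts audio and raises its level.

    use_loudnorm=True applies ffmpeg's loudness normalization filter;
    otherwise a fixed gain of gain_db decibels is applied.
    """
    audio_filter = "loudnorm" if use_loudnorm else f"volume={gain_db}dB"
    # -vn drops the video stream, -af sets the audio filter, -c:a aac encodes AAC.
    return ["ffmpeg", "-i", src, "-vn", "-af", audio_filter, "-c:a", "aac", dst]


# To actually run it (needs ffmpeg installed):
# import subprocess
# subprocess.run(ffmpeg_boost_cmd("movie.mp4", "movie.m4a"), check=True)
```

Whether this gets more lines out of Whisper+VAD is something you'd have to test per movie, same as with Movavi.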
I am having problems with Whisper on the Colab, might be my computer, it's been weird lately. But I set up Whisper, upload a file, run Whisper and..... nothing. I get the "executing" text at the bottom but the clock just runs. After a while I get this message:
View attachment 3153964
What does this mean? If I change to standard runtime, I get a timeout error after a while. Any ideas? Thanks in advance!!
This error occurs when Colab detects inactivity, so I recommend that you keep doing something on the page. It can happen while the file is uploading, which depending on its size can take between 2 and 5 minutes. It's easy to go look at something else while the file loads, and then the Colab page detects inactivity. You have to go back into the runtime (execution environment) menu and activate it again.