Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★


Well-Known Member
Jan 13, 2007
I'm afraid I don't understand the text about moving parts of the script in Colab.
The script is divided into these sections:
#@markdown **Run Whisper**
# @markdown Required settings:
# Generate VAD timestamps
# Add a bit of padding, and remove small gaps
# If breaks are longer than chunk_threshold seconds, split into a new audio file
# This'll effectively turn long transcriptions into many shorter ones
# Merge speech chunks
# Convert timestamps to seconds
# Run Whisper on each audio chunk
Lots of small edits to the above
# DeepL translation

# Write SRT file

I don't understand which 'block' to move and where.
Is it everything in the first block like this:

if "http://" in audio_path or "https://" in audio_path:
    print("Downloading audio...")
    urllib.request.urlretrieve(audio_path, "input_file")
    audio_path = "input_file"
else:
    if not os.path.exists(audio_path):
        try:
            audio_path = uploaded_file
            if not os.path.exists(audio_path):
                raise ValueError("Input audio not found. Is your audio_path correct?")
        except NameError:
            raise ValueError("Input audio not found. Did you upload a file?")
out_path = os.path.splitext(audio_path)[0] + ".srt"
out_path_pre = os.path.splitext(audio_path)[0] + ""
if source_separation:
    print("Separating vocals...")
    !ffprobe -i "{audio_path}" -show_entries format=duration -v quiet -of csv="p=0" > input_length
    with open("input_length") as f:
        input_length = int(float(f.read())) + 1
    !spleeter separate -d {input_length} -p spleeter:2stems -o output "{audio_path}"
    spleeter_dir = os.path.basename(os.path.splitext(audio_path)[0])
    audio_path = "output/" + spleeter_dir + "/vocals.wav"
print("Encoding audio...")
if not os.path.exists("vad_chunks"):
    os.mkdir("vad_chunks")
print("Running VAD...")
model, utils = torch.hub.load(
    repo_or_dir="snakers4/silero-vad:v4.0", model="silero_vad", onnx=False
)
(get_speech_timestamps, save_audio, read_audio, VADIterator, collect_chunks) = utils

and move it to under
# Write SRT file
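For reference, the VAD comments listed earlier in this post (add padding, remove small gaps, split where a break is longer than chunk_threshold) describe a grouping step that could look roughly like the sketch below. This is not the notebook's actual code: the function name and the pad and min_gap values are assumptions made up for illustration, and only chunk_threshold comes from the script's settings.

```python
def group_speech_segments(segments, pad=0.2, min_gap=0.5, chunk_threshold=3.0):
    """Group (start, end) speech segments, in seconds, into chunks.

    pad and min_gap are hypothetical illustration values.
    """
    # Add a bit of padding around each segment, never going below zero.
    padded = [(max(0.0, start - pad), end + pad) for start, end in segments]

    # Remove small gaps by merging segments that are closer than min_gap.
    merged = []
    for start, end in padded:
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    # Start a new chunk when the break before a segment exceeds chunk_threshold.
    chunks = []
    for segment in merged:
        if chunks and segment[0] - chunks[-1][-1][1] <= chunk_threshold:
            chunks[-1].append(segment)
        else:
            chunks.append([segment])
    return chunks


example = [(1.0, 2.0), (2.5, 3.0), (4.5, 5.0), (10.0, 11.0)]
for number, chunk in enumerate(group_speech_segments(example), start=1):
    print(number, chunk)
```

Each chunk would then be written to its own audio file and transcribed separately, which is why a long video ends up as many short transcriptions.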



Well-Known Member
Jan 13, 2007
Whisper converts the audio to that format internally, so it doesn't change anything. It's just ideal to do that if you modify the audio: since it's a lossless format, you avoid degrading the audio with an extra lossy conversion.

There was a version issue after something updated on the original Colab everyone was using here, so one that forces the proper versions was made:

Not sure which you're using but that might change something if it's not that one.

How many lines it detects is a pretty bad indicator, since medium might simply split up lines more, or it might just make a ton of mistakes, but I assume you took a closer look at them when comparing.
Yeah. It's a difference in file size: 16 KB vs. 69 or 70 KB.
The settings for line division are off in Faster Whisper: some lines are very long (12 seconds), but I guess I can edit that later.
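A quick way to find the overly long lines before editing them by hand might be something like the sketch below. It assumes the subtitles have already been loaded as (start, end, text) tuples with timedelta times; the function name and the 7-second limit are made up for illustration and are not a Faster Whisper setting.

```python
from datetime import timedelta


def find_long_cues(cues, max_seconds=7.0):
    """Return the indexes of cues that stay on screen longer than max_seconds."""
    return [
        index
        for index, (start, end, _text) in enumerate(cues)
        if (end - start).total_seconds() > max_seconds
    ]


cues = [
    (timedelta(seconds=0), timedelta(seconds=3), "short line"),
    (timedelta(seconds=5), timedelta(seconds=17), "a 12-second line"),
]
print(find_long_cues(cues))
```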


Grand Wizard
Staff member
Super Moderator
May 10, 2009
You copy everything from the first step code, which is this if you keep the default:
#@title Whisper Transcription Parameters

model_size = "large-v2"  # @param ["large-v3", "large-v2", "medium", "large"]
language = "japanese"  # @param {type:"string"}
translation_mode = "End-to-end Whisper (default)"  # @param ["End-to-end Whisper (default)", "Whisper -> DeepL", "No translation"]
# @markdown VAD settings and DeepL:
deepl_authkey = ""  # @param {type:"string"}
source_separation = False  # @param {type:"boolean"}
vad_threshold = 0.4  # @param {type:"number"}
chunk_threshold = 3.0  # @param {type:"number"}
deepl_target_lang = "EN-US"  # @param {type:"string"}
max_attempts = 1  # @param {type:"integer"}

#@markdown Enter the values for the transcriber parameters. Leave unchanged if not sure.
verbose = False #@param {type:"boolean"}
temperature_input = "0.0" #@param {type:"string"}
compression_ratio_threshold = 2.4 #@param {type:"number"}
logprob_threshold = -1.0 #@param {type:"number"}
no_speech_threshold = 0.6 #@param {type:"number"}
condition_on_previous_text = False #@param {type:"boolean"}
initial_prompt = "" #@param {type:"string"}
word_timestamps = True #@param {type:"boolean"}
clip_timestamps_input = "0" #@param {type:"string"}
hallucination_silence_threshold = 2.0 #@param {type:"number"}

#@markdown Decoding Options (for advanced configurations, leave unchanged if unsure):
best_of = 2 #@param {type:"number"}
beam_size = 2 #@param {type:"number"}
patience = 1 #@param {type:"number"}
length_penalty = "" #@param {type:"string"}
prefix = "" #@param {type:"string"}
suppress_tokens = "-1" #@param {type:"string"}
suppress_blank = True #@param {type:"boolean"}
without_timestamps = False #@param {type:"boolean"}
max_initial_timestamp = 1.0 #@param {type:"number"}
fp16 = True #@param {type:"boolean"}
# Parsing and converting form inputs
try:
    temperature = tuple(float(temp.strip()) for temp in temperature_input.split(',')) if ',' in temperature_input else float(temperature_input)
except ValueError:
    temperature = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)  # Default
clip_timestamps = clip_timestamps_input.split(',') if ',' in clip_timestamps_input else clip_timestamps_input
if clip_timestamps != "0":
    try:
        clip_timestamps = list(map(float, clip_timestamps)) if isinstance(clip_timestamps, list) else float(clip_timestamps)
    except ValueError:
        clip_timestamps = "0"  # Default if parsing fails
language = None if not language else language
initial_prompt = None if initial_prompt == "" else initial_prompt
length_penalty = None if length_penalty == "" else float(length_penalty)

assert max_attempts >= 1
assert vad_threshold >= 0.01
assert chunk_threshold >= 0.1
assert language != ""
if translation_mode == "End-to-end Whisper (default)":
    task = "translate"
    run_deepl = False
elif translation_mode == "Whisper -> DeepL":
    task = "transcribe"
    run_deepl = True
elif translation_mode == "No translation":
    task = "transcribe"
    run_deepl = False
else:
    raise ValueError("Invalid translation mode")

# Prepare transcription options
transcription_options = {
    "verbose": verbose,
    "compression_ratio_threshold": compression_ratio_threshold,
    "logprob_threshold": logprob_threshold,
    "no_speech_threshold": no_speech_threshold,
    "condition_on_previous_text": condition_on_previous_text,
    "initial_prompt": initial_prompt,
    "word_timestamps": word_timestamps,
    "clip_timestamps": clip_timestamps,
    "hallucination_silence_threshold": hallucination_silence_threshold,
}

# Prepare decoding options
decoding_options = {
    "task": task,
    "language": language,
    "temperature": temperature,
    "best_of": best_of,
    "beam_size": beam_size,
    "patience": patience,
    "length_penalty": length_penalty,
    "prefix": prefix,
    "suppress_tokens": suppress_tokens,
    "suppress_blank": suppress_blank,
    "without_timestamps": without_timestamps,
    "max_initial_timestamp": max_initial_timestamp,
    "fp16": fp16,
}

And then put it at the start of the last step code block which is:
#@markdown **Run Whisper**
# @markdown Required settings:
audio_path = "/content/drive/MyDrive/test.wav"  # @param {type:"string"}
assert audio_path != ""

import tensorflow as tf
import torch
import whisper
import os
import ffmpeg
import srt
from tqdm import tqdm
import datetime
import deepl
import urllib.request
import json
from google.colab import files

if "http://" in audio_path or "https://" in audio_path:
    print("Downloading audio...")
    urllib.request.urlretrieve(audio_path, "input_file")
    audio_path = "input_file"

*rest of the code here*

So you end up with:
#@title Whisper Transcription Parameters

model_size = "large-v2"  # @param ["large-v3", "large-v2", "medium", "large"]
language = "japanese"  # @param {type:"string"}
translation_mode = "End-to-end Whisper (default)"  # @param ["End-to-end Whisper (default)", "Whisper -> DeepL", "No translation"]
# @markdown VAD settings and DeepL:
deepl_authkey = ""  # @param {type:"string"}
source_separation = False  # @param {type:"boolean"}
vad_threshold = 0.4  # @param {type:"number"}
chunk_threshold = 3.0  # @param {type:"number"}
deepl_target_lang = "EN-US"  # @param {type:"string"}
max_attempts = 1  # @param {type:"integer"}

#@markdown Enter the values for the transcriber parameters. Leave unchanged if not sure.
verbose = False #@param {type:"boolean"}
temperature_input = "0.0" #@param {type:"string"}
compression_ratio_threshold = 2.4 #@param {type:"number"}
logprob_threshold = -1.0 #@param {type:"number"}
no_speech_threshold = 0.6 #@param {type:"number"}
condition_on_previous_text = False #@param {type:"boolean"}
initial_prompt = "" #@param {type:"string"}
word_timestamps = True #@param {type:"boolean"}
clip_timestamps_input = "0" #@param {type:"string"}
hallucination_silence_threshold = 2.0 #@param {type:"number"}

#@markdown Decoding Options (for advanced configurations, leave unchanged if unsure):
best_of = 2 #@param {type:"number"}
beam_size = 2 #@param {type:"number"}
patience = 1 #@param {type:"number"}
length_penalty = "" #@param {type:"string"}
prefix = "" #@param {type:"string"}
suppress_tokens = "-1" #@param {type:"string"}
suppress_blank = True #@param {type:"boolean"}
without_timestamps = False #@param {type:"boolean"}
max_initial_timestamp = 1.0 #@param {type:"number"}
fp16 = True #@param {type:"boolean"}
# Parsing and converting form inputs
try:
    temperature = tuple(float(temp.strip()) for temp in temperature_input.split(',')) if ',' in temperature_input else float(temperature_input)
except ValueError:
    temperature = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)  # Default
clip_timestamps = clip_timestamps_input.split(',') if ',' in clip_timestamps_input else clip_timestamps_input
if clip_timestamps != "0":
    try:
        clip_timestamps = list(map(float, clip_timestamps)) if isinstance(clip_timestamps, list) else float(clip_timestamps)
    except ValueError:
        clip_timestamps = "0"  # Default if parsing fails
language = None if not language else language
initial_prompt = None if initial_prompt == "" else initial_prompt
length_penalty = None if length_penalty == "" else float(length_penalty)

assert max_attempts >= 1
assert vad_threshold >= 0.01
assert chunk_threshold >= 0.1
assert language != ""
if translation_mode == "End-to-end Whisper (default)":
    task = "translate"
    run_deepl = False
elif translation_mode == "Whisper -> DeepL":
    task = "transcribe"
    run_deepl = True
elif translation_mode == "No translation":
    task = "transcribe"
    run_deepl = False
else:
    raise ValueError("Invalid translation mode")

# Prepare transcription options
transcription_options = {
    "verbose": verbose,
    "compression_ratio_threshold": compression_ratio_threshold,
    "logprob_threshold": logprob_threshold,
    "no_speech_threshold": no_speech_threshold,
    "condition_on_previous_text": condition_on_previous_text,
    "initial_prompt": initial_prompt,
    "word_timestamps": word_timestamps,
    "clip_timestamps": clip_timestamps,
    "hallucination_silence_threshold": hallucination_silence_threshold,
}

# Prepare decoding options
decoding_options = {
    "task": task,
    "language": language,
    "temperature": temperature,
    "best_of": best_of,
    "beam_size": beam_size,
    "patience": patience,
    "length_penalty": length_penalty,
    "prefix": prefix,
    "suppress_tokens": suppress_tokens,
    "suppress_blank": suppress_blank,
    "without_timestamps": without_timestamps,
    "max_initial_timestamp": max_initial_timestamp,
    "fp16": fp16,
}

#@markdown **Run Whisper**
# @markdown Required settings:
audio_path = "/content/drive/MyDrive/test.wav"  # @param {type:"string"}
assert audio_path != ""

import tensorflow as tf
import torch
import whisper
import os
import ffmpeg
import srt
from tqdm import tqdm
import datetime
import deepl
import urllib.request
import json
from google.colab import files

if "http://" in audio_path or "https://" in audio_path:
    print("Downloading audio...")
    urllib.request.urlretrieve(audio_path, "input_file")
    audio_path = "input_file"

*rest of the code here*
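If the "Parsing and converting form inputs" part of the parameters block is hard to follow, the same logic can be written as two small functions, roughly as sketched below. The function names are made up for this illustration and are not in the notebook; the behaviour is meant to mirror the lines above (a comma-separated string becomes several floats, a single value becomes one float, and anything unparsable falls back to the default).

```python
DEFAULT_TEMPERATURES = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)


def parse_temperature(text):
    """Turn the temperature form field into a float or a tuple of floats."""
    try:
        if "," in text:
            return tuple(float(part.strip()) for part in text.split(","))
        return float(text)
    except ValueError:
        return DEFAULT_TEMPERATURES  # Default if parsing fails


def parse_clip_timestamps(text):
    """Turn the clip_timestamps form field into "0", a float, or a list of floats."""
    if text == "0":
        return "0"
    try:
        if "," in text:
            return [float(part) for part in text.split(",")]
        return float(text)
    except ValueError:
        return "0"  # Default if parsing fails


print(parse_temperature("0.0"))
print(parse_temperature("0.0, 0.2, 0.4"))
print(parse_clip_timestamps("10,20"))
```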


Well-Known Member
Jan 13, 2007
Thanks. But no, that's not how the default script on that address looked from what I remember.
I'll check back when I've finished what I mention below.
I'll try that script later.

The new/old Colab link you sent me seems to work. It does take a lot more time than the old one, though. It took forever before it started checking the audio and dividing it (into 375 chunks). It's at 81% after 30 minutes. Well, it's a 3 hr video.

Thank you so much for your time. I won't hold you up any longer. Have a nice day.


Active Member
Jul 24, 2021
I downloaded my adapted script and attached it below. Just upload it to your Colab. It works slightly differently. Make sure to change this line in the code to the path where you store the audio: audio_folder = "/content/drive/MyDrive/Colab Notebooks/"
It will pick the audio from there and run it.
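Picking the audio up from a folder can be done along the lines of the sketch below. This is only a guess at the general idea, not the code from the attached script: the function name and the list of extensions are assumptions, and only the audio_folder variable comes from the post above.

```python
import os

# Hypothetical list of extensions to treat as audio.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".m4a", ".flac")


def find_audio_files(audio_folder, extensions=AUDIO_EXTENSIONS):
    """Return the sorted paths of all audio files directly inside audio_folder."""
    return sorted(
        os.path.join(audio_folder, name)
        for name in os.listdir(audio_folder)
        if name.lower().endswith(extensions)
    )


audio_folder = "/content/drive/MyDrive/Colab Notebooks/"
if os.path.isdir(audio_folder):
    for audio_path in find_audio_files(audio_folder):
        print("Would transcribe:", audio_path)
```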


  • WhisperWithVAD_pro.rar
    9.1 KB · Views: 16


Well-Known Member
Sep 13, 2019

[Reducing Mosaic]JUR-007 Every Night, I Heard The Moans Of The Woman Next Door, And I Was Worried About It… ~A Sweaty Afternoon With A Frustrated Married Woman~ Rena Fukiishi


I downloaded this recent, reduced mosaic JAV starring one of my favorite MILFs. I used Purfview's Whisper in Subtitle Edit (using model V2-Large) to create this sub. I also attempted to clean it up a bit and re-interpreted some of the meaningless/"lewd-less" dialog. Again, I don't understand Japanese, so my re-interpretations might not be totally accurate, but I try to match what's happening in the scene. Anyway, enjoy and let me know what you think.

There's not much of a storyline, nor is it an incest-themed movie, but it is Rena Fukiishi!



  • jur-007-ub.rar
    12.5 KB · Views: 24


Well-Known Member
Jan 13, 2007
That is the link I copy-pasted the code from, so it should be identical. The code of the first block (Whisper Transcription Parameters) is hidden by default, so you have to make it show up first.
Ahh! I see now. I was looking at the parameters at Run Whisper only.

So I put all that code before this?

if not os.path.exists(audio_path):
    try:
        audio_path = uploaded_file
        if not os.path.exists(audio_path):
            raise ValueError("Input audio not found. Is your audio_path correct?")
    except NameError:
        raise ValueError("Input audio not found. Did you upload a file?")

Or do I remove the else: part, since it refers back to this line?
if "http://" in audio_path or "https://" in audio_path:
You see how little I know about scripts?
Thanks for the information.


Grand Wizard
Staff member
Super Moderator
May 10, 2009
Remove nothing; just put the whole thing at the beginning.

It's just variable declarations, so it only needs to be there and not overlap with another line.

There might be a better way to use the Colab, but that's just my quick solution, since I haven't looked into the issue for more than a few seconds.
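To illustrate why the order matters, here is a small sketch that mimics running cells one after another in a shared namespace. It is a simplified stand-in and not how Colab itself is implemented: the two "cells" are made-up strings that reuse the model_size and chunk_threshold names from the parameters block.

```python
# Two made-up "cells": one declares settings, the other uses them.
PARAMETERS = 'model_size = "large-v2"\nchunk_threshold = 3.0\n'
RUN_STEP = 'summary = f"{model_size}, split at {chunk_threshold}s"\n'


def run_cells(*cells):
    """Execute the given source strings in order in one shared namespace."""
    namespace = {}
    for source in cells:
        exec(source, namespace)
    return namespace


# Running the last step on its own fails because the names are not defined yet.
try:
    run_cells(RUN_STEP)
    missing = None
except NameError as error:
    missing = str(error)
print(missing)

# With the declarations pasted at the start, the same step works.
merged = run_cells(PARAMETERS + RUN_STEP)
print(merged["summary"])
```

Pasting the parameters at the top of the last block guarantees the declarations have run by the time the rest of the code needs them.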


Well-Known Member
Sep 13, 2019

[Reducing Mosaic]MEYD-090 Was Being × Up To Father-in-law While You Are Not … Ayumi Shinoda


I downloaded this recent, reduced mosaic JAV starring my favorite MILF. I used Purfview's Whisper in Subtitle Edit (using model V2-Large) to create this sub. I also attempted to clean it up a bit and re-interpreted some of the meaningless/"lewd-less" dialog. Again, I don't understand Japanese, so my re-interpretations might not be totally accurate, but I try to match what's happening in the scene. Anyway, enjoy and let me know what you think.

Not my preferred flavor of incest JAV, but it was Ayumi Shinoda and it was pretty erotic.



  • meyd-090-ub.rar
    10.5 KB · Views: 20