akiba resident JAV subtitlers & subtitle talk★NOT A SUB REQUEST THREAD★

Taako · Dec 6, 2022

Prinsipe said:
Hi - how can you edit the audio file (mp3) in Audacity to get better transcription results? I mean, what do you edit or tweak so that it is better recognized by Whisper / pytranscriber / or whatever software you use to transcribe?

My simple advice: increase the volume to a reasonable level, add a reasonable amount of bass, listen to it, then export to mp3 and use that mp3 for pytranscriber. I used this a lot with some personal settings.

My difficult advice: break the movie into scenes using muxtool, and then apply the simple advice to each one. When I say scenes... I mean scenes that are usually 20-40 min long.
My difficult advice #2 I won't even recommend. It's not super difficult, but it's annoying.

I abandoned it long ago because it takes too long, but it works.

Remember, NOTHING is 100% perfect, and I used pytranscriber mostly for timestamps.
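
For anyone who would rather script the simple advice than click through Audacity, here is a minimal sketch that builds an ffmpeg command doing roughly the same thing (louder, a bit more bass, export to mp3). It swaps Audacity for ffmpeg, which is not what the post above used, and the default volume and bass values are only assumed starting points, not the poster's personal settings. The function name is made up for this example.

```python
import shlex
from typing import List


def build_ffmpeg_command(src: str, dst: str,
                         volume: float = 1.5,
                         bass_gain_db: float = 4.0) -> List[str]:
    """Build an ffmpeg call that boosts volume and bass and writes an mp3.

    The defaults are illustrative guesses; listen to the result and adjust.
    """
    # "volume" and "bass" are standard ffmpeg audio filters, chained with a comma.
    filters = f"volume={volume},bass=g={bass_gain_db}"
    return [
        "ffmpeg",
        "-i", src,               # input video or audio file
        "-vn",                   # drop the video stream
        "-af", filters,          # apply the audio filter chain
        "-codec:a", "libmp3lame",
        "-q:a", "2",             # VBR quality setting for the mp3
        dst,
    ]


# Print the command so it can be checked before running it by hand.
print(shlex.join(build_ffmpeg_command("movie.mp4", "movie_boosted.mp3")))
```

Listen to the exported file before transcribing, as in the advice above: too much gain will clip and can make recognition worse.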

TmpGuy · Dec 11, 2022

I've posted this before, but my posts got wiped out due to reasons (long story). I'm posting it again in case it helps anyone.

Essentially, it's a Japanese to English spreadsheet (csv format) of common words and phrases I learned when translating over a dozen titles. The Japanese phrases are written and sorted in romaji (Japanese written using English characters). There is the romaji word or phrase, a literal translation, an alternate translation (in case the phrase is a euphemism), and notes.

I'm far from literate in Japanese, so these are all based on my own research and best guesses based on context. If anyone spots flaws or inconsistencies, or wants to add their own words or phrases, let me know. Also, I translate lesbian titles, so the phrases are skewed in that direction.
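
A spreadsheet laid out like this is easy to load and search with a few lines of Python. The sketch below is only an illustration: the column names (romaji, literal, alternate, notes) are assumed from the description above, and the three sample rows are made up for the example rather than copied from the attached file.

```python
import csv
import io
from typing import Dict, Optional

# Made-up sample rows; the real file's headers and contents may differ.
SAMPLE_CSV = """romaji,literal,alternate,notes
nande,why,how come,often followed by other words
arigatou,thank you,,
daijoubu,all right,are you okay?,meaning depends on intonation
"""


def load_glossary(text: str) -> Dict[str, Dict[str, str]]:
    """Parse the CSV text and index the rows by lower-cased romaji, sorted A-Z."""
    rows = sorted(csv.DictReader(io.StringIO(text)),
                  key=lambda row: row["romaji"].lower())
    return {row["romaji"].lower(): row for row in rows}


def lookup(glossary: Dict[str, Dict[str, str]], phrase: str) -> Optional[str]:
    """Return the alternate translation if there is one, else the literal one."""
    row = glossary.get(phrase.strip().lower())
    if row is None:
        return None
    return row["alternate"] or row["literal"]


glossary = load_glossary(SAMPLE_CSV)
print(lookup(glossary, "nande"))
```

Preferring the alternate column is a design choice for this sketch; when a phrase is a euphemism, the alternate reading is usually the one wanted in the subtitle.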

maload · Dec 11, 2022

TmpGuy said:
I've posted this before, but my posts got wiped out due to reasons (long story). I'm posting it again in case it helps anyone.

Essentially, it's a Japanese to English spreadsheet (csv format) of common words and phrases I learned when translating over a dozen titles. The Japanese phrases are written and sorted in romaji (Japanese written using English characters). There is the romaji word or phrase, a literal translation, an alternate translation (in case the phrase is a euphemism), and notes.

I'm far from literate in Japanese, so these are all based on my own research and best guesses based on context. If anyone spots flaws or inconsistencies, or wants to add their own words or phrases, let me know. Also, I translate lesbian titles, so the phrases are skewed in that direction.

thank you so much

Taako · Dec 11, 2022

TmpGuy said:
I've posted this before, but my posts got wiped out due to reasons (long story). I'm posting it again in case it helps anyone.

Essentially, it's a Japanese to English spreadsheet (csv format) of common words and phrases I learned when translating over a dozen titles. The Japanese phrases are written and sorted in romaji (Japanese written using English characters). There is the romaji word or phrase, a literal translation, an alternate translation (in case the phrase is a euphemism), and notes.

I'm far from literate in Japanese, so these are all based on my own research and best guesses based on context. If anyone spots flaws or inconsistencies, or wants to add their own words or phrases, let me know. Also, I translate lesbian titles, so the phrases are skewed in that direction.

Those are indeed some very common words in JAV.
I have created such a spreadsheet in Notepad 2 and really need to organize mine. It's why I haven't released it.
I even started putting them in alphabetical order. I have over 2000 words; some are variations of the same word (spelled differently because of how the actors say it), noted along with the context on screen.
For example: I have many variations for Nande (kore, gurai, kana, koko, etc.), which is why the list is just big, messy, and insane! hahaha
Thank you for your list

rickson · Dec 12, 2022

I'm testing Whisper AI and it is looking promising.

Taako · Dec 13, 2022

klako said:
Can anyone recommend JAV where the eng sub significantly changes the story? Basically, the story is hard to predict if you don't understand what they are saying. This assumes there is an accurate eng sub and the story is unique/unexpected, not one of the standard scenarios.
-------------------------------------------------------------------------------------------------------------------
I don't get your question. Are you asking if some eng subs made the original story different?
If that's what you're asking... then I will say yes. Machine translations can mess up an original movie pretty badly.

Also, unless you're Japanese, the subs are never gonna be 100% accurate.

xsf27 · Dec 15, 2022

Long time lurker here who is immensely grateful for all of the great work that all you talented and hard working JAV subtitlers have done! It really does add a new level of enjoyment when you actually understand what they are saying, especially in the drama-driven films which I particularly enjoy.

Now, I've been meaning to get off my butt and try my hand at attempting to make some subs for some of my favourite films and hopefully if all goes well, I'll share some of my work here. This is coming from a non-Japanese speaker so the challenges are quite numerous, as you would imagine. Thankfully, there is an abundance of AI-driven methods out there, with varying accuracy.

This takes me to the point of my post - and I apologise if it is something that has been mentioned already - which is whether anyone has tried to use Adobe Premiere Pro's Speech-To-Text captioning tools yet? If so, I'd be interested to know how it compares with other more popular or proven methods of creating captions from the speech of a video from scratch without hard or soft-coded subtitles already available?

I have only run a preliminary test (thankfully Japanese was one of the 13 languages supported), and it seems to work for the most part, but there is a lot of polishing to be done. Now, it only extracts the captions in the language spoken (it won't autodetect which language is spoken, you must specify) so the resulting transcription must be translated afterwards (for which I just use good ol' Google Translate).

So it looks promising, but I'm just currently racking my brain as to how to export the resulting transcription into a conventionally-recognised subtitle file format, short of painstakingly going through it to make the edits myself, so if anyone who is more versed in it can point me in the right direction, I'd greatly appreciate it!
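
On the format question: an .srt file is just numbered blocks, each with a `start --> end` timestamp line and the caption text, separated by blank lines. As a minimal sketch, assuming the transcription can be reduced to a list of (start seconds, end seconds, text) entries (how to get that list out of Premiere Pro is not covered here), the conversion could look like this. The function names are made up for the example.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Turn seconds into the HH:MM:SS,mmm form used by .srt files."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) entries as the contents of an .srt file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Blocks are separated by a single blank line.
    return "\n".join(blocks)


print(to_srt([(0.0, 1.25, "Hello"), (1.5, 3.0, "World")]))
```

Save the result with UTF-8 encoding so Japanese text survives, for example `open("movie.srt", "w", encoding="utf-8").write(...)`.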

soloporhoy666 · Dec 16, 2022

xsf27 said:
Long time lurker here who is immensely grateful for all of the great work that all you talented and hard working JAV subtitlers have done! It really does add a new level of enjoyment when you actually understand what they are saying, especially in the drama-driven films which I particularly enjoy.

Now, I've been meaning to get off my butt and try my hand at attempting to make some subs for some of my favourite films and hopefully if all goes well, I'll share some of my work here. This is coming from a non-Japanese speaker so the challenges are quite numerous, as you would imagine. Thankfully, there is an abundance of AI-driven methods out there, with varying accuracy.

This takes me to the point of my post - and I apologise if it is something that has been mentioned already - which is whether anyone has tried to use Adobe Premiere Pro's Speech-To-Text captioning tools yet? If so, I'd be interested to know how it compares with other more popular or proven methods of creating captions from the speech of a video from scratch without hard or soft-coded subtitles already available?

I have only run a preliminary test (thankfully Japanese was one of the 13 languages supported), and it seems to work for the most part, but there is a lot of polishing to be done. Now, it only extracts the captions in the language spoken (it won't autodetect which language is spoken, you must specify) so the resulting transcription must be translated afterwards (for which I just use good ol' Google Translate).

So it looks promising, but I'm just currently racking my brain as to how to export the resulting transcription into a conventionally-recognised subtitle file format, short of painstakingly going through it to make the edits myself, so if anyone who is more versed in it can point me in the right direction, I'd greatly appreciate it!

I am currently using Whisper, an AI technology that transcribes the dialogue of the movies with more precision. It can be run on your own computer or, as in my case, remotely, that is to say in the cloud, virtually without using my computer's resources. Some programmers are also improving it by adding technologies such as VAD, which greatly improves the quality of the dialogue lines. Even better, in my experience it does not produce cut-off or invented words that do not exist. Naturally it is not perfect, but today it is better than autosub or any other system. It also gives you the .srt file in the original language or, if you prefer, in English.
Here is an example of what the Whisper AI technology can achieve.
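
Separately from the attached example, for readers who want to try Whisper on their own machine instead of in the cloud: the open-source `whisper` command-line tool can transcribe Japanese or translate it straight to English. The sketch below only builds the command; the model size, the output folder and the function name are assumptions for illustration, and this is not the VAD-enabled Colab version described in this post.

```python
from typing import List


def build_whisper_command(audio_path: str,
                          translate_to_english: bool = False,
                          model: str = "medium",
                          output_dir: str = "subs") -> List[str]:
    """Build a call to the open-source `whisper` command-line tool.

    "transcribe" keeps the spoken language (Japanese here);
    "translate" asks Whisper to output English instead.
    """
    task = "translate" if translate_to_english else "transcribe"
    return [
        "whisper", audio_path,
        "--model", model,          # larger models are slower but usually more accurate
        "--language", "ja",        # tell Whisper the audio is Japanese
        "--task", task,
        "--output_dir", output_dir,
    ]


print(" ".join(build_whisper_command("movie_boosted.mp3", translate_to_english=True)))
```

Running the printed command requires the tool to be installed and, realistically, a decent GPU, which is why a cloud notebook is attractive.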

Taako · Dec 16, 2022

soloporhoy666 said:
I am currently using Whisper, an AI technology that transcribes the dialogue of the movies with more precision. It can be run on your own computer or, as in my case, remotely, that is to say in the cloud, virtually without using my computer's resources. Some programmers are also improving it by adding technologies such as VAD, which greatly improves the quality of the dialogue lines. Even better, in my experience it does not produce cut-off or invented words that do not exist. Naturally it is not perfect, but today it is better than autosub or any other system. It also gives you the .srt file in the original language or, if you prefer, in English.
Here is an example of what the Whisper AI technology can achieve.
View attachment 3119103 View attachment 3119104

Seems promising.
1. How does it handle multiple people talking at once?
2. What if background noise overrides the speaker? Most JAV has the inner dialogue of the actress/actor talking to themselves over music (piano music or whatever), or a musical introduction between scenes.
3. Low talking: the actress talks softly, or the sound person (mic) is too far away.

xsf27 · Dec 16, 2022

soloporhoy666 said:
I am currently using Whisper, an AI technology that transcribes the dialogue of the movies with more precision. It can be run on your own computer or, as in my case, remotely, that is to say in the cloud, virtually without using my computer's resources. Some programmers are also improving it by adding technologies such as VAD, which greatly improves the quality of the dialogue lines. Even better, in my experience it does not produce cut-off or invented words that do not exist. Naturally it is not perfect, but today it is better than autosub or any other system. It also gives you the .srt file in the original language or, if you prefer, in English.
Here is an example of what the Whisper AI technology can achieve.
View attachment 3119103 View attachment 3119104

Thanks for the heads up, I'll definitely check it out. I must say, though, that it does indeed look so much more polished than my rudimentary attempts with Adobe Premiere Pro.

Although after reading @Taako's comments above, I left out the part where Adobe automatically transcribes the results with multiple speakers, the cast of which is automatically populated but later editable after the captions are finished.

However, its accuracy (transcription quality and recognition of different speakers) is something I've yet to assess properly, but it does give me an excuse to peruse through my favourite JAV again lol!

I'll report back once I can get a more definitive answer to these questions, although the quality of the transcription can be better assessed by more (or less non-existent) Japanese linguists.

This brings me to another question about Whisper AI (which, I gather, is currently the gold-standard in automatic transcription) and that is whether it is competent at recognising slang or colloquial terms, not to mention the many expletives prevalent in such 'exotic' movies lol?

soloporhoy666 · Dec 16, 2022

Taako said:
Seems promising.
1. How does it handle multiple people talking at once?
2. What if background noise overrides the speaker? Most JAV has the inner dialogue of the actress/actor talking to themselves over music (piano music or whatever), or a musical introduction between scenes.
3. Low talking: the actress talks softly, or the sound person (mic) is too far away.

Hello. The Whisper artificial intelligence system is really new and it is open source, so people who have the knowledge can add improvements to it. For example, there is currently a version that adds VAD (voice activity detection). As I understand it, this system improves the audio, and combined with Whisper it has given me better results than any other similar program. Take the example that I uploaded: that film was recorded outdoors, and there are between 3 and 6 people talking at a time. You can also find the Colab on this page.
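
To make the VAD idea a bit more concrete: voice activity detection means finding which stretches of the audio contain speech, so that only those stretches are sent to the transcriber. How the Colab version does this is not described in this thread. The sketch below is only a toy energy-threshold detector, far simpler than a real VAD, and the function name, frame size and threshold are all assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def detect_speech(samples: Sequence[float],
                  frame_size: int,
                  threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs of frames that are loud enough.

    A frame counts as "speech" when its mean squared amplitude exceeds the
    threshold. Real VADs use trained models; this is only a loudness check.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for frame_start in range(0, len(samples), frame_size):
        frame = samples[frame_start:frame_start + frame_size]
        energy = sum(s * s for s in frame) / len(frame)
        if energy > threshold:
            if start is None:
                start = frame_start          # a loud stretch begins here
        elif start is not None:
            segments.append((start, frame_start))  # the loud stretch just ended
            start = None
    if start is not None:
        segments.append((start, len(samples)))     # audio ended while still loud
    return segments


# Four quiet samples, eight loud ones, four quiet ones.
audio = [0.0] * 4 + [0.5] * 8 + [0.0] * 4
print(detect_speech(audio, frame_size=4, threshold=0.01))
```

A loudness check like this would also fire on music, which is exactly the background-noise concern raised earlier in the thread, so treat it as an explanation of the concept rather than something to use.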

Taako · Dec 16, 2022

xsf27 said:
Thanks for the heads up, I'll definitely check it out. I must say, though, that it does indeed look so much more polished than my rudimentary attempts with Adobe Premiere Pro.

Although after reading @Taako's comments above, I left out the part where Adobe automatically transcribes the results with multiple speakers, the cast of which is automatically populated but later editable after the captions are finished.

However, its accuracy (transcription quality and recognition of different speakers) is something I've yet to assess properly, but it does give me an excuse to peruse through my favourite JAV again lol!

I'll report back once I can get a more definitive answer to these questions, although the quality of the transcription can be better assessed by more (or less non-existent) Japanese linguists.

This brings me to another question about Whisper AI (which, I gather, is currently the gold-standard in automatic transcription) and that is whether it is competent at recognising slang or colloquial terms, not to mention the many expletives prevalent in such 'exotic' movies lol?

thanks, I can't wait to hear the results.

Taako · Dec 16, 2022

soloporhoy666 said:
Hello. The Whisper artificial intelligence system is really new and it is open source, so people who have the knowledge can add improvements to it. For example, there is currently a version that adds VAD (voice activity detection). As I understand it, this system improves the audio, and combined with Whisper it has given me better results than any other similar program. Take the example that I uploaded: that film was recorded outdoors, and there are between 3 and 6 people talking at a time. You can also find the Colab on this page.

1. So does it work when multiple speakers are talking at the same time?
2. Does it work when the speaker is talking low?
3. Does it work if the sound quality of the movie is low?
4. What movie did you test it on?
5. Did you try a difficult movie with a group talking, such as rctd-459? This has a sub...
but I remember there are scenes of the "sisters" talking while the "brother" has sex with the mom. The brother and mom talk as well.
In this movie, the mom and daughters are unaware, so the talking continues as if the brother is not there.
I wonder how Whisper would handle it?

soloporhoy666 · Dec 16, 2022

Taako said:
1. So does it work when multiple speakers are talking at the same time?
2. Does it work when the speaker is talking low?
3. Does it work if the sound quality of the movie is low?
4. What movie did you test it on?
5. Did you try a difficult movie with a group talking, such as rctd-459? This has a sub...
but I remember there are scenes of the "sisters" talking while the "brother" has sex with the mom. The brother and mom talk as well.
In this movie, the mom and daughters are unaware, so the talking continues as if the brother is not there.
I wonder how Whisper would handle it?

To answer your other question, about the sound and the voice of the actress: yesterday I transcribed a low-quality film, both audio and video (it was SD quality at 480). The actress speaks in a low voice in several of the scenes, and even so the Whisper+VAD program managed to obtain many clean lines of text. The AI is meant to give you real text; in my tests it does not insert invented text or words that do not exist, and at worst it leaves some gaps without translation. Obviously it is not perfect. I have noticed that sometimes, mainly in sex scenes, it repeats some words when it is unable to detect the audio, but that was before I started using the version with VAD. In conclusion, up to now it is the best system that I have tried and it has left me satisfied: I have gone from 200-line subtitles to 500+ lines depending on the movie. Most movies give me about that amount and others many more; the record is 1700 lines. The best thing is that this can only get better. As far as I know, the developers were trained at some point at the TESLA company.
And best of all, it's free.
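
The repeated lines mentioned above can be cleaned up automatically before editing by hand. As a minimal sketch, assuming the subtitles are already available as (start, end, text) entries (the function name is made up for this example), consecutive entries with identical text can be merged into one longer entry:

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]


def collapse_repeats(segments: List[Segment]) -> List[Segment]:
    """Merge back-to-back entries whose text is identical.

    The merged entry keeps the first start time and the last end time.
    Identical lines that are separated by other dialogue are left alone.
    """
    merged: List[Segment] = []
    for start, end, text in segments:
        cleaned = text.strip()
        if merged and merged[-1][2] == cleaned:
            prev_start, _, prev_text = merged[-1]
            merged[-1] = (prev_start, end, prev_text)  # extend the previous entry
        else:
            merged.append((start, end, cleaned))
    return merged


lines = [(0.0, 1.0, "a"), (1.0, 2.0, "a"), (2.0, 3.0, "b"), (3.0, 4.0, "a")]
print(collapse_repeats(lines))
```

Whether a merged entry should stay on screen for the whole span or simply be deleted is a judgment call that still needs a human watching the scene.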

Taako · Dec 16, 2022

soloporhoy666 said:
To answer your other question, about the sound and the voice of the actress: yesterday I transcribed a low-quality film, both audio and video (it was SD quality at 480). The actress speaks in a low voice in several of the scenes, and even so the Whisper+VAD program managed to obtain many clean lines of text. The AI is meant to give you real text; in my tests it does not insert invented text or words that do not exist, and at worst it leaves some gaps without translation. Obviously it is not perfect. I have noticed that sometimes, mainly in sex scenes, it repeats some words when it is unable to detect the audio, but that was before I started using the version with VAD. In conclusion, up to now it is the best system that I have tried and it has left me satisfied: I have gone from 200-line subtitles to 500+ lines depending on the movie. Most movies give me about that amount and others many more; the record is 1700 lines. The best thing is that this can only get better. As far as I know, the developers were trained at some point at the TESLA company.
And best of all, it's free.

Thank you. It sounds promising. I will wait to see how others might like it.
Did you provide a link on how best to run this program? Is it online or a download? Are there certain system requirements?

If it can do well on rctd-459 as a test, I think it might be good. It's not a movie I like, but it would be a good test subject.

soloporhoy666 · Dec 16, 2022

Taako said:
Thank you. It sounds promising. I will wait to see how others might like it.
Did you provide a link on how best to run this program? Is it online or a download? Are there certain system requirements?

If it can do well on rctd-459 as a test, I think it might be good. It's not a movie I like, but it would be a good test subject.

Later I will upload the file or the address of the Colab again, since it works from the cloud. I just need to be at home with my computer to upload the address.

soloporhoy666 · Dec 16, 2022

Taako said:
Thank you. It sounds promising. I will wait to see how others might like it.
Did you provide a link on how best to run this program? Is it online or a download? Are there certain system requirements?

If it can do well on rctd-459 as a test, I think it might be good. It's not a movie I like, but it would be a good test subject.

I am making subtitles for all the films that I like which, it seems, would never have had subtitles, some because they are old and others because of a lack of interest. Now I am enjoying translating and finally being able to know what those stories say.

Taako · Dec 16, 2022

soloporhoy666 said:
I am making subtitles for all the films that I like which, it seems, would never have had subtitles, some because they are old and others because of a lack of interest. Now I am enjoying translating and finally being able to know what those stories say.

Indeed. Which is why I sub my own. Always have. Always will

Taako · Dec 16, 2022

soloporhoy666 said:
Later I will upload the file or the address of the Colab again, since it works from the cloud. I just need to be at home with my computer to upload the address.

Thanks. There's no rush. I'm in the middle of subbing 4 movies currently :cihuy:

soloporhoy666 · Dec 17, 2022

Taako said:
Thank you. It sounds promising. I will wait to see how others might like it.
Did you provide a link on how best to run this program? Is it online or a download? Are there certain system requirements?

If it can do well on rctd-459 as a test, I think it might be good. It's not a movie I like, but it would be a good test subject.

Well, here are some images from a test. For now it is a new record: more than 2000 lines of text. Very good. It is not perfect, as there are some lines that the program repeats, but they are minimal. I hope you are back soon, dark side, and start creating more subtitles with this new system. I also leave the Colab link.

Google Colab

colab.research.google.com
