Hey, has anyone found more open repositories from SLF, SHT, or anyone else? Runbkk is still the best source for me, but he's been less useful over the past year. E.g. I haven't found any soft subs for recent Das! (DASD-) titles.
In other news, since I've been put in an isolation camp, I have spent some time pruning and cleaning my collection of soft subs. So far I've deleted a further 2100+ files as duplicates (many are byte-for-byte copies, others are the same subtitles in different formats: SRT, VTT, ASS, etc.), but there are still some duplicates left that I'm too lazy to weed out, e.g. ones with a tiny timing offset, or simplified vs. traditional Chinese versions. Oh yeah... most of the soft-sub files are Chinese.
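For anyone wanting to do the same kind of cleanup, here is a rough sketch of how the byte-for-byte copies could be found. It is not the exact script I used; the function name `find_exact_duplicates` is made up for illustration, and it only catches identical bytes, not the same subtitle in a different format or with shifted timing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root):
    """Group files under `root` whose contents are byte-for-byte identical.

    Returns a list of groups; each group is a list of paths sharing one
    SHA-256 digest. Files with unique contents are left out.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Within each group you would keep one file and delete the rest. Reading whole files into memory is fine for subtitles since they are tiny.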
I've finally written a little program to detect Korean in subtitle files, and I've used it to isolate the Korean files (now standing at 1330+ files). So if anyone wants to do something with a (small?) cache of KR subs, you can PM me. In terms of programming it was a fun little exercise: I scanned a few hundred Korean SRT files, extracted the 60 most common Korean words, and treat a file as Korean if those 60 words account for more than 3% of it (in fact, I found those 60 words account for over 25% of any piece of Korean text, with very high probability).
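The idea can be sketched roughly like this. It is a simplified reconstruction, not the actual program: the names `top_korean_words` and `looks_korean` are invented here, "words" are assumed to be runs of Hangul syllables, and the 3% is assumed to be measured against the letter-only tokens in the file (so SRT counters and timestamps don't count).

```python
import re
from collections import Counter

HANGUL_RUN = re.compile(r"[\uac00-\ud7a3]+")  # runs of precomposed Hangul syllables
TOKEN = re.compile(r"[^\W\d_]+")              # runs of letters in any script


def top_korean_words(texts, n=60):
    """Return the n most frequent Hangul words across known-Korean sample texts."""
    counts = Counter()
    for text in texts:
        counts.update(HANGUL_RUN.findall(text))
    return {word for word, _ in counts.most_common(n)}


def looks_korean(text, common_words, threshold=0.03):
    """True if the common words make up more than `threshold` of all tokens."""
    tokens = TOKEN.findall(text)
    if not tokens:
        return False
    hits = sum(1 for token in tokens if token in common_words)
    return hits / len(tokens) > threshold
```

You build the word set once from the sample files, then run `looks_korean` on the decoded text of each unknown file. Since real Korean text scores far above the 3% cutoff, the threshold leaves a lot of margin.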
It's a long and tedious effort for very minimal reward. The filenames were a huge mess (this is pretty much 100% fixed), many files are incorrectly formatted, and older files were encoded in GBK, Big5, etc.; these I've been correcting by hand as I find them.
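Once you know (or have guessed by eye) which legacy encoding a file uses, the conversion itself is short. A minimal sketch, with the helper names `reencode_to_utf8` and `convert_file` made up for illustration; the encoding names are Python's standard codec names:

```python
from pathlib import Path


def reencode_to_utf8(raw, source_encoding):
    """Decode bytes from a known legacy encoding and return UTF-8 bytes.

    Decoding is strict, so a wrong guess that produces invalid byte
    sequences raises UnicodeDecodeError instead of silently writing junk.
    """
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src, dst, source_encoding):
    """Write a UTF-8 copy of `src` to `dst`, leaving the original untouched."""
    Path(dst).write_bytes(reencode_to_utf8(Path(src).read_bytes(), source_encoding))
```

For example, `convert_file("old.srt", "old.utf8.srt", "gbk")`. Note that a clean decode doesn't prove the guess was right: Big5 bytes often decode as GBK without any error and just come out as nonsense characters, so the result still needs a quick look.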
There are still some files (140+) with a mystery encoding. Could they be Thai?
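One way to narrow these down would be to try a handful of candidate codecs and look at a preview from each one that decodes cleanly. This is only a sketch: the function name `candidate_decodings` and the candidate list are my own guesses (cp874 is the Windows Thai code page), and a clean decode is a hint, not proof.

```python
def candidate_decodings(
    raw,
    encodings=("utf-8", "gb18030", "big5", "cp949", "shift_jis", "cp874"),
    preview_chars=60,
):
    """Return (encoding, preview) pairs for every codec that decodes `raw` strictly.

    Codecs that hit an invalid byte sequence are skipped. Several codecs
    may succeed on the same bytes, so the previews are meant to be judged
    by eye.
    """
    results = []
    for enc in encodings:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue
        results.append((enc, text[:preview_chars]))
    return results
```

If the cp874 preview reads as actual Thai and the others come out as garbage, that would be a decent sign the file really is Thai.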
Luckily, in Runbkk's and SHT's recent releases (post 2020-ish) all these issues are fixed: filenames are sensible, the encoding is UTF-8, and the formats are 100% compliant.