Optical Character Recognition (OCR) for natural images/scenes

他也不是故意要那么做的
Another option for recognizing text in images: EasyOCR can read text directly from natural images, and it recognizes better than Tesseract.
Requirements:
Windows 10 64-bit
3 GB hard disk
8 GB RAM
Internet connection (needed only once, to download the required languages)
Software
Python
PyTorch
EasyOCR (AI)
GitHub - JaidedAI/EasyOCR: Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari and Cyrillic.
VideoSubFinder
The main purpose of this program is to extract hardcoded subtitles (hardsub) from video. It provides two main features: 1) Autodetection of frames with hardcoded text (hardsub) in the video, saving info about their timing positions.

Installation
Download and Install Python 3.9.4 in C:\python39
python-3.9.4
Install EasyOCR with pip
Open a command prompt (Win+R, then type "cmd")
Bash:
C:\Python39\Scripts\pip.exe install easyocr
There are two options: if you have an NVIDIA video card, run step a); if you only have an AMD/Intel video card, run step b).
a) Only for NVIDIA video cards with CUDA 11.1
Bash:
C:\Python39\Scripts\pip3.exe install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
b) CPU only, for AMD/Intel video cards
Code:
C:\Python39\Scripts\pip3.exe install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
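After either install step, it can help to confirm which build PyTorch actually picked up. The following is only a small sketch; the function name `describe_torch_install` is made up for this post, while `torch.cuda.is_available()` and `torch.cuda.get_device_name()` are regular PyTorch calls.

```python
import importlib.util


def describe_torch_install():
    """Return a one-line summary of the local PyTorch installation."""
    # find_spec avoids an ImportError when PyTorch is not installed yet.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch
    if torch.cuda.is_available():
        # Step a) worked: the CUDA build sees the NVIDIA card.
        return "PyTorch {} with CUDA (GPU: {})".format(
            torch.__version__, torch.cuda.get_device_name(0))
    # Step b), or a CUDA build that cannot find a usable GPU/driver.
    return "PyTorch {} in CPU-only mode".format(torch.__version__)


print(describe_torch_install())
```

If it reports CPU-only mode on a machine with an NVIDIA card, the +cpu wheel was probably installed, or the driver does not match CUDA 11.1.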
Download VideoSubFinder and extract it to C:\VideoSubFinder5x64 (rename the release_x64 folder)
VideoSubfinder 5.5
This program requires the "Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019"
Usage
Run VideoSubFinderWXW.exe
Open the video and adjust the video area (optional, but it makes the search faster)
Press the "Run Search" button

Optional: in Windows Explorer, manually delete the images that contain no text from the folder C:\VideoSubFinder5x64\RGBImages
Run the script (Download)
Bash:
C:\Python39\python.exe easyOcrImage.py
The language and the image directory can also be passed as arguments:
Code:
easyOcrImage.py -l ch_tra -d "c:\yourDirectoryImages"
At the end of the script, the OCR text files are generated in the TXTResults folder
And now just press the "Create Sub From TXTResults" button (saves the subtitle as SRT)
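Before pressing the button, it can be worth checking that every image got a text file. This is only a sketch: the helper name `images_without_text` is made up here, and it assumes the naming used by the script below (image file name plus ".txt", for example frame.jpeg.txt).

```python
import os

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def images_without_text(images_dir, txt_dir):
    """Return image names in images_dir with no non-empty .txt file in txt_dir."""
    missing = []
    for name in sorted(os.listdir(images_dir)):
        if os.path.splitext(name)[1].lower() not in IMAGE_EXTENSIONS:
            continue  # not an image, ignore it
        txt_path = os.path.join(txt_dir, name + ".txt")
        if not os.path.isfile(txt_path) or os.path.getsize(txt_path) == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    root = r"C:\VideoSubFinder5x64"
    images = os.path.join(root, "RGBImages")
    texts = os.path.join(root, "TXTResults")
    if os.path.isdir(images) and os.path.isdir(texts):
        for name in images_without_text(images, texts):
            print("No OCR text for:", name)
```

Images listed here either contain no text or were skipped by the OCR run.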

Python:
import os
import argparse

directoryDefault = r'C:\VideoSubFinder5x64\RGBImages'
extensions = [".jpg", ".png", ".jpeg", ".bmp"]
languagesDefault = "ch_tra"


def main():
    # codeLanguages (the language table below) is defined before main() is called.
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter,
        description=r"easyOcrImage.py -l en,ch_tra -d " + directoryDefault,
        epilog=codeLanguages)
    parser.add_argument('-l', '--langs', dest="langs", default=languagesDefault,
                        help="Comma-separated codes, e.g. \"en,ch_tra\" to mix English and Traditional Chinese")
    parser.add_argument('-d', '--directory', dest="directory", default=directoryDefault,
                        help='Directory containing the images to process')
    args = parser.parse_args()
    if not os.path.isdir(args.directory):
        print("Directory does not exist: " + args.directory)
        return
    # Write to the sibling TXTResults folder (used by VideoSubFinder) if it exists,
    # otherwise write the text files next to the images.
    parentDirectory = os.path.dirname(args.directory)
    directoryTXTResults = os.path.join(parentDirectory, "TXTResults")
    if os.path.isdir(directoryTXTResults):
        directoryTxt = directoryTXTResults
    else:
        directoryTxt = args.directory
    # Show progress in the console window title (Windows "title" command).
    os.system("title OCR for " + args.directory + " - " + args.langs)
    # Imported here so that --help works without loading EasyOCR.
    import easyocr
    reader = easyocr.Reader(args.langs.replace(" ", "").split(","))
    # Compare extensions in lower case so that files such as IMAGE.JPG are included.
    files = [x for x in os.listdir(args.directory)
             if os.path.splitext(x)[1].lower() in extensions]
    for i, x in enumerate(files):
        os.system("title OCR {}/{} Processed".format(i, len(files)))
        fileImage = os.path.join(args.directory, x)
        fileTxt = os.path.join(directoryTxt, x)
        # detail=0 returns only the text; paragraph=True merges nearby lines.
        result = reader.readtext(fileImage, detail=0, paragraph=True)
        with open(fileTxt + ".txt", "w", encoding="utf-8") as f:
            f.write(" ".join(result))


codeLanguages="""Languages
Code Name
--- ----
abq Abaza
ady Adyghe
af Afrikaans
ang Angika
ar Arabic
as Assamese
ava Avar
az Azerbaijani
be Belarusian
bg Bulgarian
bh Bihari
bho Bhojpuri
bn Bengali
bs Bosnian
ch_sim Simplified Chinese
ch_tra Traditional Chinese
che Chechen
cs Czech
cy Welsh
da Danish
dar Dargwa
de German
en English
es Spanish
et Estonian
fa Persian (Farsi)
fr French
ga Irish
gom Goan Konkani
hi Hindi
hr Croatian
hu Hungarian
id Indonesian
inh Ingush
is Icelandic
it Italian
ja Japanese
kbd Kabardian
kn Kannada
ko Korean
ku Kurdish
la Latin
lbe Lak
lez Lezghian
lt Lithuanian
lv Latvian
mah Magahi
mai Maithili
mi Maori
mn Mongolian
mr Marathi
ms Malay
mt Maltese
ne Nepali
new Newari
nl Dutch
no Norwegian
oc Occitan
pl Polish
pt Portuguese
ro Romanian
ru Russian
rs_cyrillic Serbian (cyrillic)
rs_latin Serbian (latin)
sck Nagpuri
sk Slovak (need revisit)
sl Slovenian
sq Albanian
sv Swedish
sw Swahili
ta Tamil
tab Tabassaran
te Telugu
th Thai
tl Tagalog
tr Turkish
ug Uyghur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese (need revisit)"""


if __name__ == "__main__":
    main()
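The script calls readtext with detail=0, so only the text is kept. With EasyOCR's default output (detail=1, paragraph=False), each detection comes back as a (bounding box, text, confidence) entry, which makes it possible to drop low-confidence noise. A rough sketch of that idea follows; the function name `keep_confident_text` and the 0.4 threshold are just assumptions for illustration, and the sample data is written by hand so the snippet runs without EasyOCR.

```python
def keep_confident_text(results, min_confidence=0.4):
    """Join the text of detections whose confidence is at least min_confidence.

    `results` is a list of (bounding_box, text, confidence) entries,
    the shape returned by reader.readtext(image) with default settings.
    """
    return " ".join(text for _box, text, confidence in results
                    if confidence >= min_confidence)


# Hand-written sample in the same shape as EasyOCR's default output.
sample = [
    ([[0, 0], [90, 0], [90, 20], [0, 20]], "他也不是", 0.93),
    ([[95, 0], [120, 0], [120, 20], [95, 20]], "~", 0.08),
    ([[125, 0], [260, 0], [260, 20], [125, 20]], "故意要那么做的", 0.88),
]
print(keep_confident_text(sample))
```

The threshold is something to tune per video; too high a value will also throw away real subtitle text.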
NEW NEW NEW February 2025
OCR WeChat
It offers OCR with automatic language detection for Chinese, Japanese and English (horizontal text).
UMI-OCR
It has a GUI and supports a variety of offline OCR engines: PaddleOCR, RapidOCR, Tesseract and WeChatOCR.
Installation
- Install WeChat on your Windows 10 PC
WeChat - Free messaging and calling app
Weixin for Windows
Available for all kinds of platforms; enjoy group chat; support voice, photo, video and text messages.
Tested with 3.9.12.17 x64
- Install Python 3.xx

- Download the PLUGIN (4 files):
1. wechatocr.py: it can also run on its own; only wcocr.pyd is needed in the same directory
2. wcocr.pyd: required (DLL)
3. wechat_config.py
4. __init__.py
NOTE:
wcocr.pyd only works with WeChat version 3; with version 4 it may stop working. The DLL you need is wcocr.pyd, and updated builds can be found at:
https://github.com/swigger/wechat-ocr
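- Install UMI-OCR (GUI)
Copy the plugin files into the plugins folder, for example:
C:\Program Files\Umi-OCR_v2.1.4\UmiOCR-data\plugins
Run Umi-OCR.exe and change the path settings to point to the WeChat folder and to WeChatOCR.exe: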

C:\Program Files\Tencent\WeChat\[3.9.12.17]
C:\Users\YOURNAME\AppData\Roaming\Tencent\WeChat\XPlugin\Plugins\WeChatOCR\7079\extracted\WeChatOCR.exe
Only 4 files and the directory containing the models (60 MB) are really needed. It is recommended to copy them into a single folder and then change the location, either in wechatocr.py (if you only use the standalone script) and/or in the UMI-OCR settings.
mmmojo.dll
mmmojo_64.dll
WeChatOCR.exe
x64.config
Model/
    FPOCRRecog.xnet
    OCRDetFP32.xnet.nas
    OCRParaDetV1.1.0.26.xnet
    OCRRecogFP32V1.1.0.26.xnet
    RecDict
    sohu_simp.txt
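After copying everything into a single folder, a quick check like the one below can confirm nothing was left behind. It is only a sketch: the helper name `missing_wechat_ocr_files` and the folder C:\WeChatOCR are hypothetical; the file names are the ones listed above.

```python
import os

REQUIRED_FILES = ["mmmojo.dll", "mmmojo_64.dll", "WeChatOCR.exe", "x64.config"]
MODEL_DIR = "Model"


def missing_wechat_ocr_files(folder):
    """Return the required files (and the Model directory) missing from folder."""
    missing = [name for name in REQUIRED_FILES
               if not os.path.isfile(os.path.join(folder, name))]
    if not os.path.isdir(os.path.join(folder, MODEL_DIR)):
        missing.append(MODEL_DIR + "/")
    return missing


if __name__ == "__main__":
    folder = r"C:\WeChatOCR"  # hypothetical location of the copied files
    if os.path.isdir(folder):
        print(missing_wechat_ocr_files(folder) or "All files found")
```

It does not look inside Model/, so a partially copied model folder would still pass.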
OTHER LINKS
GitHub - hiroi-sora/Umi-OCR_plugins: Umi-OCR plugin library
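Releases · eaeful/WechatOCR_umi_plugin
A WeChatOCR plugin for the Umi-OCR text recognition tool that calls WeChat OCR to do local, offline text recognition. The project comes in two versions: one where the plugin bundles the WeChat OCR files, and another where you enter the WeChat installation directory and the wechatocr.exe path manually. - eaeful/WechatOCR_umi_plugin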
This is another WeChatOCR plugin that includes the full OCR files, but the plugin is a little unstable; it is recommended to use only its EXE, DLLs and models.
New: a plugin for UMI-OCR that works with or without UMI-OCR (requires wcocr.pyd).
Python:
import os
import base64

directoryDefault = r'C:\VideoSubFinder5x64\RGBImages'
pathlocal = {
    "path": r"C:\Program Files\Tencent\WeChat\[3.9.12.17]",
    # C:\Users\yourname\AppData\Roaming\Tencent\WeChat\XPlugin\Plugins\WeChatOCR\7079\extracted\WeChatOCR.exe
    "pathOCR": os.path.join(os.getenv("APPDATA", ""), r"Tencent\WeChat\XPlugin\Plugins\WeChatOCR\7079\extracted\WeChatOCR.exe"),
}


def main():
    import argparse
    parser = argparse.ArgumentParser(formatter_class=argparse.RawDescriptionHelpFormatter,
                                     description=r"wechatocr -d directory/image")
    parser.add_argument('-d', '--file_dir', default=directoryDefault, help='Directory or image')
    args = parser.parse_args()
    if os.path.isfile(args.file_dir):
        # Single image: keep only the file name so it can be joined with either directory.
        files = [os.path.basename(args.file_dir)]
        directory = os.path.dirname(args.file_dir)
    elif os.path.isdir(args.file_dir):
        directory = args.file_dir
        files = [x for x in os.listdir(directory)
                 if os.path.splitext(x)[1].lower() in [".jpg", ".png", ".jpeg", ".bmp"]]
    else:
        print("Not found: " + args.file_dir)
        return
    # Write to the sibling TXTResults folder if it exists, otherwise next to the images.
    parentDirectory = os.path.dirname(directory)
    directoryTXTResults = os.path.join(parentDirectory, "TXTResults")
    if os.path.isdir(directoryTXTResults):
        directoryTxt = directoryTXTResults
    else:
        directoryTxt = directory
    os.system("title OCR for " + directory)
    ocrApi = Api(pathlocal)
    err = ocrApi.start()
    if err:
        # start() returns an error message when wcocr cannot be loaded or initialized.
        print(err)
        return
    print("Processing in: " + directory)
    for i, x in enumerate(files):
        os.system("title OCR {}/{} Processed {}".format(i, len(files), x))
        fileImage = os.path.join(directory, x)
        fileTxt = os.path.join(directoryTxt, x)
        results = ocrApi.runPath(fileImage)  # one call per image is enough
        if results["code"] != 100:
            # On error "data" holds a message, not a list of detections.
            print("{}: {}".format(x, results["data"]))
            continue
        text = [item['text'] for item in results["data"]]
        with open(fileTxt + ".txt", "w", encoding="utf-8") as f:
            f.write(" ".join(text))


def cxybox(x):
    # Convert left/top/right/bottom into four corner points (clockwise from top-left).
    return [[int(x["left"]), int(x["top"])], [int(x["right"]), int(x["top"])],
            [int(x["right"]), int(x["bottom"])], [int(x["left"]), int(x["bottom"])]]


class Api:
    def __init__(self, globalArgd):
        self.lang = ""
        self.path = globalArgd["path"]
        self.pathOCR = globalArgd["pathOCR"]

    def start(self, argd=None):
        # Relative import when loaded as a Umi-OCR plugin, plain import when run standalone.
        try:
            from . import wcocr
        except ImportError:
            try:
                import wcocr
            except ImportError:
                return "[Error] In Module wcocr (wcocr.pyd)"
        self.wcocr = wcocr
        if self.wcocr.init(self.pathOCR, self.path):
            return ""
        return "[Error] Initialization OCR (Not Found path)"

    def stop(self):
        self.wcocr = None

    def runPath(self, imgPath: str):
        return self._ocr(imgPath)

    def runBytes(self, imageBytes):
        # wcocr works on files, so the bytes are written to a temporary image first.
        imgPath = "temp_image.png"
        with open(imgPath, "wb") as f:
            f.write(imageBytes)
        return self._ocr(imgPath)

    def runBase64(self, imageBase64):
        try:
            imageBytes = base64.b64decode(imageBase64)
            return self.runBytes(imageBytes)
        except Exception as e:
            return {"code": 102, "data": f"[Error] Base64:{str(e)}"}

    def _ocr(self, imgPath):
        results = self.wcocr.ocr(imgPath)
        if "ocr_response" in results:
            res = {
                "code": 100,
                "data": [{"text": x['text'], "box": cxybox(x), "score": x['rate']}
                         for x in results['ocr_response']],
            }
        else:
            res = {"code": 102, "data": "Error in results"}
        return res


if __name__ == "__main__":
    main()
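The plugin returns each detection with its text, box and score, so the lines can be reordered before joining them. The sketch below sorts by the top-left corner of each box; the function name `text_in_reading_order` is made up, the sample result is written by hand in the same shape that `_ocr` produces, and sorting by corner is a simplification that assumes horizontal text in a single column.

```python
def text_in_reading_order(result):
    """Join detected text top-to-bottom, then left-to-right.

    `result` has the shape returned by Api._ocr:
    {"code": 100, "data": [{"text": ..., "box": [[x, y], ...], "score": ...}]}
    """
    if result.get("code") != 100:
        return ""  # error results carry a message instead of detections
    # box[0] is the top-left corner: sort by y first, then by x.
    entries = sorted(result["data"], key=lambda e: (e["box"][0][1], e["box"][0][0]))
    return " ".join(e["text"] for e in entries)


# Hand-written sample, not real OCR output.
sample_result = {
    "code": 100,
    "data": [
        {"text": "second line", "box": [[10, 40], [200, 40], [200, 60], [10, 60]], "score": 0.90},
        {"text": "first line", "box": [[10, 5], [180, 5], [180, 25], [10, 25]], "score": 0.95},
    ],
}
print(text_in_reading_order(sample_result))
```

For two-line subtitles this keeps the upper line first even if the engine reports the detections in a different order.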
Reminder, Download link 28,000+ Subtitle pack! (2001-2021)
https://www.akiba-online.com/thread...not-a-sub-request-thread.1920331/post-4193115