OCR Text Recognition in Python

Python offers many libraries for OCR. This article covers Tesseract, ddddocr, CnOCR, and paddleocr.

Tesseract

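Tesseract is an open-source OCR engine that works out of the box. It was originally developed at Hewlett-Packard Laboratories, ported to Windows in 1996, and partially converted to C++ in 1998. HP released it as open source in 2005, and Google led its development from 2006 to 2018. The example below uses pytesseract, a Python wrapper that requires the Tesseract engine itself to be installed on the system.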

"""
Typical captcha workflow:
Step 1: locate the captcha element and take a screenshot of the current browser page
Step 2: get the captcha coordinates, plus the sizes of the captcha image, the browser window, and the screenshot
Step 3: crop the captcha out of the screenshot and save it
Step 4: recognize the captcha code
"""
import pytesseract
from PIL import Image, ImageEnhance

# Path of the original image
image_path = "../img/test.png"
# Path of the preprocessed image
save_path = "../img/savePng.png"

# Open the original image and convert its mode: "L" (grayscale) or "RGB"
img = Image.open(image_path).convert("L")

# Enhance the image to improve the recognition rate
img = ImageEnhance.Color(img).enhance(0)       # remove any remaining color
img = ImageEnhance.Brightness(img).enhance(2)  # brighten
img = ImageEnhance.Contrast(img).enhance(8)    # increase contrast
img = ImageEnhance.Sharpness(img).enhance(20)  # sharpen
img = ImageEnhance.Contrast(img).enhance(2.0)  # increase contrast once more
img.save(save_path)

# Recognize the text in the preprocessed image
code = pytesseract.image_to_string(Image.open(save_path)).strip()
print(f"Extracted text: {code}")
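
Steps 1 to 3 in the comment above come down to mapping the captcha element's position on the page to pixel coordinates in the screenshot. The following is a minimal sketch of that arithmetic. It assumes the element's location and size come from a browser automation tool such as Selenium (`element.location`, `element.size`), and that the screenshot may be scaled relative to the browser window, for example on a high-DPI display. The function name `captcha_crop_box` and its parameters are hypothetical.

```python
def captcha_crop_box(location, size, screenshot_width, browser_width):
    """Return (left, top, right, bottom) in screenshot pixels.

    location: dict with "x" and "y" (top-left corner of the element, page pixels)
    size: dict with "width" and "height" (element size, page pixels)
    screenshot_width / browser_width gives the scale factor between the two.
    """
    scale = screenshot_width / browser_width
    left = round(location["x"] * scale)
    top = round(location["y"] * scale)
    right = round((location["x"] + size["width"]) * scale)
    bottom = round((location["y"] + size["height"]) * scale)
    return left, top, right, bottom


# Hypothetical values: a 100x40 captcha at (200, 300), screenshot taken at 2x scale
box = captcha_crop_box({"x": 200, "y": 300}, {"width": 100, "height": 40},
                       screenshot_width=2560, browser_width=1280)
print(box)
```

The returned tuple has the shape Pillow's `Image.crop` expects, so the captcha could then be cut out of a saved screenshot and written to the path the recognition code reads from.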

ddddocr

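ddddocr is an open-source OCR library built on deep learning and aimed mainly at captcha recognition. It ships with pretrained models and runs offline. Besides recognizing the characters in an image, it can also detect the regions where targets appear.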

pip install ddddocr

import ddddocr

ocr = ddddocr.DdddOcr(old=True)
with open('../img/test.png', 'rb') as f:
    img_bytes = f.read()
res = ocr.classification(img_bytes)
print('Recognized text: ' + res)

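import io

import ddddocr
from PIL import Image

# Initialize the OCR engine: one instance for detection, one for recognition
det = ddddocr.DdddOcr(det=True)
ocr = ddddocr.DdddOcr()

# Read the ID card image
image_path = 'path_to_your_image.jpg'
with open(image_path, 'rb') as f:
    image_bytes = f.read()

# Image preprocessing
# TODO: preprocess the image (crop, resize, convert to grayscale, etc.)

# Text region detection: each box is [x1, y1, x2, y2]
text_boxes = det.detection(image_bytes)

# Text recognition: crop each region and recognize it separately
image = Image.open(io.BytesIO(image_bytes))
results = []
for x1, y1, x2, y2 in text_boxes:
    buf = io.BytesIO()
    image.crop((x1, y1, x2, y2)).save(buf, format='PNG')
    results.append(ocr.classification(buf.getvalue()))

# Result parsing
# TODO: parse and post-process the results to extract the key fields on the ID card

# Print the recognition results
for result in results:
    print(result)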
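
How the result-parsing TODO above is filled in depends on the card layout, but the ID number is comparatively easy to pick out because it has a fixed format. The following is a minimal sketch, not part of ddddocr. It assumes `results` is a list of recognized strings and that the card carries an 18-character mainland China resident ID number, whose last character is a check digit defined by GB 11643-1999. The helper names are hypothetical.

```python
import re

# Weights and check characters of the GB 11643-1999 checksum (ISO 7064 MOD 11-2)
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"
ID_PATTERN = re.compile(r"(?<!\d)\d{17}[\dXx](?![\dXx])")


def is_valid_id_number(id_number):
    """Check the last character of an 18-character ID number against its checksum."""
    total = sum(int(d) * w for d, w in zip(id_number[:17], WEIGHTS))
    return CHECK_CHARS[total % 11] == id_number[17].upper()


def extract_id_numbers(results):
    """Return every checksum-valid ID number found in the recognized strings."""
    found = []
    for text in results:
        # OCR output often contains stray spaces inside long digit runs
        compact = re.sub(r"\s+", "", text)
        for match in ID_PATTERN.findall(compact):
            if is_valid_id_number(match):
                found.append(match.upper())
    return found


# Made-up OCR output; 11010519491231002X is a commonly used sample number
sample = ["姓名 张三", "公民身份号码 110105 19491231 002X", "1101051949123100 21"]
print(extract_id_numbers(sample))
```

Validating the checksum filters out many misreads, since a single wrong digit almost always changes the expected check character.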

CnOCR

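CnOCR is an OCR (Optical Character Recognition) toolkit for Python 3. It recognizes common characters in Simplified Chinese, Traditional Chinese (some models only), English, and digits, and it supports vertical text. It ships with more than 20 pretrained recognition models for different scenarios, all usable right after installation. CnOCR also provides simple training commands for users who want to train their own models.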

$ pip install cnocr[ort-cpu]

from cnocr import CnOcr

img_fp = './docs/examples/huochepiao.jpeg'
ocr = CnOcr()  # use the default value for every parameter
out = ocr.ocr(img_fp)
print(out)
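
The value printed above is one entry per detected text line rather than a single string. The following is a minimal sketch of turning it into plain text. It assumes each entry is a dict with `text` and `score` keys, as the CnOCR documentation describes, which is worth checking against the installed version. The helper `join_text`, the threshold, and the sample data are made up for illustration.

```python
def join_text(out, min_score=0.5):
    """Join recognized lines, dropping entries whose confidence is below min_score."""
    return "\n".join(item["text"] for item in out if item["score"] >= min_score)


# Made-up data in the assumed shape of the CnOcr().ocr() result
sample_out = [
    {"text": "Beijing South", "score": 0.98},
    {"text": "~|", "score": 0.21},
    {"text": "G1234", "score": 0.91},
]
print(join_text(sample_out))
```

Filtering on the score is a cheap way to discard noise from stamps, borders, or background patterns. The right threshold depends on the images.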

paddleocr

References

https://github.com/tesseract-ocr/tesseract
https://github.com/sml2h3/ddddocr
https://aka.ms/vs/16/release/VC_redist.x86.exe
https://aka.ms/vs/16/release/VC_redist.x64.exe
https://cnocr.readthedocs.io/zh/latest/
https://github.com/PaddlePaddle/PaddleOCR