Without further ado, here is the code:
from PIL import Image
from pyocr import tesseract
tesseract.TESSERACT_CMD = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
print(tesseract.image_to_string(Image.open('test.png')))
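How I got there: after installing tesseract-ocr, I wondered how the pyocr library would locate and call the tesseract-ocr engine, and sure enough, it failed with an error. I figured pyocr's tesseract module must expose some attribute that plays the role of an environment variable, so I ran: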
dir(tesseract)
It returned the following:
['CharBoxBuilder', 'DigitBuilder', 'ReOpenableTempfile', 'TESSDATA_EXTENSION', 'TESSERACT_CMD', 'TesseractError', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_set_environment', 'builders', 'can_detect_orientation', 'cleanup', 'codecs', 'detect_orientation', 'digits_only', 'g_creation_flags', 'g_subprocess_startup_info', 'g_version', 'get_available_builders', 'get_available_languages', 'get_name', 'get_version', 'image_to_string', 'is_available', 'logger', 'logging', 'os', 'psm_parameter', 'run_tesseract', 'shutil', 'subprocess', 'sys', 'tempfile']
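A listing that long is easier to scan if you filter it down to the ALL_CAPS names, which is where module-level settings usually live. The sketch below is only an illustration: `constant_names` is a hypothetical helper, and `demo` is a stand-in module with placeholder values so the snippet runs without pyocr installed. With pyocr available you would pass `tesseract` instead.

```python
import types


def constant_names(module):
    """Return the module's public ALL_CAPS attribute names, sorted."""
    return sorted(
        name for name in dir(module)
        if name.isupper() and not name.startswith("_")
    )


# Stand-in module for illustration only (placeholder values, not pyocr itself).
demo = types.ModuleType("demo")
demo.TESSERACT_CMD = "tesseract"
demo.TESSDATA_EXTENSION = ".traineddata"
demo.image_to_string = lambda image: ""

print(constant_names(demo))
```

Applied to the listing above, the same filter would leave just TESSDATA_EXTENSION and TESSERACT_CMD.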
And there was the attribute I needed: TESSERACT_CMD.
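If you would rather not hardcode the install path, one option is to look on PATH first and fall back to a list of known locations. This is a minimal sketch using only the standard library. `find_tesseract_cmd` is a hypothetical helper, not part of pyocr, and the fallback path is simply the one used in the snippet at the top of this post.

```python
import os
import shutil


def find_tesseract_cmd(candidates=()):
    """Return the first usable tesseract executable path, or None.

    Hypothetical helper: checks PATH first, then the given fallback paths.
    """
    found = shutil.which("tesseract")
    if found:
        return found
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


cmd = find_tesseract_cmd([r'C:\Program Files\Tesseract-OCR\tesseract.exe'])
print(cmd)
# If cmd is not None, it can be assigned the same way as in the first snippet:
#   tesseract.TESSERACT_CMD = cmd
```

If the helper returns None, the engine is neither on PATH nor at any of the fallback locations, so the install itself is the first thing to check.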