
HALCON OCR character recognition: notes on selected parameters

后源
2023-12-01

 

HALCON OCR character recognition: main operator

create_text_model_reader( : : Mode, OCRClassifier : TextModel)

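Creates a text model that describes how the text is to be segmented.

Mode selects the text segmentation method: "auto" or "manual". "manual" is intended for the following cases:

Text with strong local polarity changes has to be segmented. For example, engraved text often shows strong local changes because of reflections.

No suitable OCR classifier is available.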

Parameters of text models with Mode = 'auto'

Segmentation behavior

"min_contrast": The minimal contrast the characters have to their surrounding background.

List of values: integer or float value between 1 and 255 for byte images and between 1 and 65535 for uint2 images

Default value: 15 (minimum gray-value difference)

"polarity": "dark_on_light" if the text to be segmented is darker than its background, "light_on_dark" if the text to be segmented is lighter than its background, and "both" if both kinds of text are to be segmented.

List of values: "dark_on_light", "light_on_dark", "both"

Default value: "both" (polarity: light text on a dark background or dark text on a light background)
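How "min_contrast" and "polarity" interact can be sketched in plain Python. This is not HALCON code and not how find_text is implemented. The function name passes_contrast and the idea of comparing one character gray value against one background gray value are assumptions made only for illustration, as is treating the minimum contrast as an inclusive bound.

```python
def passes_contrast(char_gray, background_gray, min_contrast=15, polarity="both"):
    """Return True if a character/background gray-value pair fits the settings.

    Hypothetical helper: it only models the parameter descriptions above.
    """
    if polarity not in ("dark_on_light", "light_on_dark", "both"):
        raise ValueError(f"unknown polarity: {polarity!r}")
    # Positive when the text is darker than its background.
    difference = background_gray - char_gray
    if abs(difference) < min_contrast:
        return False
    if polarity == "dark_on_light":
        return difference > 0
    if polarity == "light_on_dark":
        return difference < 0
    return True  # "both" accepts either direction


# Dark text (40) on a light background (200), then the inverse case.
print(passes_contrast(40, 200, polarity="dark_on_light"))
print(passes_contrast(200, 40, polarity="dark_on_light"))
```

Restricting "polarity" to the direction that actually occurs in the image is one way to keep candidates of the opposite polarity out of the result.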

"eliminate_border_blobs":"true" ifregions that are touching the border of the image domain should be discarded, otherwise "false".

List ofvalues: "true","false"

Default value: "false" 排除接触的边界blob
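The effect of "eliminate_border_blobs" can be pictured with a small Python sketch. It assumes a rectangular image domain and axis-aligned bounding boxes given as (row1, col1, row2, col2). Both the representation and the helper names are hypothetical, since HALCON works on arbitrary regions.

```python
def touches_border(box, domain):
    """True if the box reaches any edge of the rectangular domain."""
    r1, c1, r2, c2 = box
    dr1, dc1, dr2, dc2 = domain
    return r1 <= dr1 or c1 <= dc1 or r2 >= dr2 or c2 >= dc2


def drop_border_blobs(boxes, domain, eliminate_border_blobs="false"):
    """Discard boxes touching the domain border when the option is "true"."""
    if eliminate_border_blobs != "true":
        return list(boxes)
    return [box for box in boxes if not touches_border(box, domain)]


domain = (0, 0, 99, 199)  # a 100 x 200 pixel domain
boxes = [(0, 10, 20, 30), (40, 50, 60, 70), (80, 180, 99, 199)]
print(drop_border_blobs(boxes, domain, "true"))
```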

"add_fragments":"true" iffragments, such as the dot on the 'i', should be added to the segmentedcharacters, otherwise 'false'. Be aware, that this can cause noise to be addedto the segmented characters.

List ofvalues: "true","false"

Default value: "true" 是否有额外的碎片,如i

Character size

"min_char_height":

The minimal height of the characters in pixels. If text of arbitrary height is to be segmented, "auto" may be passed. Note that "min_char_height" refers to characters only. The height of punctuation marks or separators is not restricted by "min_char_height".

List of values: integer or float value greater than or equal to 1

Default value: "auto" (minimum character height)

"max_char_height":

The maximal height of the characters in pixels. If text of arbitrary height is to be segmented, "auto" may be passed. Note that "max_char_height" refers to characters only. The height of punctuation marks or separators is not restricted by "max_char_height".

List of values: integer or float value greater than or equal to 1

Default value: "auto" (maximum character height)

"min_char_width":

The minimal width of the characters in pixels. If text of arbitrary width is to be segmented, "auto" may be passed. Note that "min_char_width" refers to characters only. The width of punctuation marks or separators is not restricted by "min_char_width".

List of values: integer or float value greater than or equal to 1

Default value: "auto" (minimum character width)

"max_char_width":

The maximal width of the characters in pixels. If text of arbitrary width is to be segmented, "auto" may be passed. Note that "max_char_width" refers to characters only. The width of punctuation marks or separators is not restricted by "max_char_width".

List of values: integer or float value greater than or equal to 1

Default value: "auto" (maximum character width)

"min_stroke_width":

The minimal stroke width of the characters in pixels. If the minimal stroke width is to be estimated automatically within the text segmentation process, "auto" may be passed. Note that "min_stroke_width" refers to characters only. The stroke width of punctuation marks or separators is not restricted by "min_stroke_width".

List of values: integer or float value greater than or equal to 1

Default value: "auto" (minimum stroke width)

"max_stroke_width":

The maximal stroke width of the characters in pixels. If the maximal stroke width is to be estimated automatically within the text segmentation process, "auto" may be passed. Note that "max_stroke_width" refers to characters only. The stroke width of punctuation marks or separators is not restricted by "max_stroke_width".

List of values: integer or float value greater than or equal to 1

Default value: "auto" (maximum stroke width)
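The height and width limits above can be read as a filter over segmented candidates. The following Python sketch models only that reading. The candidate dictionaries, the "kind" field and the function names are hypothetical, and "auto" is treated simply as "no restriction", which is an assumption. Stroke width is left out because estimating it needs image data.

```python
def within_limit(value, minimum="auto", maximum="auto"):
    """Check a measurement against optional limits ("auto" = unrestricted here)."""
    if minimum != "auto" and value < minimum:
        return False
    if maximum != "auto" and value > maximum:
        return False
    return True


def filter_by_size(candidates, min_char_height="auto", max_char_height="auto",
                   min_char_width="auto", max_char_width="auto"):
    """Keep candidates that satisfy the character size limits.

    Punctuation marks and separators are never rejected, mirroring the note
    that the limits refer to characters only.
    """
    kept = []
    for cand in candidates:
        if cand["kind"] != "character":
            kept.append(cand)
            continue
        if (within_limit(cand["height"], min_char_height, max_char_height)
                and within_limit(cand["width"], min_char_width, max_char_width)):
            kept.append(cand)
    return kept


candidates = [
    {"text": "A", "height": 30, "width": 20, "kind": "character"},
    {"text": ".", "height": 4, "width": 4, "kind": "punctuation"},
    {"text": "?", "height": 8, "width": 6, "kind": "character"},  # small noise blob
]
print([c["text"] for c in filter_by_size(candidates, min_char_height=20)])
```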

Special characters

"return_punctuation": "true" if small punctuation marks that lie close to the base line of the corresponding text line (e.g. dots or commas) are to be returned. "false" if no such punctuation marks should be returned.

List of values: "true", "false"

Default value: "true" (return punctuation)

"return_separators": "true" if separators such as a minus or the equality sign should be returned as well. "false" if no separators should be returned.

List of values: "true", "false"

Default value: "true" (return separators)

Handling of dot prints

"dot_print": "true" if the text to be segmented contains dot printed characters, otherwise "false".

List of values: "true", "false"

Default value: "false" (whether the text is dot printed)

"dot_print_tight_char_spacing": "true" if the gap between adjacent characters is smaller than the largest gap between two dots within a single character, otherwise "false". If "dot_print" is set to "false" this parameter does not have any effect. In cases where the minimal gap size between characters is exactly known, "dot_print_min_char_gap" can be set instead. In this case the value of "dot_print_tight_char_spacing" is ignored.

List of values: "true", "false"

Default value: "false" (whether the character spacing of the dot print is smaller than the dot spacing)

"dot_print_min_char_gap": The minimal gap size between two characters in pixels. This parameter can be used to improve the text result in cases where the minimal gap size between characters is smaller than the maximal gap size between dots within characters. If the minimal character gap size is not known or is bigger than the maximal dot gap size, "auto" may be passed. If "dot_print" is set to "false" this parameter does not have any effect. In cases where the minimal gap size between characters is not known but the characters are printed close to each other, "dot_print_tight_char_spacing" might be used instead.

Here, the minimal gap size between characters is 8 pixels.

List of values: integer or float value greater than or equal to 0

Default value: "auto" (minimum gap)

"dot_print_max_dot_gap":

The maximal gap size between two dots within a character in pixels. If arbitrary dot printed characters are to be segmented, "auto" may be passed. If "dot_print" is set to "false" this parameter does not have any effect. In cases where the maximal dot gap size is larger than or equal to the minimal gap size between characters, "dot_print_tight_char_spacing" or "dot_print_min_char_gap" should be set accordingly. Setting "dot_print_max_dot_gap" can reduce the runtime of FindText significantly.

Here, the maximal gap size between dots is 3 pixels.

List of values: integer or float value greater than or equal to 1

Default value: "auto" (maximum gap)
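The rules for choosing the dot print parameters can be summarized as a small decision helper. The Python sketch below is hypothetical (the function name and the idea of passing in measured gap lists are assumptions) and only encodes the text above: "dot_print_min_char_gap" is set when the largest dot gap is not smaller than the smallest character gap, otherwise it stays at "auto".

```python
def suggest_dot_print_params(char_gaps, dot_gaps):
    """Derive dot print settings from measured gaps (in pixels).

    char_gaps: gaps between adjacent characters.
    dot_gaps:  gaps between dots inside single characters.
    """
    min_char_gap = min(char_gaps)
    max_dot_gap = max(dot_gaps)
    params = {
        "dot_print": "true",
        "dot_print_max_dot_gap": max_dot_gap,
        "dot_print_min_char_gap": "auto",
    }
    if max_dot_gap >= min_char_gap:
        # Characters sit at least as close together as the dots inside them,
        # so the known minimal character gap is passed explicitly.
        params["dot_print_min_char_gap"] = min_char_gap
    return params


# The values quoted above: character gap 8 pixels, dot gap 3 pixels.
print(suggest_dot_print_params(char_gaps=[8, 10], dot_gaps=[2, 3]))
```

With the gaps quoted above (8 and 3 pixels) the character gap can stay at "auto"; the explicit value only matters for tightly printed text.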

Line structures

"text_line_structure": To simplify the search for specific structures (e.g. dates or serial numbers) within the segmented text, it is possible to define text line structures. For each text line the distances between the characters are calculated, and based on these distances, the text line is divided into text blocks. Short characters such as '.', '_' and '-' are ignored in this process and treated as spaces. Furthermore, it is possible to define user specific separators which are also ignored. See the description of "text_line_separators" for details. It is then tested whether any of the user defined text line structures fit the resulting text blocks.

For example, if the text to be found is a date with two characters each for month, day, and year, the structure would be '2 2 2'. If the year may consist of two or four characters, the structure would be '2 2 2-4', indicating that the last character block consists of two to four characters. It is possible to provide more than one structure to match by appending an index to the parameter name, e.g. 'text_line_structure_0', 'text_line_structure_1'. If "text_line_structure" is set to an empty string '', the text to be found may have any structure.

Note that every text line structure that is found is saved as a separate text line within the text result. Hence, when calling GetTextObject, a 'line' then refers to a valid text line structure. If the whole text line containing the text line structure is to be returned instead, it is possible to set "return_whole_line" accordingly.

Default value: ''
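The structure syntax ('2 2 2', '2 2 2-4') can be illustrated with a short Python sketch. It is an assumption-laden simplification: the real operator builds text blocks from distances between segmented characters in the image, whereas this sketch works on an already recognized string and compares the whole line. All function names are hypothetical.

```python
def parse_structure(structure):
    """Turn '2 2 2-4' into [(2, 2), (2, 2), (2, 4)] (min, max block lengths)."""
    ranges = []
    for token in structure.split():
        low, _, high = token.partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def split_blocks(line, separators=""):
    """Split a line into blocks; '.', '_', '-', whitespace and the given
    user separators are treated as spaces."""
    ignored = set("._-") | set(separators)
    blocks, current = [], ""
    for ch in line:
        if ch in ignored or ch.isspace():
            if current:
                blocks.append(current)
                current = ""
        else:
            current += ch
    if current:
        blocks.append(current)
    return blocks


def matches_structure(line, structure, separators=""):
    """True if the blocks of the line fit the structure ('' accepts anything)."""
    ranges = parse_structure(structure)
    if not ranges:
        return True
    blocks = split_blocks(line, separators)
    return len(blocks) == len(ranges) and all(
        low <= len(block) <= high for block, (low, high) in zip(blocks, ranges)
    )


print(matches_structure("12.01.2023", "2 2 2-4"))
print(matches_structure("12.01.2023", "2 2 2"))
```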

"text_line_separators":A stringcontaining the list of characters which are to be ignored in the process offinding text line structures, see "text_line_structure" forfurther details. Please note, user specific separators need to be validcharacters within the used OCR classifier. For example,if ":" and "\" are to beignored, ":\\" should be passed. Please observe,that "\" escapes any special symbol to treat it as aliteral, and hence "\\" needs to be passed touse "\" as a separator.

List ofvalues: "/",":", ":\\" , "\\/:" ,...

Default value: ' '
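The escaping rule for "\" is easy to get wrong, so here is a minimal Python sketch of building and reading such a separator string. Both helpers are hypothetical. Keep in mind that Python string literals have their own backslash escaping, so the two-character value \\ is written "\\\\" in Python source.

```python
def build_separator_string(separators):
    """Join separator characters, doubling each backslash as described above."""
    return "".join("\\" + ch if ch == "\\" else ch for ch in separators)


def parse_separator_string(value):
    """Inverse operation: a backslash makes the following character literal."""
    chars, i = [], 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            i += 1  # skip the escape character, keep the next one as is
        chars.append(value[i])
        i += 1
    return chars


value = build_separator_string([":", "\\"])
print(value)                          # the raw parameter value
print(parse_separator_string(value))  # the separators it stands for
```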

"return_whole_line":"false" ifonly the segmented text line structures are to be returned as textlines. "true" if each whole text line containing a textline structure is to be returned in text lines.

List ofvalues: "true","false"

Default value: "false"
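For quick reference, the defaults listed in these notes can be collected in one Python dictionary. This is only a bookkeeping sketch. The names AUTO_MODE_DEFAULTS and build_params are hypothetical, nothing here calls HALCON, and the values are taken from the notes above rather than checked against a specific HALCON version.

```python
AUTO_MODE_DEFAULTS = {
    "min_contrast": 15,
    "polarity": "both",
    "eliminate_border_blobs": "false",
    "add_fragments": "true",
    "min_char_height": "auto",
    "max_char_height": "auto",
    "min_char_width": "auto",
    "max_char_width": "auto",
    "min_stroke_width": "auto",
    "max_stroke_width": "auto",
    "return_punctuation": "true",
    "return_separators": "true",
    "dot_print": "false",
    "dot_print_tight_char_spacing": "false",
    "dot_print_min_char_gap": "auto",
    "dot_print_max_dot_gap": "auto",
    "text_line_structure": "",
    "text_line_separators": "",
    "return_whole_line": "false",
}


def build_params(**overrides):
    """Return the defaults with overrides applied, rejecting unknown names.

    Indexed names such as text_line_structure_0 are accepted as well.
    """
    params = dict(AUTO_MODE_DEFAULTS)
    for name, value in overrides.items():
        indexed = (name.startswith("text_line_structure_")
                   and name.rsplit("_", 1)[1].isdigit())
        if name not in AUTO_MODE_DEFAULTS and not indexed:
            raise KeyError(f"unknown text model parameter: {name}")
        params[name] = value
    return params


params = build_params(polarity="dark_on_light",
                      text_line_structure_0="2 2 2",
                      text_line_structure_1="2 2 4")
print(params["polarity"], params["text_line_structure_1"])
```

Rejecting unknown names catches misspelled parameter names before they reach the actual text model.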
