create_ocr_class_mlp ( : : WidthCharacter, HeightCharacter, Interpolation, Features, Characters, NumHidden, Preprocessing, NumComponents, RandSeed : OCRHandle )

Create an OCR classifier using a multilayer perceptron.

create_ocr_class_mlp creates an OCR classifier that uses a multilayer perceptron (MLP). The handle of the OCR classifier is returned in OCRHandle.

For a description of how an MLP works, see create_class_mlp. create_ocr_class_mlp creates an MLP with OutputFunction = 'softmax'. The number of input variables of the MLP (NumInput in create_class_mlp) is determined from the features that are used for the OCR, which are passed in Features. The features are described below. The number of units in the hidden layer is determined by NumHidden. The number of output variables of the MLP (NumOutput in create_class_mlp) is determined from the names of the characters to be used in the OCR, which are passed in Characters. As described with create_class_mlp, the parameters Preprocessing and NumComponents can be used to specify a preprocessing of the data (i.e., the feature vectors). The OCR already approximately normalizes the features. Hence, Preprocessing can typically be set to 'none'. The parameter RandSeed has the same meaning as in create_class_mlp.
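
The effect of the 'softmax' output function can be illustrated with a minimal Python sketch. This is not HALCON code; the function name softmax and the sample activations are hypothetical and serve only to show that the outputs (one per entry in Characters) are positive and sum to 1, so they can be read as per-character confidences.

```python
import math

def softmax(activations):
    """Map raw output-layer activations to values that sum to 1."""
    # Subtract the maximum first for numerical stability.
    m = max(activations)
    exps = [math.exp(a - m) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical activations for a three-character set, e.g. ['0', '1', '2'].
confidences = softmax([2.0, 1.0, 0.1])
best_index = confidences.index(max(confidences))
```

Here best_index is the position in the character set with the largest output, which corresponds to the class a classifier of this kind would report.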

The features to be used for the classification are determined by Features. Features can contain a tuple of several feature names. Each of these feature names results in one or more features to be calculated for the classifier. Some of the feature names compute gray value features (e.g., 'pixel_invar'). Because a classifier requires a constant number of features (input variables), a character to be classified is transformed to a standard size, which is determined by WidthCharacter and HeightCharacter. The interpolation to be used for the transformation is determined by Interpolation. It has the same meaning as in affine_trans_image. The interpolation should be chosen such that no aliasing effects occur in the transformation. For most applications, Interpolation = 'constant' should be used.

The size of the transformed character should not be chosen too large, because the generalization properties of the classifier may deteriorate for large sizes. In particular, with large sizes, small segmentation errors have a large influence on the computed features if gray value features are used. This happens because segmentation errors change the smallest enclosing rectangle of the regions, so that the character is zoomed differently than the characters in the training set. In most applications, sizes between 6x8 and 10x14 should be used.
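
The influence of a segmentation error on the zoomed character can be estimated with a small Python sketch. It is a simplified model, not HALCON code: it assumes a purely linear zoom, a fixed left edge, and an enclosing rectangle that is error_px pixels too wide. The function name edge_shift and all numbers are hypothetical.

```python
def edge_shift(true_width, error_px, target_width):
    """Shift (in target pixels) of the character's true right edge when the
    enclosing rectangle is error_px pixels wider than the character."""
    # The zoom factor is computed from the (too wide) enclosing rectangle.
    scale_with_error = target_width / (true_width + error_px)
    # Without the error, the true right edge would land at target_width.
    return target_width - true_width * scale_with_error

# A 20 pixel wide character, segmented 2 pixels too wide.
for target_width in (8, 40):
    print(target_width, round(edge_shift(20, 2, target_width), 2))
# prints:
# 8 0.73
# 40 3.64
```

Under these assumptions the same 2 pixel error moves the character by less than one pixel at a standard width of 8, but by several pixels at a width of 40, so far more gray value features change.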

The parameter Features can contain the following feature names for the classification of the characters. By specifying 'default', the features 'ratio' and 'pixel_invar' are selected.

  'pixel'
      Gray values of the character
      (WidthCharacter x HeightCharacter features).

  'pixel_invar'
      Gray values of the character with maximum scaling of the gray values
      (WidthCharacter x HeightCharacter features).

  'pixel_binary'
      Region of the character as a binary image zoomed to a size of
      WidthCharacter x HeightCharacter
      (WidthCharacter x HeightCharacter features).

  'projection_horizontal'
      Horizontal projection of the gray values (see
      gray_projections, HeightCharacter features).

  'projection_horizontal_invar'
      Maximally scaled horizontal projection of the gray values
      (HeightCharacter features).

  'projection_vertical'
      Vertical projection of the gray values (see
      gray_projections, WidthCharacter features).

  'projection_vertical_invar'
      Maximally scaled vertical projection of the gray values
      (WidthCharacter features).

  'ratio'
      Aspect ratio of the character (1 feature).

  'anisometry'
      Anisometry of the character (see eccentricity, 1
      feature).

  'width'
      Width of the character before scaling the character to the
      standard size (not scale-invariant, see
      smallest_rectangle1, 1 feature).

  'height'
      Height of the character before scaling the character to the
      standard size (not scale-invariant, see
      smallest_rectangle1, 1 feature).

  'zoom_factor'
      Difference in size between the character and the values of
      WidthCharacter and HeightCharacter (not
      scale-invariant, 1 feature).

  'foreground'
      Fraction of pixels in the foreground (1 feature).

  'foreground_grid_9'
      Fraction of pixels in the foreground in a 3x3 grid within the
      smallest enclosing rectangle of the character (9 features).

  'foreground_grid_16'
      Fraction of pixels in the foreground in a 4x4 grid within the
      smallest enclosing rectangle of the character (16 features).

  'compactness'
      Compactness of the character (see compactness, 1
      feature).

  'convexity'
      Convexity of the character (see convexity, 1
      feature).

  'moments_region_2nd_invar'
      Normalized 2nd moments of the character (see
      moments_region_2nd_invar, 3 features).

  'moments_region_2nd_rel_invar'
      Normalized 2nd relative moments of the character (see
      moments_region_2nd_rel_invar, 2 features).

  'moments_region_3rd_invar'
      Normalized 3rd moments of the character (see
      moments_region_3rd_invar, 4 features).

  'moments_central'
      Normalized central moments of the character (see
      moments_region_central, 4 features).

  'moments_gray_plane'
      Normalized gray value moments and the angle of the gray value
      plane (see moments_gray_plane, 4 features).

  'phi'
      Sine and cosine of the orientation (angle) of the character (see
      elliptic_axis, 2 features).

  'num_connect'
      Number of connected components (see connect_and_holes,
      1 feature).

  'num_holes'
      Number of holes (see connect_and_holes, 1 feature).

  'cooc'
      Values of the binary cooccurrence matrix (see
      gen_cooc_matrix, 8 features).

  'num_runs'
      Number of runs in the region normalized by the area (1
      feature).

  'chord_histo'
      Frequency of the runs per row (HeightCharacter
      features).
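
The feature counts listed above determine the number of input variables of the MLP. The following Python sketch adds them up for a given Features tuple. It is an illustration only, not part of HALCON: the names FIXED, PER_PIXEL, PER_ROW, PER_COLUMN, and num_input are hypothetical, the counts are transcribed from the list above, and the feature names are assumed to be distinct.

```python
# Features whose count does not depend on the standard size.
FIXED = {
    'ratio': 1, 'anisometry': 1, 'width': 1, 'height': 1, 'zoom_factor': 1,
    'foreground': 1, 'foreground_grid_9': 9, 'foreground_grid_16': 16,
    'compactness': 1, 'convexity': 1, 'moments_region_2nd_invar': 3,
    'moments_region_2nd_rel_invar': 2, 'moments_region_3rd_invar': 4,
    'moments_central': 4, 'moments_gray_plane': 4, 'phi': 2,
    'num_connect': 1, 'num_holes': 1, 'cooc': 8, 'num_runs': 1,
}
# Features with WidthCharacter x HeightCharacter values.
PER_PIXEL = {'pixel', 'pixel_invar', 'pixel_binary'}
# Features with HeightCharacter values.
PER_ROW = {'projection_horizontal', 'projection_horizontal_invar', 'chord_histo'}
# Features with WidthCharacter values.
PER_COLUMN = {'projection_vertical', 'projection_vertical_invar'}

def num_input(features, width_character, height_character):
    """Total number of features for the given feature names."""
    total = 0
    for name in features:
        if name == 'default':
            # 'default' selects 'ratio' and 'pixel_invar'.
            total += FIXED['ratio'] + width_character * height_character
        elif name in PER_PIXEL:
            total += width_character * height_character
        elif name in PER_ROW:
            total += height_character
        elif name in PER_COLUMN:
            total += width_character
        else:
            total += FIXED[name]  # raises KeyError for unknown names
    return total
```

For example, num_input(['default'], 8, 10) evaluates to 1 + 8 * 10 = 81 for the default standard size of 8 x 10.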

After the classifier has been created, it is typically trained using trainf_ocr_class_mlp. After this, the classifier can be saved using write_ocr_class_mlp. Alternatively, the classifier can be used immediately after training to classify characters using do_ocr_single_class_mlp or do_ocr_multi_class_mlp.

HALCON provides a number of pretrained OCR classifiers (see Quick Guide, chapter 'OCR', section 'Pretrained OCR Fonts'). These pretrained OCR classifiers can be read directly with read_ocr_class_mlp and make it possible to read a wide variety of fonts without the need to train an OCR classifier. Therefore, it is recommended to check first whether one of the pretrained OCR classifiers can be used successfully. If this is the case, it is not necessary to create and train an OCR classifier.


Parameters

WidthCharacter (input_control)
integer -> integer
Width of the rectangle to which the gray values of the segmented character are zoomed.
Default value: 8
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 20
Typical range of values: 4 <= WidthCharacter <= 20

HeightCharacter (input_control)
integer -> integer
Height of the rectangle to which the gray values of the segmented character are zoomed.
Default value: 10
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 20
Typical range of values: 4 <= HeightCharacter <= 20

Interpolation (input_control)
string -> string
Interpolation mode for the zooming of the characters.
Default value: 'constant'
List of values: 'none', 'constant', 'weighted'

Features (input_control)
string(-array) -> string
Features to be used for classification.
Default value: 'default'
List of values: 'default', 'pixel', 'pixel_invar', 'pixel_binary', 'projection_horizontal', 'projection_horizontal_invar', 'projection_vertical', 'projection_vertical_invar', 'ratio', 'anisometry', 'width', 'height', 'zoom_factor', 'foreground', 'foreground_grid_9', 'foreground_grid_16', 'compactness', 'convexity', 'moments_region_2nd_invar', 'moments_region_2nd_rel_invar', 'moments_region_3rd_invar', 'moments_central', 'moments_gray_plane', 'phi', 'num_connect', 'num_holes', 'cooc', 'num_runs', 'chord_histo'

Characters (input_control)
string-array -> string
All characters of the character set to be read.
Default value: ['0','1','2','3','4','5','6','7','8','9']

NumHidden (input_control)
integer -> integer
Number of hidden units of the MLP.
Default value: 80
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150
Restriction: NumHidden >= 1

Preprocessing (input_control)
string -> string
Type of preprocessing used to transform the feature vectors.
Default value: 'none'
List of values: 'none', 'normalization', 'principal_components', 'canonical_variates'

NumComponents (input_control)
integer -> integer
Preprocessing parameter: Number of transformed features (ignored for Preprocessing = 'none' and Preprocessing = 'normalization').
Default value: 10
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100
Restriction: NumComponents >= 1

RandSeed (input_control)
integer -> integer
Seed value of the random number generator that is used to initialize the MLP with random values.
Default value: 42

OCRHandle (output_control)
ocr_mlp -> integer
Handle of the OCR classifier.


Example
read_image (Image, 'letters')
* Segment the image.
bin_threshold (Image, Region)
dilation_circle (Region, RegionDilation, 3.5)
connection (RegionDilation, ConnectedRegions)
intersection (ConnectedRegions, Region, RegionIntersection)
sort_region (RegionIntersection, Characters, 'character', 'true', 'row')
* Generate the training file.
count_obj (Characters, Number)
Classes := []
for J := 0 to 25 by 1
    Classes := [Classes,gen_tuple_const(20,chr(ord('a')+J))]
endfor
Classes := [Classes,gen_tuple_const(20,'.')]
write_ocr_trainf (Characters, Image, Classes, 'letters.trf')
* Generate and train the classifier.
read_ocr_trainf_names ('letters.trf', CharacterNames, CharacterCount)
create_ocr_class_mlp (8, 10, 'constant', 'default', CharacterNames, 20,
                      'none', 81, 42, OCRHandle)
trainf_ocr_class_mlp (OCRHandle, 'letters.trf', 100, 0.01, 0.01, Error,
                      ErrorLog)
* Re-classify the characters in the image.
do_ocr_multi_class_mlp (Characters, Image, OCRHandle, Class, Confidence)
clear_ocr_class_mlp (OCRHandle)

Result

If the parameters are valid, the operator create_ocr_class_mlp returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.


Parallelization Information

create_ocr_class_mlp is processed completely exclusively without parallelization.


Possible Successors

trainf_ocr_class_mlp


Alternatives

create_ocr_class_box


See also

do_ocr_single_class_mlp, do_ocr_multi_class_mlp, clear_ocr_class_mlp, create_class_mlp, train_class_mlp, classify_class_mlp


Module

OCR/OCV



Copyright © 1996-2008 MVTec Software GmbH