Tesseract library

Overview

You can extract text strings or tables from target images using the Tesseract OCR engine provided by the Tesseract library.

The features provided by the Tesseract library are as follows:

Use Designer or File Explorer to refer to the samples of the Tesseract library.

• On Designer: > Help > Sample > Sample > Tesseract

• On File Explorer: C:\Users\user\AppData\Roaming\Brity RPA Designer\samples\Tesseract

Installing the Tesseract library

For Tesseract library installation, please refer to Installing the Brity RPA Add-In Library.

Close both Brity RPA Designer and Bot before installation.

Common Properties

IMAGE

Card properties

Properties | Type | Required | Unit | Auto-setting | Description
Image | Image | Y | - | Y | Shows the captured image.
Bounds | String | N | - | Y | Shows the position and size of the target UI object specified by the user, as coordinates relative to the screen, in pixels (X: horizontal, Y: vertical, W: width, H: height). Example: '0,0,100,100'. The coordinates can also be specified with a variable.
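As a rough illustration of the Bounds format, the Python sketch below parses an 'X,Y,W,H' string into named fields. The `Bounds` class and the `parse_bounds` helper are hypothetical names used only for this example; they are not part of the Tesseract library.

```python
from typing import NamedTuple


class Bounds(NamedTuple):
    """Hypothetical container for a Bounds value (all units in pixels)."""
    x: int  # horizontal position
    y: int  # vertical position
    w: int  # width
    h: int  # height


def parse_bounds(text: str) -> Bounds:
    """Parse a string such as '0,0,100,100' into a Bounds tuple."""
    parts = [int(p.strip()) for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    return Bounds(*parts)


print(parse_bounds("0,0,100,100"))
```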

OCR

Card properties

Properties | Type | Required | Unit | Auto-setting | Description
Output | Variable | Y | - | Y | Specify the variable to store the result of the OCR operation.
Language | Combination box | Y | - | Y | Specify the language of the document to read. All languages included in the target document must be selected.
FilePath | String | Y | - | Y | Specify the target file path for the OCR operation. Only the ".png", ".jpg", and ".bmp" file formats can be used.
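Because only three file formats are accepted, it can help to check a path before passing it to FilePath. The following Python sketch shows one way to do that; `is_supported_image` is a hypothetical helper, and the case-insensitive comparison is an assumption of this example rather than documented library behavior.

```python
from pathlib import Path

# File formats listed for the FilePath property.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".bmp"}


def is_supported_image(file_path: str) -> bool:
    """Return True if the path ends with one of the supported extensions.

    The comparison ignores case; whether the library itself does so
    is an assumption of this sketch.
    """
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_image("invoice.png"))
print(is_supported_image("scan.tiff"))
```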

MORE OPTIONS

Card properties

Properties | Type | Required | Unit | Auto-setting | Description
OCR types | Combination box | Y | - | Y | Specify the OCR operation type. Horizontal: read from left to right. Vertical: read from top to bottom. Single Line: read the entire image as a single line.
GetRawData | Toggle button | Y | - | Y | Specify whether to return the OCR result in XML format.
Draw Bounds | Toggle button | N | - | Y | Specify whether to draw a border around the target area.
On Error | Combination box | N | - | N | Specify the action to carry out when an error occurs during execution. If not specified: output the error and exit the task. --Ignore--: ignore the error. --Retry--: try the activity one more time. --GoTo--: try the scenario for the specified time if the activity fails. _Event: select an event created within the project.

DESCRIPTION

Card properties

Properties | Type | Required | Unit | Auto-setting | Description
DESCRIPTION | Text | N | - | N | Enter the description for the activity card. The text entered in the DESCRIPTION field is used as the description of the activity. If the field is left empty, a representative value is displayed.

When the GetRawData property is set to "True", the XML structure is as follows:

GetImageText

Overview

This activity card converts the characters in an image into text.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the range of the image to extract the text.

  3. In the [Output] Output field, enter the variable to store the text extracted from the specified boundary of the image.

  4. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | String | - | Converts the characters in the image to text and returns it. | 'Google'

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

GetImageTextInfo

Overview

This activity card recognizes the text on a specified image and returns it with the image information, such as the width and height.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the range of the image to extract the text.

  3. In the [Output] Output field, enter the variable to store the image information fetched from the specified boundary of the image.

  4. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | Object or String | - | Recognizes characters in an image and returns image information (height, width, etc.) and text. | 'Bounds(0,0,270,89)-Text(Google)-WC(0.96)'

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

The text information output of the sample file


Bounds: Boundary area of the text on the screen

WC: Word confidence (recognition accuracy of the word)

SP: Space (width of the space, in pixels)

• Result information

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].GetTextBlock() -> Gmail

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].hPos -> 1678

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].vPos -> 25

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].width -> 32

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].height -> 10


Bounds(1678,25,32,10)-Text(Gmail)-WC(0.94)

Bounds(1727,26,36,12)-Text(Olga))-WC(0.9)

Bounds(824,198,270,89)-Text(Google)-WC(0.96)

Bounds(845,406,44,13)-Text(Google)-WC(0.96)-SP(19)

Bounds(908,404,12,13)-Text(B44)-WC(0.22)

Bounds(968,406,17,10)-Text(I'm)-WC(0.74)-SP(5)

Bounds(990,406,45,13)-Text(Feeling)-WC(0.95)-SP(5)

Bounds(1040,406,35,13)-Text(Lucky)-WC(0.96)

Bounds(32,891,57,14)-Text(cystels)-WC(0.0)

Bounds(31,935,26,13)-Text(aa)-WC(0.66)

Bounds(87,935,53,13)-Text(HIZUx)-WC(0.0)

Bounds(169,937,44,13)-Text(Google)-WC(0.96)-SP(5)

Bounds(218,935,26,13)-Text(3)-WC(0.29)

Bounds(273,935,36,13)-Text(Bxe|)-WC(0.0)-SP(10)

Bounds(319,935,25,13)-Text(aay)-WC(0.25)

Bounds(1669,935,109,13)-Text(MASA)-WC(0.41)-SP(-38)

Bounds(1740,930,9,25)-Text(Al)-WC(0.42)-SP(5)

Bounds(1754,930,28,25)-Text(etal)-WC(0.24)

Bounds(1808,935,27,13)-Text(oat)-WC(0.29)

Bounds(1863,935,25,13)-Text(aa)-WC(0.47)
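The lines above follow a regular pattern, so they can be split into fields with a regular expression. The Python sketch below assumes the output is available as plain text lines in exactly this format; the `Word` tuple, the `parse_line` helper, and the 0.9 confidence threshold are illustrative choices, not part of the library.

```python
import re
from typing import NamedTuple, Optional

# Matches e.g. "Bounds(845,406,44,13)-Text(Google)-WC(0.96)-SP(19)".
# The SP part is optional, and the text itself may contain parentheses.
LINE_PATTERN = re.compile(
    r"^Bounds\((\d+),(\d+),(\d+),(\d+)\)"
    r"-Text\((.*)\)"
    r"-WC\(([\d.]+)\)"
    r"(?:-SP\((-?\d+)\))?$"
)


class Word(NamedTuple):
    x: int
    y: int
    width: int
    height: int
    text: str
    confidence: float
    space: Optional[int]  # None when the line has no SP part


def parse_line(line: str) -> Word:
    """Split one text-information line into its fields."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    x, y, w, h, text, wc, sp = match.groups()
    return Word(int(x), int(y), int(w), int(h), text, float(wc),
                int(sp) if sp is not None else None)


sample = [
    "Bounds(1678,25,32,10)-Text(Gmail)-WC(0.94)",
    "Bounds(845,406,44,13)-Text(Google)-WC(0.96)-SP(19)",
    "Bounds(908,404,12,13)-Text(B44)-WC(0.22)",
]
words = [parse_line(line) for line in sample]

# Keep only words recognized with high confidence (threshold is arbitrary).
confident = [w.text for w in words if w.confidence >= 0.9]
print(confident)  # ['Gmail', 'Google']
```

Filtering on WC in this way is one option for discarding misread fragments such as 'B44' before the text is used downstream.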

GetImageTable

Overview

This activity card recognizes a table on a specified image and fetches the text in the table.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the range of the image to extract the text.

  3. In the [Output] Output field, enter the variable to store the data fetched from the table boundary of the image.

  4. If one cell contains two or more lines of text, enter a list of rectangular areas detected through the Hough transform into the Rect List. Text extracted from within the same rectangle is merged into one cell.

  5. In the Properties window, specify other required properties.

Card output properties

Card properties

Property | Type | Additional comments | Description
Output | Object | Triple-nested list | Recognizes a table in an image and returns the text information as a triple-nested list (Table List -> Row List -> Column List). For details, see the Example of utilization below.

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
N | Rect List | Variables | N | - | N | List of rectangular areas detected through the Hough transform.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

The table information output of the sample file


Bounds: Boundary area of the text on the screen

WC: Word confidence (recognition accuracy of the word)

SP: Space (width of the space, in pixels)


• Result information

this.readTables.TableList(0)(0)(0) -> Company

this.readTables.TableList(0)(0)(1) -> Contact

this.readTables.TableList(0)(1)(0) -> Alfreds Futterkiste

this.readTables.TableList(0)(2)(2) -> Mexico

0: Company | Contact | Country

1: Alfreds Futterkiste | Maria Anders | Germany

2: Centro comercial Moctezuma | Francisco Chang | Mexico

3: Emst Handel | Roland Mendel | Austria

4: Island Trading | Helen Bennett | UK

5: Laughing Bacchus Winecellars | Yoshi Tannamuri | Canada

6: Magazzini Alimentari Riuniti | Giovanni Rovelli | Italy
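The Table List -> Row List -> Column List nesting can be pictured as a list of tables, each holding a list of rows, each holding a list of cell strings. The Python sketch below mirrors part of the sample output with a plain nested list; the variable names are illustrative, and it uses Python's square-bracket indexing rather than the `TableList(0)(0)(0)` notation shown above.

```python
# Table List -> Row List -> Column List, using part of the sample output.
table_list = [
    [
        ["Company", "Contact", "Country"],
        ["Alfreds Futterkiste", "Maria Anders", "Germany"],
        ["Centro comercial Moctezuma", "Francisco Chang", "Mexico"],
    ]
]

# First table, third row, third column.
print(table_list[0][2][2])  # Mexico

# Treat the first row as a header and collect one column by name.
header, *rows = table_list[0]
country_index = header.index("Country")
countries = [row[country_index] for row in rows]
print(countries)  # ['Germany', 'Mexico']
```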

GetTextOnFile

Overview

This activity card recognizes text on a specific image file through OCR and converts the data into text strings.

Application procedures

  1. In the [Output] Output field, enter the variable to store the text fetched through OCR.

  2. In the FilePath field, enter the path and name of the file to fetch the text from.

  3. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | String | - | Returns the characters in an image file as text. | 'Google'

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

GetTextInfoOnFile

Overview

This activity card returns text data and image information, such as the position, width, and height, from a specific image file through OCR.

Application procedures

  1. In the [Output] Output field, enter the variable to store the text data and image information fetched through OCR.

  2. In the FilePath field, enter the path and name of the file to fetch the text and image information from.

  3. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | Object or String | - | Recognizes characters in an image and returns image information (height, width, etc.) and text. | 'Bounds(0,0,270,89)-Text(Google)-WC(0.96)'

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

The text information output of the sample file


Bounds: Boundary area of the text on the screen

WC: Word confidence (recognition accuracy of the word)

SP: Space (width of the space, in pixels)


• Result information

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].GetTextBlock() -> INVOICE

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].hPos -> 509

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].vPos -> 56

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].width -> 237

RESULT.PrintSpace.ComposedBlockList[0].TextBlockList[0].height -> 39


Bounds(509,56,237,39)-Text(INVOICE)-WC(0.96)

Bounds(62,185,52,21)-Text(Date)-WC(0.93)-SP(10)

Bounds(124,192,4,14)-Text(:)-WC(0.78)-SP(10)

Bounds(138,185,127,21)-Text(2019.12.03)-WC(0.85)

Bounds(917,185,129,21)-Text(SAMSUNG)-WC(0.96)-SP(9)

Bounds(1055,185,51,21)-Text(SDS)-WC(0.95)-SP(9)

Bounds(1115,186,75,20)-Text(CORP.)-WC(0.96)

Bounds(62,216,95,26)-Text(Request)-WC(0.96)-SP(8)

Bounds(165,216,91,21)-Text(Number)-WC(0.93)-SP(8)

Bounds(264,223,4,13)-Text(:)-WC(0.92)-SP(9)

Bounds(277,216,99,20)-Text(#100000)-WC(0.96)

Bounds(148,338,66,22)-Text(DATE)-WC(0.95)

Bounds(440,338,60,22)-Text(ITEM)-WC(0.94)

Bounds(659,338,113,22)-Text(AMOUNT)-WC(0.96)

Bounds(917,338,175,22)-Text(DESCRIPTION)-WC(0.96)

Bounds(118,412,126,20)-Text(2019.11.29)-WC(0.95)

Bounds(429,412,84,21)-Text(TEMA)-WC(0.11)

Bounds(703,412,27,22)-Text(10)-WC(0.96)

Bounds(958,412,72,22)-Text(DESC)-WC(0.79)-SP(8)

Bounds(1038,412,9,21)-Text(1)-WC(0.79)

Bounds(118,485,126,20)-Text(2019.11.30)-WC(0.96)

Bounds(428,486,84,20)-Text(ITEMB)-WC(0.90)

Bounds(701,491,30,19)-Text(20)-WC(0.96)

Bounds(958,485,92,21)-Text(DESC)-WC(0.86)-SP(-13)

Bounds(1037,481,16,37)-Text(2)-WC(0.86)

Bounds(118,559,124,21)-Text(2019.12.01)-WC(0.96)

Bounds(427,559,87,21)-Text(ITEMC)-WC(0.67)

Bounds(958,559,72,21)-Text(DESC)-WC(0.41)-SP(7)

Bounds(1037,559,13,21)-Text(3)-WC(0.41)

Bounds(700,567,29,18)-Text(30)-WC(0.96)

Bounds(117,632,128,20)-Text(2019.12.02)-WC(0.96)

Bounds(427,632,87,22)-Text(ITEMD)-WC(0.67)

Bounds(700,635,31,18)-Text(40)-WC(0.96)

Bounds(958,632,72,22)-Text(DESC)-WC(0.82)-SP(6)

Bounds(1036,632,14,22)-Text(4)-WC(0.82)

GetTableOnFile

Overview

This activity card recognizes tables in a specific image file through OCR and returns the data as a two-dimensional array.

Application procedures

  1. In the [Output] Output field, enter the variable to store the text extracted from the table in the file.

  2. In the FilePath field, enter the path and name of the file to fetch the table text from.

  3. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | 2D Array | String | Recognizes a table in an image and returns the text information as a two-dimensional array. | See the rows below.

0: A | B | C

1: 1 | 2 | 3

2: 4 | 5 | 6
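Once the table is available as a two-dimensional array of strings, it can be passed on to other tools. The Python sketch below is a minimal example that serializes the sample rows to CSV text with the standard `csv` module; the `to_csv` helper is a hypothetical name introduced for this illustration.

```python
import csv
import io
from typing import List

# The example output above as a two-dimensional array of strings.
table = [
    ["A", "B", "C"],
    ["1", "2", "3"],
    ["4", "5", "6"],
]


def to_csv(rows: List[List[str]]) -> str:
    """Serialize a two-dimensional array of strings to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


print(to_csv(table))
```

Using the `csv` module rather than joining with commas by hand keeps cells that contain commas or quotes correctly escaped.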

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

TextClick

Overview

This activity card finds a specific text from an image and clicks it.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the range of the target image to click.

  3. In the [Output] Output field, enter the variable to store the execution result.

  4. In the OCR properties, enter the scale of the original image size and the text to search for.

  5. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | String | - | The location and size (horizontal, vertical, width, height) of the target UI object retrieved from the Tesseract engine. | 7,11,47,14
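The output string can be turned into a point, for example to verify where a click would land. The Python sketch below computes the center of the rectangle; that the click targets the center is an assumption of this example, since the document does not state the exact click position, and `center_of` is a hypothetical helper.

```python
from typing import Tuple


def center_of(bounds: str) -> Tuple[float, float]:
    """Return the center point of an 'x,y,width,height' string."""
    x, y, w, h = (int(value) for value in bounds.split(","))
    return (x + w / 2, y + h / 2)


print(center_of("7,11,47,14"))  # (30.5, 18.0)
```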

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
N | Target Text | String | Y | - | - | Specify the text to search for and click.
N | Index | Number | N | - | - | Enter the number of the instance to select if the same text string appears multiple times.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.
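The interplay of Target Text and Index can be sketched as follows: collect every recognized word that equals the target, then pick one by position. This Python example is only an illustration; it assumes a zero-based index and exact matching, neither of which is confirmed by this document, and `find_target` is a hypothetical helper. The bounds values are taken from the GetImageTextInfo sample output.

```python
from typing import List, Optional, Tuple

# (text, bounds) pairs as an OCR pass might report them.
OcrWord = Tuple[str, str]


def find_target(words: List[OcrWord], target_text: str,
                index: int = 0) -> Optional[str]:
    """Return the bounds of the index-th word equal to target_text.

    Assumes a zero-based index and exact matching; the library's actual
    matching rules are not described in this document.
    """
    matches = [bounds for text, bounds in words if text == target_text]
    if 0 <= index < len(matches):
        return matches[index]
    return None


words = [
    ("Google", "824,198,270,89"),
    ("Google", "845,406,44,13"),
    ("Lucky", "1040,406,35,13"),
    ("Google", "169,937,44,13"),
]
print(find_target(words, "Google", 1))  # 845,406,44,13
```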

Example of utilization

Sample file

TextDoubleClick

Overview

This activity card finds a specific text from an image and double-clicks it.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the range of the target image to double-click.

  3. In the [Output] Output field, enter the variable to store the execution result.

  4. In the OCR properties, enter the scale of the original image size and the text to search for.

  5. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | String | - | The location and size (horizontal, vertical, width, height) of the target UI object retrieved from the Tesseract engine. | 7,11,47,14

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
N | Target Text | String | Y | - | - | Specify the text to search for and double-click.
N | Index | Number | N | - | - | Enter the number of the instance to select if the same text string appears multiple times.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

TextRightClick

Overview

This activity card finds a specific text from an image and right-clicks it.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the range of the target image to right-click.

  3. In the [Output] Output field, enter the variable to store the execution result.

  4. In the OCR properties, enter the scale of the original image size and the text to search for.

  5. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | String | - | The location and size (horizontal, vertical, width, height) of the target UI object retrieved from the Tesseract engine. | 7,11,47,14

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
N | Target Text | String | Y | - | - | Specify the text to search for and right-click.
N | Index | Number | N | - | - | Enter the number of the instance to select if the same text string appears multiple times.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

TextHover

Overview

This activity card finds a specific text from an image and hovers the mouse pointer over it.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the range of the target object over which to hover the mouse pointer.

  3. In the [Output] Output field, enter the variable to store the execution result.

  4. In the OCR properties, enter the scale of the original image size and the text to search for.

  5. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | String | - | The location and size (horizontal, vertical, width, height) of the target UI object retrieved from the Tesseract engine. | 7,11,47,14

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
N | Target Text | String | Y | - | - | Specify the text to search for and hover the mouse pointer over.
N | Index | Number | N | - | - | Enter the number of the instance to select if the same text string appears multiple times.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

GetTextBound

Overview

This activity card finds a specific text from an image and fetches the coordinates data of the relevant area.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the area of the image to fetch the coordinates data.

  3. In the [Output] Output field, enter the variable to store the execution result.

  4. In the OCR properties, enter the scale of the original image size and the text to search for.

  5. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | String | - | The location and size (horizontal, vertical, width, height) of the target UI object retrieved from the Tesseract engine. | 7,11,47,14

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
N | Target Text | String | Y | - | - | Specify the text to search for and fetch the area of.
N | Index | Number | N | - | - | Enter the number of the instance to select if the same text string appears multiple times.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

WaitTextAppear

Overview

This activity card waits until the specified text appears on an image.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the boundary for the image to search for.

  3. In the [Output] Output field, enter the variable to store the search result of the specified text.

  4. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | Boolean | - | Returns whether the specified text appeared: true if the text appears, false otherwise. | true
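Waiting for text to appear can be thought of as a polling loop: read the screen text repeatedly until the target shows up or time runs out. The Python sketch below illustrates that idea only. The `wait_for_text` function, the `read_text` callable standing in for an OCR pass, and the `timeout` and `interval` parameters are all hypothetical; this document does not describe how the activity card is implemented or which timing options it has.

```python
import time
from typing import Callable


def wait_for_text(read_text: Callable[[], str], target: str,
                  timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll read_text() until target appears or timeout (seconds) elapses.

    Returns True if the text appeared, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if target in read_text():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo with a stand-in reader that "shows" the text on its third call.
_screens = iter(["Loading", "Loading", "Done"])
print(wait_for_text(lambda: next(_screens), "Done", timeout=2.0, interval=0.01))
```

Waiting for text to disappear would follow the same pattern with the condition inverted.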

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
N | Target Text | String | Y | - | - | Enter the text to wait for.
N | Index | Number | N | - | - | Enter the number of the instance to select if the same text string appears multiple times.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file

WaitTextDisappear

Overview

This activity card waits until the specified text disappears from an image.

Application procedures

  1. Double-click No Target on the activity card.

  2. Specify the boundary for the image to search for.

  3. In the [Output] Output field, enter the variable to store the disappearance result of the specified text.

  4. In the Properties window, specify other required properties.

Card output properties

Property | Type | Additional comments | Description | Example
Output | Boolean | - | Returns whether the specified text disappeared: true if the text disappears, false otherwise. | true

Card input properties

Card properties

Common | Properties | Type | Required | Unit | Auto-setting | Description
Y | IMAGE | - | - | - | - | Common property of the Tesseract library.
Y | OCR | - | - | - | - | Common property of the Tesseract library.
N | Target Text | String | Y | - | - | Specify the text to wait for.
N | Index | Number | N | - | - | Enter the number of the instance to select if the same text string appears multiple times.
Y | MORE OPTIONS | - | - | - | - | Common property of the Tesseract library.
Y | DESCRIPTION | - | - | - | - | Common property of the Tesseract library.

Example of utilization

Sample file