Parallelization of Process of OCR for Book Digitization
Citation (SM ISO690:2012):
MAIDACENCO, Anastasia. Parallelization of Process of OCR for Book Digitization. In: StudMath-IT, Ed. 1, 23-24 noiembrie 2022, Arad. România, Arad: Editura Universităţii “Aurel Vlaicu” Arad, 2022, Ediția 1, pp. 23-24.
StudMath-IT Conference, Edition 1
Arad, Romania, 23-24 November 2022

Pages 23-24

Maidacenco Anastasia
 
"Alecu Russo" State University of Balti
 
 
Available in IBN: 26 March 2024


Abstract

Over the past few decades, technology has advanced considerably in the area of digitizing archives. What was once a challenge, storing data and facilitating its retrieval, has now become commonplace. One of the key factors behind this change is Optical Character Recognition (OCR), the process of converting printed text into a digital format. Millions of old books are stored in vaults, and their use is forbidden because of their dilapidated and fragile condition, which is why digitizing these books is so important.

The main problems are cleaning the source image of noise [1], recognizing the text in the image, and converting it to text format. The process may also include additional steps: modifying the OCR mechanism itself and correcting post-processing errors.

Neural networks are used to improve text recognition. An image already processed by a convolutional network is fed to the input of a multilayer perceptron. The training sample for this perceptron differs from the sample for the convolutional network, since the two networks process the image differently. The convolutional network is the main one and removes most of the noise in the image, while the multilayer perceptron handles what the convolutional network could not.

To speed up the process, parallelization was proposed. First, it is necessary to check how many processes can run on the device; after that, several threads can be started at different stages. The training cycle of a neural network is a sequential pass through the training pairs: for each pair, a forward move (computing the network output) and a backward move (correcting the weights and biases) are performed. These two parts of the body of the loop over training pairs can be optimized with a parallel approach [2]. As a result, on a typical neural-network training task, a speedup of up to 50% can be obtained.
The parallel approach can also be applied at the Pytesseract processing stage, where improvements can reach up to 15% [3]. It is important, however, to consider a few facts. The Gustafson-Barsis law is based on the assumption that the size of the computed task grows linearly with the number of available processes, whereas Amdahl's law assumes that the problem size is fixed. Under Amdahl's assumption, newly added processors work on parts of the task that were originally handled by fewer processes; as more and more processes are added, their full capabilities go unused, because the amount of work each can take on eventually bottoms out. If, however, the task size grows with the number of processes added, all processes can be used at the desired level, and the speedup of the computation can grow without bound. The Gustafson-Barsis law implies that we are limited only by the size of the task we can compute with the resources of the added processes. Other factors also affect the achievable acceleration [4]: one cannot simply add processes and expect every type of task to speed up. Instead, the best way to parallelize the task must be chosen in order to get the maximum performance increase from the available hardware and the best computation time for the computational problem.
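To make the contrast between the two laws concrete, here is a small worked comparison; the parallel fraction `p = 0.9` and the process count `n = 8` are illustrative values, not figures from the abstract.

```python
def amdahl_speedup(p, n):
    # Amdahl's law: fixed problem size; p is the parallelizable
    # fraction of the work, n is the number of processes.
    return 1.0 / ((1.0 - p) + p / n)

def gustafson_speedup(p, n):
    # Gustafson-Barsis law: the problem size scales with n, so the
    # serial fraction (1 - p) shrinks relative to the total work.
    return (1.0 - p) + p * n

# With 90% parallel work and 8 processes the two laws disagree sharply:
print(round(amdahl_speedup(0.9, 8), 2))     # → 4.71 (fixed-size speedup)
print(round(gustafson_speedup(0.9, 8), 2))  # → 7.3 (scaled-size speedup)
```

The fixed-size view caps the benefit of adding processes well below 8x, while the scaled-size view keeps nearly all of them productive, which is exactly the difference in assumptions discussed above.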

Keywords
OCR, neural network, parallel programming, Gustafson-Barsis law, Amdahl's law