Handwriting recognition continues to pose interesting challenges for businesses and the scientific community. Recently, Sergey Polovinkin, a Computer Vision Engineer at IDR in Ukraine, gave a tech talk on handwriting recognition to data scientists there. He shared some of the details with us.
Parascript: Would you give us a few of the highlights from your recent Tech Talk on Handwriting Recognition to data scientists in Ukraine?
Sergey Polovinkin: My talk was divided into two parts. The first part covered existing solutions for data extraction and how FX Capture differs from other market players. I talked about the various projects I have faced and how they were solved.
The second part of the talk was more specialized and focused on one big and very interesting project (we are talking about 60 million forms). The challenge is that these documents have handwriting written on top of printed text. As a result, it is impossible to recognize either the printed or the handwritten text directly; the two must first be separated from each other and only then recognized individually. I used the OpenCV computer vision library and machine learning to move beyond standard means of separation. In this part of my talk, I focused on the features for the classifier: Gabor filters, printed-text zones, Euclidean pixel-to-background color distance, and tone range. My separation rate is currently around 90%, which is not enough, so I continue to work on this (now using neural networks).
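To make the feature list above concrete, here is a minimal sketch of how such per-component features might be computed with OpenCV and scikit-learn. It is illustrative only, not Polovinkin's actual pipeline: the function names, kernel parameters, and classifier choice are assumptions, and the printed-text-zone features he mentions are omitted.

    # Illustrative only: hypothetical per-component features for separating
    # machine print from handwriting, loosely following the features named
    # above (Gabor responses, distance to background color, tone range).
    import cv2
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def gabor_bank(ksize=21, sigma=4.0, lambd=10.0, gamma=0.5):
        # Gabor kernels at 8 orientations; machine print tends to respond
        # strongly at a few fixed orientations, handwriting more evenly.
        return [cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma)
                for theta in np.arange(0, np.pi, np.pi / 8)]

    def component_features(bgr, gray, mask, kernels):
        # Feature vector for one connected component; mask is its binary mask.
        feats = [float(np.abs(cv2.filter2D(gray, cv2.CV_32F, k))[mask > 0].mean())
                 for k in kernels]                      # Gabor energy per orientation
        bg = np.median(bgr[mask == 0], axis=0)          # median background color (BGR)
        ink = bgr[mask > 0].astype(np.float32)
        feats.append(float(np.linalg.norm(ink - bg, axis=1).mean()))  # color distance to background
        tones = gray[mask > 0]
        feats.append(float(tones.max()) - float(tones.min()))         # tone range inside component
        return np.array(feats, dtype=np.float32)

    # Usage sketch: with labeled component masks (1 = handwriting, 0 = print),
    # train a classifier and apply it to unseen components.
    # X = np.vstack([component_features(bgr, gray, m, gabor_bank()) for m in masks])
    # clf = RandomForestClassifier(n_estimators=200).fit(X, labels)

A classical classifier over such features is one plausible baseline; as noted above, the remaining errors are what motivate moving to neural networks.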
Parascript: Tell us a little bit about yourself and your background. How did you become interested in handwriting recognition?
SP: I am a programmer and began working on handwriting recognition about 7 years ago. I worked on a project for the remote collection of information from retail outlets for one large client in Kharkov. Sales information had to be read from cash registers and transmitted remotely to the office for further analysis. The problem they faced was that the cash registers were not aware of the current balances at the outlets, so about 3,000 inventory sheets were sent in from the outlets each week. Processing them took 4 working days and about 6-8 employees. This work had to be automated, and I was selected to do it. At first I wrote something myself and hoped to solve the problem.
However, this topic is very complex even for a team of programmers and requires many person-years of effort. I then began to test the available solutions in detail. It's easy to guess that the Parascript solution won the test. (I also tested Abbyy and an Israeli company.)
Parascript: What are some of the biggest challenges that you’ve seen in handwriting recognition?
SP: It is difficult to single out one particular problem. There are plenty of them, and they are always different from project to project. Sometimes it's segmentation, sometimes garbage removal, sometimes poor recognition accuracy on certain handwritten documents. A separate class of problems is recognizing natural handwritten text without a dictionary, or with a dictionary containing a very large number of possible answers (surnames, email addresses, passwords).
Parascript: What do you see as the future of data capture in the next 3-5 years and over the next ten years?
SP: This question can be answered in two ways. The tasks of machine learning are increasingly shifting from the scientific community to engineering. There are now tools and libraries available for analyzing large datasets (sklearn, tensorflow, keras), more people are interested in the topic, and data scientists are very much in demand. Therefore, mini-revolutions will happen quite often, and this will affect data capture.
However, humankind does not yet know how the brain really works or how to imitate it. We are still at the foot of this knowledge mountain. So the super-AI revolution may have to wait a while longer.
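As a rough illustration of how low the barrier to entry has become, a workable handwritten-digit classifier now fits in a few lines of scikit-learn. This is a toy example on the library's bundled digits dataset, not a data capture system, and is not drawn from the interview.

    # Toy example only: scikit-learn's small bundled digits dataset, showing
    # how little code a basic handwritten-digit classifier requires today.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = SVC(gamma=0.001).fit(X_train, y_train)   # simple RBF-kernel SVM
    print(f"test accuracy: {model.score(X_test, y_test):.3f}")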
Reading Handwriting and Recognizing Faces
I often quote the words of Christopher Frith from Making Up the Mind: How the Brain Creates Our Mental World: "In 1956, the science of creating devices capable of doing various ingenious things was called, 'artificial intelligence.'" Scientific research into AI, like any other field, assumed it was necessary to begin with the easiest problems. Perception of the surrounding world seemed a relatively easy matter: most people can effortlessly read handwritten text and recognize faces. At first, it seemed that creating a machine capable of reading handwriting and recognizing faces should also be fairly straightforward.
The game of chess, by contrast, was considered very difficult; very few people can play at the grandmaster level. Initially, scientists thought it was better to postpone creating machines that could play chess. Fifty years later, a computer designed to play chess defeated the world champion, while the problem of teaching AI perception, such as reading handwriting and recognizing faces, has proven very difficult.
###
If you found this article interesting, you might find the following eBook helpful: AI Data Automation: Advances in Handwriting Recognition.