Artificial Intelligence (AI) is leveraged across the world today, and some of the most cutting-edge AI research is led by Google and Facebook, which have access to large-scale data sets, the processing power, and some of the best AI researchers to set new benchmarks in recognition tasks. That was the case until recently, when Elon Musk and his team launched OpenAI, which makes its AI research open source, and Diego Oppenheimer started Algorithmia, which provides practical algorithms for businesses, both in an effort to make AI advancements available to a larger group of businesses. In fact, many companies, Parascript included, leverage AI across many fields, ranging from image recognition and data extraction to security. One of the most promising frontiers of AI, and key to Parascript’s product roadmap, is the set of deep learning algorithms currently under development.
Deep Learning and How We Think
Deep learning software conceptually mimics how humans think in that it is focused on thinking hierarchically and contextually. One distinction of deep learning software is that it creates a layered description of input data. Groups of neurons focus on different fragments of an image, starting from small local fragments, and each layer of neurons combines the outputs of previous layers into increasingly global and abstract descriptions. Thanks to the processing power of modern GPUs, deep learning is able to process very large amounts of data. It creates and refines features based on millions and millions of sample images and on context. The same image is analyzed multiple times, and each time more is learned about it. This isn’t so different from how a human learns. When a 14-year-old reads a classic book, she picks up on certain themes and understands it within the context of her fourteen years of experience. When she returns to the same book at 20 and then at 30 years old, she is likely to find new layers of understanding because her experiences have broadened; the context has changed. Using both experience and background allows deep learning systems to become more accurate and nuanced over time.
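To make the idea of layered descriptions concrete, here is a minimal sketch in PyTorch (a framework chosen here purely for illustration; the article does not name one). Each convolutional layer looks at small local patches of the image, and stacking layers lets later ones combine earlier outputs into progressively more global, abstract features. The layer sizes and the class count are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# A tiny hierarchical feature extractor: early layers see small local
# fragments of the image; later layers combine those outputs into more
# global, abstract descriptions.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # layer 1: local edges/strokes
    nn.ReLU(),
    nn.MaxPool2d(2),                               # downsample: wider effective view
    nn.Conv2d(16, 32, kernel_size=3, padding=1),   # layer 2: combinations of strokes
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),   # layer 3: near-global shapes
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 10),                             # 10 output classes, chosen arbitrarily
)

image = torch.randn(1, 1, 64, 64)                  # one grayscale 64x64 image
scores = model(image)                              # per-class scores
print(scores.shape)                                # torch.Size([1, 10])
```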
Parascript uses both traditional shallow neural networks and modern deep learning. We apply deep learning algorithms to the location and classification of objects. For example, if there is a squirrel in the tree outside your window, you can generally identify it as a squirrel at any angle, in motion, even with only a partial view of its tail. Deep learning provides the ability to classify and locate an object in every type of context. Classification and location are the two interconnected tasks that drive Parascript. Parascript has applied deep learning to challenging tasks such as address recognition.
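One common way to handle those two interconnected tasks is a network with a shared backbone and two heads: one predicts a class label ("what is it?") and the other predicts a bounding box ("where is it?"). The sketch below is a generic, hedged illustration of that pattern, not Parascript’s actual architecture; the backbone, dimensions and class count are assumptions.

```python
import torch
import torch.nn as nn

class ClassifyAndLocate(nn.Module):
    """Toy model with a shared backbone and two heads:
    a classifier for the object's class and a locator for its bounding box."""

    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(64, num_classes)  # class scores
        self.locator = nn.Linear(64, 4)               # box: x, y, width, height

    def forward(self, x):
        features = self.backbone(x)
        return self.classifier(features), self.locator(features)

model = ClassifyAndLocate()
scores, box = model(torch.randn(1, 3, 128, 128))
print(scores.shape, box.shape)  # torch.Size([1, 5]) torch.Size([1, 4])
```

In practice the two heads are trained jointly, with a classification loss on the scores and a regression loss on the box, which is what makes the tasks "interconnected": the shared features must serve both.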
Deep Learning Applied to Address Recognition
Consider the task of reading an address on a parcel. Packages come from different countries, are made of different materials, have all kinds of shapes, and are often smashed, jammed or dirty, with lots of stickers and inscriptions on their labels. Finding the side that carries the destination address, locating that address among all kinds of other stickers, inscriptions, background noise and distortions, and finally reading the address, which is not only written in various handwritings but may use unusual abbreviations, have the wrong structure, be incomplete or contain errors, is therefore a genuinely challenging task.
Early on, we applied neural networks to handwriting recognition and location—in particular, addresses. Deep learning algorithms allow the software to recognize addresses—even handwritten addresses—on average far better than humans. Deep learning adds accuracy to rote tasks. On challenges such as address recognition, we are far ahead of anyone else, and deep learning algorithms promise even better results.
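As a rough illustration of the multi-step nature of the task described above, here is a hypothetical two-stage skeleton: locate the destination address block first, then hand the cropped region to a text recognizer. The function names and the untrained stand-in model are placeholders for illustration, not Parascript’s API or architecture.

```python
import torch
import torch.nn as nn

# Hypothetical two-stage pipeline for parcel address reading:
#   1) a locator predicts where the destination address block is,
#   2) the cropped region is handed to a text recognizer.
# The locator below is an untrained stand-in used purely to show the flow.

locator = nn.Sequential(                     # predicts a normalized (x, y, w, h) box
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4), nn.Sigmoid(),
)

def crop_address_block(image: torch.Tensor, box: torch.Tensor) -> torch.Tensor:
    """Cut the predicted address region out of the parcel image."""
    _, _, h, w = image.shape
    x, y, bw, bh = (box * torch.tensor([w, h, w, h])).int().tolist()
    return image[:, :, y:y + max(bh, 1), x:x + max(bw, 1)]

parcel = torch.rand(1, 1, 256, 256)          # stand-in for a scanned parcel side
box = locator(parcel)[0]                     # normalized box coordinates
address_crop = crop_address_block(parcel, box)
print(address_crop.shape)                    # region passed on to a text recognizer
```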
Extracting Data from Documents
In extracting data from documents such as receipts, invoices, claims, mortgage applications, and other far more complex images, we are essentially solving for accuracy. These tasks require intensive work and learning by our software so that we can ensure the highest accuracy, improved recognition and reliable classification. Such tasks deal with ongoing changes and dynamic real-world situations in which new images and documents are introduced and existing ones change.
Deep learning algorithms allow us to work reliably with altered or poor images, less information, and more varied data. For example, in the past it was critical that an image of a check be captured on a solid background with good lighting and at high resolution for accurate results. With deep learning, this level of initial care becomes much less relevant. There is greater tolerance because deep networks learn to create adequate descriptions of images during training on very large numbers of samples, so the system extracts the right data correctly from our dynamic world.
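One common way to build that tolerance during training (an assumption for illustration, not a description of Parascript’s training set) is to randomly vary lighting, rotation, blur and cropping on each sample, so the network sees the messy conditions it will meet in production. A minimal torchvision sketch:

```python
from torchvision import transforms

# Randomly perturb each training image so the learned descriptions
# tolerate uneven lighting, skew, blur and partial views.
augment = transforms.Compose([
    transforms.ColorJitter(brightness=0.4, contrast=0.4),      # uneven lighting
    transforms.RandomRotation(degrees=10),                      # skewed captures
    transforms.GaussianBlur(kernel_size=5),                     # out-of-focus shots
    transforms.RandomResizedCrop(size=224, scale=(0.7, 1.0)),   # partial views
    transforms.ToTensor(),
])

# Usage: apply `augment` to each image as it is loaded, e.g. with
# torchvision.datasets.ImageFolder("checks/", transform=augment),
# where "checks/" is a placeholder path.
```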