First neural network to identify handwritten characters
Who
Yann LeCun, Bell Labs Convolutional Neural Network
What
First
Where
United States (Holmdel)
When
1989

The first neural network capable of identifying handwritten characters was a convolutional neural network designed by Yann LeCun (FRA) and his colleagues at AT&T Bell Labs in Holmdel, New Jersey, USA, in 1989.

Digitizing the content of handwritten text was one of the earliest proposed applications of neural network technology. The creator of the first trainable neural network, Frank Rosenblatt, mentioned handwriting recognition in interviews as early as 1958. The US Postal Service was one of the first organizations to show an interest in Rosenblatt's work on "Perceptrons", seeing it as a potential route to a machine that could automatically read addresses on envelopes.

The challenge here is that handwritten letters, while easily recognizable to literate humans, vary enormously in shape and form, making it impossible for a conventional pattern-matching program to identify them reliably. Even when written by someone with exceptional penmanship, the shape of a letter will vary depending on the context of the letters that come before and after it.

Despite the hopes of its creator, Rosenblatt's Mark I Perceptron was never able to reliably identify printed letters, let alone handwritten ones. The challenge required networks with more layers and more neurons on each layer. Rosenblatt made some progress towards this goal, but was stymied by the relatively low power of 1960s computers and the limitations of the learning algorithms he was using.

Although neural networks fell from favour among many AI researchers in the 1970s, there was a small but persistent group who continued to build on Rosenblatt's ideas. Techniques such as back-propagation (which allows a network to go back through its workings and adjust neuron weights after comparing its output to the correct answer in the training data) and convolutional networks (which reuse the same small set of weights across the whole input, allowing complex data such as images to be processed with far fewer connections) made it possible to create what are known as "deep learning" networks, with many interconnected layers of neurons.
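The sketch below illustrates these two ideas in plain Python with NumPy. It is a toy demonstration only: the task, layer sizes and learning rate are illustrative assumptions, not the network LeCun built in 1989.

```python
# Toy illustration (not LeCun's 1989 system) of back-propagation and of why
# convolutions need fewer weights.
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: XOR, a pattern a single-layer perceptron cannot learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros(8)  # input -> hidden layer
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros(1)  # hidden -> output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(10000):
    # Forward pass: compute the network's current answers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass (back-propagation): compare the output with the correct
    # answers, then push the error back through the layers to obtain a
    # gradient for every weight.
    delta_out = (out - y) * out * (1 - out)
    delta_h = (delta_out @ W2.T) * h * (1 - h)

    # Adjust each weight a little in the direction that reduces the error.
    W2 -= 0.5 * (h.T @ delta_out)
    b2 -= 0.5 * delta_out.sum(axis=0)
    W1 -= 0.5 * (X.T @ delta_h)
    b1 -= 0.5 * delta_h.sum(axis=0)

print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))  # should approach 0, 1, 1, 0

# Why convolutions need far fewer weights: a fully connected layer mapping a
# 16x16 image to 256 units needs 65,536 weights, whereas a 5x5 convolutional
# filter slid across the same image reuses just 25 shared weights.
print(16 * 16 * 256, 5 * 5)
```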

Yann LeCun joined AT&T Bell Labs in 1988, having previously studied with neural network pioneer Geoffrey Hinton (later a leading AI researcher at Google). He immediately set to work trying to apply his research on convolutional networks to the age-old problem of optical character recognition.

In 1989, the US Postal Service provided LeCun's team with a set of 9,298 scanned handwritten zip codes from mail that passed through a sorting office in Buffalo, New York. The researchers used 7,291 of these scans to train the neural network, and the other 2,007 to test its effectiveness. The result was a success rate of 95% on the unseen test scans, which paved the way for the system's widespread adoption in the early 1990s.
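For a sense of how such an experiment is structured, the sketch below repeats the protocol in miniature with scikit-learn: it holds back roughly the same proportion of images for testing, trains a small network with back-propagation on the rest, and reports the share of unseen digits identified correctly. The bundled 8x8 digit images and the generic multi-layer classifier are modern stand-ins, not the USPS zip-code scans or the 1989 convolutional architecture.

```python
# Miniature train/test evaluation in the spirit of the 1989 experiment,
# using stand-in data and a generic network (assumptions, not the original).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()  # 1,797 small scanned digits bundled with scikit-learn

# Hold back roughly the same proportion of images as the 7,291 / 2,007 split.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.22, random_state=0
)

# A small multi-layer network trained with back-propagation.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

# "Success rate" = share of held-out digits identified correctly.
print(f"test accuracy: {clf.score(X_test, y_test):.1%}")
```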