Big Challenge for A Postal Behemoth
COMPUTERS FOR THE REST OF US
COMPUTERS get hung up on the simplest things. Consider the United States Postal Service. It handles 40 percent of the world's mail - half a billion letters and packages a day. Its $5 billion worth of automated equipment sorts 12 letters a second.
Yet, this technological behemoth stumbles badly when confronted by an ordinary handwritten envelope. It can't read the address. Despite many advances, handwriting recognition remains one of technology's biggest stumbling blocks.
The Postal Service is trying to get around the problem by funding research at several institutions, in Europe as well as the US, including the State University of New York at Buffalo. On a recent tour, researchers let me test one of their prototype solutions by writing my name and address on an envelope. Since my handwriting isn't particularly neat (a kind of early Phoenician scrawl), I wondered if the system would blow a fuse. Here's what I wrote:
Laurent Belsie
40 E. Walnut
Washington, PA 15301
The system scanned the envelope, ignored the printed return address, and correctly identified my block of writing as the address. It divided the block into lines and words, found the zip code, and correctly deciphered its five digits. It even picked out the street number "40" from the second line of my address.
Pretty good for a machine - but not for the Postal Service, which wants a system that can read enough of the address to figure out its nine-digit zip code. By 1995, the Postal Service expects to sort 90 percent of machine-printed letters automatically (up from just over 56 percent today) and 50 percent of handwritten letters (up from a negligible 4 percent today). The new system will also have to keep up with the current processing rate of 12 letters a second.
That's 25 times faster than the best prototype from the Buffalo research center, which by that measure handles roughly one letter every two seconds. Researchers there are using a technique known as contextual analysis.
People use contextual analysis all the time without knowing it. For example, my handwritten "Washington" looks suspiciously like "Washinglon" because I didn't cross my "t" very well. Most Americans would read it as "Washington" because they know it's a common name. The computer has to use other clues.
Typically, it might read the zip code first and then look into its huge 3-billion-character database to find that the zip corresponds to Washington, Pa. That would allow it to make a quick guess about "Washinglon." By figuring out "40" on the second line, it could narrow its choice of streets to those in Washington that have the number 40. If it figured out that "E." was east, it could further narrow its choices. And so on.
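For readers who like to see the idea in code, here is a minimal sketch of that narrowing process in Python. It is an illustration only, not the Buffalo researchers' system: the tiny ZIP_DIRECTORY table, its street names other than my own, the house-number ranges, and the function names are all made up for the example, and the fuzzy match relies on the standard library's difflib rather than on any real handwriting recognizer.

```python
from difflib import get_close_matches

# Hypothetical toy directory standing in for the real address database.
# Street names (other than E. Walnut) and number ranges are invented.
ZIP_DIRECTORY = {
    "15301": {
        "city": "Washington",
        "streets": {
            "E. Walnut": (1, 400),
            "W. Walnut": (1, 400),
            "E. Maiden": (100, 900),
        },
    },
}


def resolve_city(zip_code, raw_city):
    """Use the zip code to correct a shaky reading of the city name."""
    entry = ZIP_DIRECTORY.get(zip_code)
    if entry is None:
        return None
    # Accept the directory's city only if the raw reading is close to it.
    match = get_close_matches(raw_city, [entry["city"]], n=1, cutoff=0.6)
    return match[0] if match else None


def candidate_streets(zip_code, house_number, direction=None):
    """Narrow the streets in a zip code by house number, then direction."""
    entry = ZIP_DIRECTORY.get(zip_code)
    if entry is None:
        return []
    streets = [
        name
        for name, (low, high) in entry["streets"].items()
        if low <= house_number <= high
    ]
    if direction is not None:
        streets = [name for name in streets if name.startswith(direction)]
    return sorted(streets)


if __name__ == "__main__":
    print(resolve_city("15301", "Washinglon"))
    print(candidate_streets("15301", 40))
    print(candidate_streets("15301", 40, direction="E."))
```

Each clue shrinks the list of possibilities: the zip code fixes the city, the house number rules out streets that have no such number, and the direction leaves a single candidate in this toy table.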
Making all this work coherently will take time, but the potential payoffs are big.
More mail would be automatically sorted (which costs $3 per 1,000 letters) rather than mechanically sorted by an operator ($15 per 1,000) or by hand ($35 per 1,000). Once sorted, all letters would be imprinted with a nine-digit bar code.
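The gap between those three rates is easier to appreciate with a quick calculation. The short Python sketch below uses only the per-1,000-letter costs quoted above; the function names and the one-million-letter batch are assumptions chosen for illustration, not Postal Service figures.

```python
# Sorting cost in dollars per 1,000 letters, as quoted in the article.
COST_PER_1000 = {"automated": 3, "mechanical": 15, "manual": 35}


def sorting_cost(letters, method):
    """Dollar cost of sorting a batch of letters by the given method."""
    return letters * COST_PER_1000[method] / 1000


def saving_from_automation(letters, from_method):
    """Dollars saved by moving a batch from another method to automation."""
    return sorting_cost(letters, from_method) - sorting_cost(letters, "automated")


if __name__ == "__main__":
    batch = 1_000_000  # hypothetical batch size
    for method in COST_PER_1000:
        print(method, sorting_cost(batch, method))
    print("saved vs. manual:", saving_from_automation(batch, "manual"))
```

On these rates, every million letters moved from hand sorting to automated sorting saves $32,000, and every million moved from operator-assisted machines saves $12,000.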
The bar codes, in turn, would let machines sort mail according to the carrier's route. Thus, mail carriers could spend two hours a day sorting and six hours delivering - rather than today's average of four hours on each task. Estimated savings in processing mail: $230 million to $260 million a year.