September 22, 1999
A Faster Way of Shuffling the Data Cards
By STEVEN E. BRIER
Al-go-rithm: 1. a repetitive procedure for solving a problem or accomplishing
some end, especially by a computer; 2. the Veg-O-Matic of the data-mining world; it
slices, it dices and it's faster than a speeding processor.
At its simplest, an algorithm is a set of instructions for solving a problem. But
at its best, an algorithm is a method of looking at information in ways that are not
obvious, one that answers old questions better and faster, or answers questions you
didn't know you had.
"An algorithm is the teeny little engine that makes sense of masses of
data," said Christopher Meyer, director of the Ernst & Young Center for
Business Innovation in Cambridge, Mass.
Without algorithms, Meyer said, much of E-commerce would not be possible. Faster
performance and unexpected results are what drive E-businesses, he said.
Online businesses, and their marketing departments, collect mammoth amounts of data
about their customers. What they looked at, what they bought, where they came from, where
they went, where they live and dozens of other little pieces of information are tracked.
Companies want to analyze that information and use it to tune their pricing, product
lineup and welcoming screens.
"If you have to have a market analyst looking at every person and seeing
what to market to them, it would not be economically feasible," Meyer said. "But
if you can mine the data, you can make a business."
Mining those data requires more than fast processors, said William Pulleyblank,
director of Deep Computing at I.B.M.'s Thomas J. Watson Research Center. A well-designed
algorithm can deliver performance gains that would take years of hardware advances to
match. Pulleyblank uses two methods of sorting 1,000 checks into numerical order to
illustrate the improvement that a clever algorithm can provide.
Version one requires rifling through the checks looking for the check with the lowest
number and pulling it to the front, then going back and looking for the check with the
next-lowest number and putting it behind the first, and so on. This would work, but you
could end up comparing the numbers on more than half a million pairs of checks: effective,
but time consuming.
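In a modern programming language, that first method might look like the following minimal sketch (the function name and sample data are illustrative, not from the article):

```python
def sort_checks_version_one(checks):
    """Repeatedly scan the whole pile for the lowest-numbered check
    and move it to the front of the sorted portion.

    For n checks this makes roughly n * (n - 1) / 2 comparisons:
    about half a million when n is 1,000.
    """
    checks = list(checks)  # work on a copy, leave the original pile alone
    for i in range(len(checks)):
        lowest = i
        # rifle through the remaining checks for the lowest number
        for j in range(i + 1, len(checks)):
            if checks[j] < checks[lowest]:
                lowest = j
        # pull that check to the front of the unsorted portion
        checks[i], checks[lowest] = checks[lowest], checks[i]
    return checks
```

This is the pattern known as selection sort; its cost grows with the square of the number of checks.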
Now try something different. Instead of sorting one big stack, divide the pile of
checks in half. Then divide one of those halves in half again, sort each part and merge
them. Do the same with the other half, then merge the two sorted halves into one
ordered pile.
Confusing? Maybe, but you just sorted the checks using only about 20,000 comparisons
instead of half a million. And the savings in time grow dramatically with the size of the
pile.
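The divide-and-merge approach above can be sketched the same way (again, the names are illustrative):

```python
def sort_checks_version_two(checks):
    """Divide the pile in half, sort each half the same way, then
    merge the two sorted halves into one ordered pile.

    For n checks this makes on the order of n * log2(n) comparisons,
    a tiny fraction of the half million needed by version one.
    """
    if len(checks) <= 1:
        return list(checks)  # a pile of one check is already sorted
    mid = len(checks) // 2
    left = sort_checks_version_two(checks[:mid])
    right = sort_checks_version_two(checks[mid:])
    # merge: repeatedly take whichever front check is lower
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # one of these two is already empty
    merged.extend(right[j:])
    return merged
```

This is the classic merge sort. The trick is that merging two already-sorted piles is cheap: you only ever compare the two front checks.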
That's an algorithm in action.