Making sense of machine learning

As Matt Asay observed last week, AI appears to be reaching “peak ludicrous mode,” with almost every software vendor laying claim to today’s most hyped technology. In fact, Gartner’s latest Hype Cycle for Emerging Technologies places machine learning at the Peak of Inflated Expectations.

Hang on — see what I did there? I used “AI” and “machine learning” interchangeably, which should get me busted by the artificial thought police. The first thing you need to know about AI (and machine learning) is that it’s full of confusing, overlapping terminology, not to mention algorithms with functions that are opaque to all but a select few.

This combination of hype and nearly impenetrable nomenclature can get pretty irritating. So let’s start with a very basic taxonomy:

Artificial intelligence is the umbrella phrase under which all other terminology in this area falls. As an area of computer research, AI dates back to the 1940s. AI researchers were flush with optimism until the 1970s, when they encountered unforeseen challenges and funding dried up, a period known as “AI winter.” Despite such triumphs as IBM’s 1990s chess-playing system Deep Blue, the term AI did not really recover from its long winter until a few years ago. New nomenclature needed to be invented.

Machine intelligence is synonymous with AI. It never gained the currency AI did, but you never know when it might suddenly become popular.

Machine learning is the phrase you hear most often today, although it was first coined in the 1950s. It refers to a subset of AI in which programs feed on data and, by recognizing patterns in that data and learning from them, execute functions or make predictions without being explicitly programmed to do so. Most of the recent advances we hear about fall under the rubric of machine learning. Why is it so hot today? You often hear that Moore’s Law and cheap, abundant memory have given new life to old machine learning algorithms, which has led to a wave of practical applications, particularly relating to pattern recognition. That’s true, but even more important has been the hyperabundance of data available for machine learning systems to learn from.
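
To make that concrete, here is a toy sketch of “learning from data” (my own illustration, using Python and scikit-learn, with made-up messages): the program is never given a rule like “free means spam,” only labeled examples, and it infers the pattern on its own.

```python
# A minimal, illustrative spam filter: no hand-written rules, just labeled examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny, made-up training set: messages and whether they are spam (1) or not (0).
messages = [
    "win a free prize now",
    "claim your free reward",
    "meeting moved to tuesday",
    "lunch tomorrow with the team",
]
labels = [1, 1, 0, 0]

# Turn text into word-count features, then let the model find the patterns.
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(messages)

model = MultinomialNB()
model.fit(features, labels)

# The program was never told which words signal spam; it learned that from the data.
print(model.predict(vectorizer.transform(["free prize inside"])))      # likely [1]
print(model.predict(vectorizer.transform(["see you at the meeting"]))) # likely [0]
```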

Cognitive computing has been the phrase preferred by IBM and bestowed on its Jeopardy winner Watson. As best as I can determine, cognitive computing is more or less synonymous with AI, although IBM’s definition emphasizes human interaction with that intelligence. Some people object to the phrase because it implies human-like reasoning, which computer systems in their current form are unlikely to attain.

Neural networks are a form of machine learning dating back to early AI research. They very loosely emulate the way neurons in the brain work — the objective generally being pattern recognition. As neural networks are trained on data, the connections between neurons are strengthened, and the resulting outputs form patterns that drive machine decision-making. Disparaged as slow and inexact during the AI winter, neural net technology is at the root of today’s excitement over AI and machine learning.
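
As an illustration only (a toy example in plain Python and NumPy, not anything described in the sources above), a single artificial “neuron” shows the basic idea: with each pass over the data, its connection weights are nudged toward values that reproduce the desired pattern, here the logical OR of two inputs.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([0, 1, 1, 1], dtype=float)                      # OR targets

rng = np.random.default_rng(0)
weights = rng.normal(size=2)   # the "connections"
bias = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(5000):
    output = sigmoid(X @ weights + bias)   # forward pass
    error = output - y                     # how wrong the neuron is
    # Nudge each connection in the direction that reduces the error.
    grad = error * output * (1 - output)
    weights -= 0.5 * (X.T @ grad)
    bias -= 0.5 * np.sum(grad)

print(np.round(sigmoid(X @ weights + bias), 2))  # approaches [0, 1, 1, 1]
```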

Deep learning is the hottest area of machine learning. In most cases, deep learning refers to neural networks with many layers working together. Deep learning has benefited from abundant GPU processing services in the cloud, which greatly enhance performance (and of course eliminate the chore of setting up GPU clusters on premises). All the major clouds — AWS, Microsoft Azure, and Google Cloud Platform — now offer deep learning frameworks, although Google’s TensorFlow is considered the most advanced. If you want a full explanation from someone who actually understands this stuff, read Martin Heller’s “What deep learning really means.” Also check out his comparative review of the six most popular machine/deep learning frameworks.
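
Since TensorFlow comes up, here is a minimal sketch of what “many layers” looks like in code, using TensorFlow’s Keras API on fabricated data (the layer sizes, data, and labels are arbitrary and purely illustrative):

```python
import numpy as np
import tensorflow as tf

# Fake data: 1,000 samples, 20 numeric features, a binary label.
X = np.random.rand(1000, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("int32")

# "Deep" here simply means several layers stacked between input and output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))  # [loss, accuracy] on the training data
```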

Despite the current enthusiasm for deep learning, most machine learning algorithms have nothing to do with neural nets. As I discovered a couple of years ago when I interviewed Dr. Hui Wang, senior director of risk sciences for PayPal, advanced systems often use deep learning in conjunction with linear algorithms to solve such major challenges as fraud detection. The almost unlimited ability to pile on not only deep learning layers, but also a wide variety of other machine learning algorithms — and apply them to a single problem — is one reason you’ve heard those cautionary tales about machine intelligence one day approaching human intelligence.
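
By way of illustration only (a toy sketch in scikit-learn on made-up data, not PayPal’s actual system), combining a small neural network with a linear model on a single classification problem might look like this:

```python
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 10))                       # fake "transaction" features
y = (X[:, 0] * X[:, 1] + X[:, 2] > 0.5).astype(int)   # fake "fraud" label

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two very different learners, a small neural net and a linear model,
# blended by a final logistic-regression layer.
ensemble = StackingClassifier(
    estimators=[
        ("neural_net", MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000)),
        ("linear", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(),
)
ensemble.fit(X_train, y_train)
print("held-out accuracy:", round(ensemble.score(X_test, y_test), 3))
```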

Just last December, another milestone was reached: An AI system known as DeepStack beat professional poker players for the first time at heads-up no-limit Texas hold’em poker, which unlike chess is a classic game of “imperfect” information (i.e., players have information that others do not have). That’s less than a year after Google’s AlphaGo system beat world champion Lee Sedol in the ancient Chinese game of Go.

So AI’s fever pitch is understandable, but Matt Asay’s ridicule still hits home. As with many hot trends, it’s all too easy to grandfather in prosaic existing technology (I mean, predictive text is technically AI). The other problem is that very few people understand much in this area beyond the superficial, including me. How many people who throw around phrases like k-means clustering or two-class logistic regression have a clear idea of what they’re talking about? For most of us, this is black box territory (though a good primer can help, such as this one from the University of Washington computer science department). Ultimately, it takes experts like Martin Heller, who, along with polyglot programming skills, has a Ph.D. in physics, to evaluate AI solutions.

This week promises to be a big one for AI and machine learning: Salesforce will be showing off its Einstein AI capability, and at the Google Next conference the agenda features no fewer than 20 sessions on machine learning. I was hoping to attend the “TensorFlow and Deep Learning without a Ph.D.” session, but it’s already full.

Source: InfoWorld Big Data