Data is eating the software that is eating the world

No one doubts that software engineering shapes every last facet of our 21st-century existence. Given his vested interest in companies whose fortunes were built on software engineering, it was no surprise when Marc Andreessen declared that “software is eating the world.”

But what does that actually mean, and, just as important, does it still apply, if it ever did? These questions came to me recently when I reread Andreessen’s op-ed piece and noticed that he equated “software” with “programming.” Just as significant, he equated “eating” with industry takeovers by “Silicon Valley-style entrepreneurial technology companies” and then rattled through the usual honor roll of Amazon, Netflix, Apple, Google, and the like. What they, and others cited by Andreessen, have in common is that they built global-scale business models on the backs of programmers who bang out the code that drives web, mobile, social, cloud, and other 24/7 online channels.

Since the piece was published in the Wall Street Journal in 2011, we’ve had more than a half-decade to see whether Andreessen’s epic statement of Silicon Valley triumphalism proved prescient or merely self-serving and misguided. I’d say it lands closer to the prescient end of the spectrum, because most (though not all) of the success stories he cited have sustained their momentum in growth, profitability, acquisitions, and innovation. People from programming backgrounds – such as Mark Zuckerberg – are indeed the multibillionaire rockstars of this new business era. In this way, Andreessen has so far been spared the fate of Tom Peters, who saw many of the exemplars he cited in his 1982 bestseller “In Search of Excellence” go on to be deconstructed by business rivals or blindsided by trends they didn’t see coming.

Rise of the learning machines

However, it has become clear to everyone, especially the old-school disruptors cited by Andreessen, that “software,” as it’s normally understood, is not the secret to future success. Going forward, the agent of disruption will be the data-driven ML (machine learning) algorithms that power AI. In this new era, more of the logic that powers intelligent applications won’t be explicitly programmed. The days when application logic was predominantly hand-coded, deterministic, and rules-based are fast drawing to a close. Instead, the probabilistic logic at the heart of chatbots, recommendation engines, self-driving vehicles, and other AI-powered applications is being harvested directly from source data.

The “next best action” logic permeating our lives is evolving inside applications through continuous inference from data originating in the Internet of Things and other production applications. Consequently, there will be a diminishing need for programmers, in the traditional sense of people who build hard-and-fast application logic. In their place, the demand for a new breed of developer – the data scientist – will continue to grow. This term refers to the wide range of specialists who craft, train, and manage the regression models, neural networks, support vector machines, unsupervised learning models, and other ML algorithms upon which AI-centric apps depend.

To compound the marginalization of programmers in this new era, we’re likely to see more ML-driven code generation along the lines that I discussed in this recent post. Amazon, Google, Facebook, Microsoft, and other software-based powerhouses have made huge investments in data science, hoping to buoy their fortunes in the post-programming era. They all have amassed growing sets of training data from their ongoing operations. For these reasons, the “Silicon Valley-style” monoliths are confident that they have the resources needed to build, tune, and optimize increasingly innovative AI/ML-based algorithms for every conceivable application.

However, any strategic advantages that these giants gain from these AI/ML assets may be short-lived. Just as data-driven approaches are eroding the foundations of traditional programming, they’re also beginning to nibble at the edges of what highly skilled data scientists do for a living. These trends are even starting to chip away at the economies of scale available to large software companies with deep pockets.

AI and the Goliaths

We’re moving into an era in which anyone can tap into cloud-based resources to cheaply automate the development, deployment, and optimization of innovative AI/ML apps. In a “snake eating its own tail” phenomenon, ML-driven approaches will increasingly automate the creation and optimization of ML models, per my discussion here. And, from what we’re seeing in research initiatives such as Stanford’s Snorkel project, ML will also play a growing role in automating the acquisition and labeling of ML training data. What that means is that, in addition to abundant open-source algorithms, models, code, and data, the next-generation developer will also be able to generate ersatz but good-enough labeled training data on the fly to tune new apps for their intended purposes.
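
To make that concrete, here is a minimal sketch of the Snorkel-style idea in plain Python (deliberately not Snorkel’s actual API): each hypothetical “labeling function” encodes a noisy heuristic, and a majority vote over their outputs yields ersatz but good-enough training labels.

```python
# Minimal sketch of weak supervision in the Snorkel style (plain Python,
# not Snorkel's real API). All heuristics and data are hypothetical.
POSITIVE, NEGATIVE, ABSTAIN = 1, 0, -1

def lf_contains_refund(text):  # hypothetical heuristic
    return NEGATIVE if "refund" in text.lower() else ABSTAIN

def lf_contains_love(text):    # hypothetical heuristic
    return POSITIVE if "love" in text.lower() else ABSTAIN

def lf_exclamations(text):     # hypothetical heuristic
    return POSITIVE if text.count("!") >= 2 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_refund, lf_contains_love, lf_exclamations]

def weak_label(text):
    """Majority vote over the non-abstaining labeling functions."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS if lf(text) != ABSTAIN]
    if not votes:
        return ABSTAIN  # no heuristic fired; leave the example unlabeled
    return max(set(votes), key=votes.count)

# Generate good-enough training labels on the fly for unlabeled text.
unlabeled = ["I love this phone!!", "Requesting a refund, it broke."]
training_set = [(t, weak_label(t)) for t in unlabeled]
```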

As the availability of low-cost generative training data grows, the established software companies’ massive data lakes, in which their developers maintain petabytes of authentic from-the-source training data, may become more of an overhead burden than a strategic asset. Likewise, the need to manage the complex data-preparation logic for use of this source data may become a bottleneck that impedes the ability of developers to rapidly build, train, and deploy new AI apps.

When any developer can routinely make AI apps just as accurate as Google’s or Facebook’s – but with far less expertise, budget, and training data than the big shots – a new era will have dawned. When we reach that tipping point, the next generation of data-science-powered disruptors will start to eat away at yesteryear’s software startups.

Source: InfoWorld Big Data

'Transfer learning' jump-starts new AI projects

No statistical algorithm can be the master of all machine learning application domains. That’s because the domain knowledge encoded in that algorithm is specific to the analytical challenge for which it was constructed. If you try to apply that same algorithm to a data source that differs in some way, large or small, from the original domain’s training data, its predictive power may fall flat.
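
A toy simulation illustrates the point; everything here is synthetic and assumed for the example. A classifier that is near-perfect on its source domain falls flat when the target domain’s decision boundary drifts:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Source domain: the class boundary sits at x0 = 0 (synthetic data).
X_src = rng.normal(size=(2000, 5))
y_src = (X_src[:, 0] > 0).astype(int)

# Target domain: same features, but the boundary has drifted to x0 = 1.
X_tgt = rng.normal(size=(2000, 5))
y_tgt = (X_tgt[:, 0] > 1.0).astype(int)

model = LogisticRegression().fit(X_src, y_src)
print("source accuracy:", model.score(X_src, y_src))  # roughly 0.99
print("target accuracy:", model.score(X_tgt, y_tgt))  # roughly 0.66
```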

That said, a new application domain may have so much in common with prior applications that data scientists can’t be blamed for trying to reuse hard-won knowledge from prior models. This is a well-established but fast-evolving frontier of data science known as “transfer learning” (it also goes by other names, such as knowledge transfer, inductive transfer, and meta-learning).

Transfer learning refers to reuse of some or all of the training data, feature representations, neural-node layering, weights, training method, loss function, learning rate, and other properties of a prior model.
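
In practice, the most common pattern looks something like this minimal PyTorch sketch, which reuses ImageNet-trained ResNet-18 weights as the prior model; the 10-class target task is an arbitrary assumption for illustration:

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Prior model: ResNet-18 weights learned on ImageNet (the source domain).
base = models.resnet18(pretrained=True)

# Freeze the transferred layers so their feature representations are reused.
for param in base.parameters():
    param.requires_grad = False

# Swap in a new head for the target task (10 classes assumed here).
base.fc = nn.Linear(base.fc.in_features, 10)

# Only the new head is trained on the target domain's (typically smaller) data.
optimizer = optim.Adam(base.fc.parameters(), lr=1e-3)
```

When the target domain sits closer to the source, a common variation is to unfreeze some of the later layers as well and fine-tune them at a low learning rate.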

Transfer learning is a supplement to, not a replacement for, other learning techniques that form the backbone of most data science practices. Typically, a data scientist relies on transfer learning to tap into statistical knowledge that was gained on prior projects through supervised, semi-supervised, unsupervised, or reinforcement learning.

For data scientists, there are several practical uses of transfer learning.

Modeling productivity acceleration

If data scientists can reuse prior work without the need to revise it extensively, transfer-learning techniques can greatly boost their productivity and accelerate time to insight on new modeling projects. In fact, many projects in machine learning and deep learning address solution domains for which there is ample prior work that can be reused to kick-start development and training of fresh neural networks.

Transfer learning is also useful when there are close parallels or affinities between the source and target domains. For example, a natural-language processing algorithm that was built to classify English-language technical documents in one scientific discipline should, in theory, be readily adaptable to classifying Spanish-language documents in a related field. Likewise, deep learning knowledge that was gained from training a robot to navigate a maze may be partially applicable to helping it learn to make its way through a dynamic obstacle course.

Training-data stopgap

If a new application domain lacks sufficient high-quality labeled training data, transfer learning can help data scientists craft machine learning models that leverage relevant training data from prior modeling projects. As noted in this excellent research paper, transfer learning is an essential capability for machine learning projects in which prior training data can easily become outdated. This problem of training-data obsolescence often arises in dynamic problem domains, such as gauging social sentiment or tracking patterns in sensor data.

An example, cited in the paper, is the difficulty of training the machine learning models that drive Wi-Fi indoor localization, considering that the key data—signal strength—behind these models may vary widely across the time periods and devices used to collect it. Transfer learning is also critical to the success of IoT deep learning applications, which produce machine-generated data of such staggering volume, velocity, and variety that no feasible number of human experts could ever label enough of it to kick-start training of new models.

Risk mitigation

If the underlying conditions of the phenomenon being modeled have radically changed, rendering prior training data sets or feature models inapplicable, transfer learning can help data scientists leverage useful subsets of training data and feature models from related domains. As discussed in this recent Harvard Business Review article, the data scientists who got the 2016 U.S. presidential election dead wrong could have benefited from statistical knowledge gained in postmortem studies of failed predictions from the U.K. Brexit fiasco.

Transfer learning can help data scientists mitigate the risks of machine-learning-driven predictions in any problem domain susceptible to highly improbable events. For example, cross-fertilization of statistical knowledge from meteorological models may be useful in predicting “perfect storms” of congestion in traffic management. Likewise, historical data on “black swans” in economics, such as stock-market crashes and severe depressions, may be useful in predicting catastrophic developments in politics and epidemiology.

Transfer learning isn’t only a productivity tool to assist data scientists with their next modeling challenge. It also stands at the forefront of the data science community’s efforts to invent “master learning algorithms” that automatically gain and apply fresh contextual knowledge through deep neural networks and other forms of AI.

Clearly, humanity is nowhere close to fashioning such a “superintelligence” — and some people, fearing a robot apocalypse or similar dystopia, hope we never do. But it’s not far-fetched to predict that, as data scientists encode more of the world’s practical knowledge in statistical models, these AI nuggets will be composed into machine intelligence of staggering sophistication.

Transfer learning will become a membrane through which this statistical knowledge infuses everything in our world.

Source: InfoWorld Big Data

Deep learning is already altering your reality

We now experience life through an algorithmic lens. Whether we realize it or not, machine learning algorithms shape how we behave, engage, interact, and transact with each other and with the world around us.

Deep learning is the next advance in machine learning. While machine learning has traditionally been applied to structured and textual data, deep learning goes beyond that to find meaningful patterns in streaming media and other complex content types, including video, voice, music, images, and sensor data.

Deep learning enables your smartphone’s voice-activated virtual assistant to understand spoken intentions. It drives the computer vision, face recognition, voice recognition, and natural language processing features that we now take for granted on many mobile, cloud, and other online apps. And it enables computers—such as the growing legions of robots, drones, and self-driving vehicles—to recognize and respond intelligently and contextually to the environmental patterns that any sentient creature instinctively adapts to from the moment it’s born.

But those analytic applications only scratch the surface of deep learning’s world-altering potential. The technology is far more than analytics that see deeply into environmental patterns. Increasingly, it’s also being used to mint, make, and design fresh patterns from scratch. As I discussed in this recent post, deep learning is driving the application logic being used to create new video, audio, image, text, and other objects. Check out this recent Medium article for a nice visual narrative of how deep learning is radically refabricating every aspect of human experience.

These are what I’ve referred to as the “constructive” applications of the technology, which involve using it to craft new patterns in new artifacts rather than simply introspecting historical data for pre-existing patterns. It’s also being used to revise, restore, and annotate found content and even physical objects to make them more useful downstream.

You can’t help but be amazed by all this until you stop to think how it’s fundamentally altering the notion of “authenticity.” The purpose of deep learning’s analytic side is to identify the authentic patterns in real data. But if its constructive applications can fabricate experiences, cultural artifacts, the historical record, and even our bodies with astonishing verisimilitude, what is the practical difference between reality and illusion? At what point are we at risk of losing our awareness of the pre-algorithmic sources that should serve as the bedrock of all experience?

This is not a metaphysical meditation. Deep learning has already advanced to the point where algorithmically generated video, audio, images, and text can pass for the genuine article.

Clearly, the power to construct is also the power to reconstruct, and that’s tantamount to having the power to fabricate and misdirect. Though we needn’t sensationalize this, deep learning’s reconstructive potential can prove problematic in cognitive applications, given the potential for algorithmic biases to cloud decision support. If those algorithmic reconstructions skew environmental data too far from bedrock reality, the risks may be considerable for deep learning applications such as self-driving cars and prosthetic limbs upon which people’s very lives depend.

Though there’s no stopping the advance of deep learning into every aspect of our lives, we can in fact bring greater transparency into how those algorithms achieve their practical magic. As I discussed in this post, we should be instrumenting deep learning applications to facilitate identification of the specific algorithmic path (such as the end-to-end graph of source information, transformations, statistical models, metadata, and so on) that was used to construct a specific artifact or take a particular action in a particular circumstance.
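
As a rough sketch of what such instrumentation might capture, consider a hypothetical provenance record logged for each algorithmic decision; every field name and value below is invented for illustration:

```python
import hashlib
import json
import time

def lineage_record(model_name, model_version, inputs, transforms, output):
    """Hypothetical provenance entry for one algorithmic decision: enough
    detail to reconstruct the end-to-end path from source data, through
    transformations and model, to the generated artifact or action."""
    return {
        "timestamp": time.time(),
        "model": {"name": model_name, "version": model_version},
        "input_fingerprint": hashlib.sha256(
            json.dumps(inputs, sort_keys=True).encode()).hexdigest(),
        "transformations": transforms,        # ordered preprocessing steps
        "output_summary": str(output)[:200],  # truncated artifact summary
    }

record = lineage_record(
    model_name="lane_detector",  # hypothetical model in a driving stack
    model_version="2.3.1",
    inputs={"camera": "front", "frame_id": 10492},
    transforms=["undistort", "normalize", "crop_roi"],
    output="lane_center_offset=-0.12m",
)
print(json.dumps(record, indent=2))
```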

Just as important, every seemingly realistic but algorithmically generated artifact that we encounter should have that fact flagged in some salient way so that we can take that into account as we’re interacting with it. Just as some people wish to know if they’re consuming genetically modified organisms, many might take interest in whether they’re engaging with algorithmically modified objects.

If we’re living in an algorithmic bubble, we should at the very least know how it’s bending and coloring whatever rays of light we’re able to glimpse through it.

Source: InfoWorld Big Data

Know when your big data is telling big lies

Data scientists use statistical analysis tools to find non-obvious patterns in deep data. But they know the universe is full of spurious correlations. Big data simply intensifies the problem.

That’s because, as the range of sources and the diversity of predictors continue to grow, the number of relationships that can potentially be modeled approaches infinity. As David G. Young pointed out, “predictive variables sometimes aren’t…. We’ve all seen variable interactions that change the significance, curvature, and even the sign of an important predictor.”

Thus, if you’re looking for a particular correlation in your data, you can probably find it if you’re clever enough to combine only the right data, specify only the right variables, and analyze it using only the right algorithm. Once you’ve hit on the right combination of modeling decisions, the patterns you seek may pop out like a genie from Aladdin’s lamp.
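
A quick simulation shows how easily that genie appears: scan enough candidate predictors and pure noise will hand you an impressive-looking correlation.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_predictors = 100, 10_000

X = rng.normal(size=(n_samples, n_predictors))  # predictors: pure noise
y = rng.normal(size=n_samples)                  # outcome: also pure noise

# Pearson correlation of every predictor with the outcome, then cherry-pick.
r = (X - X.mean(0)).T @ (y - y.mean()) / (n_samples * X.std(0) * y.std())
best = np.abs(r).argmax()
print(f"'best' predictor {best}: r = {r[best]:.2f}")  # typically near 0.4
```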

Yet the fact that you’ve supposedly discovered this correlation doesn’t mean it actually exists in the underlying real-world domain you’re investigating. It may simply be a figment of your specific approach to modeling the data you have at hand. You may have no fraudulent intent, and you may otherwise adhere to standard data-science methodologies, but you may be tempted to look no further once it appears you’ve already struck the pay dirt you were seeking.

How to monetize the fuzzy narratives of social listening

Marketing professionals, such as yours truly, use social-listening analytics tools in the hope that they reveal whether customers are likely to stay loyal, buy more stuff, and say nice things about our companies and products. What these tools actually reveal is how people might or might not be leaning in the aggregate, under the questionable assumption that social media users are a representative cross-section of the target population you’re trying to engage.

Even if your entire target market were on social media, you’d be ill-advised to accept social intelligence as an indicator of how individuals truly feel about your brand. As I’ve stated, few customers declare their feelings in the form of tweets or Facebook updates that represent their semiofficial opinion on the topic. Even if people aren’t lying, everyday speech is full of ambiguity, vagueness, situational context, sarcasm, elliptical speech, and other linguistic complexities that may obscure the full truth of what they’re trying to say. 

What we truly want from social listening is what we simply aren’t getting. What we’re actually getting is a blizzard of aggregated social metric data that measure any or all of the following (a toy rollup of such metrics is sketched after the list):

  • Social buzz: Many listening tools specialize in measuring aggregated social buzz by keywords, topics, hashtags, and conversations. The metrics might also show how the buzz shakes out into sentiment and “share of voice” by brand. They might even show differences in the buzz across social channels, geographies, demographics, influencers, days of the week, and other such dimensions.
  • Social reach: Listening tools might help you assess the followership of your specific social channels and impressions of your social postings across geographies, demographics, influencers, and so on.
  • Social engagement: The tools might indicate the extent to which your social postings have driven shares, likes, replies, clickthroughs, and other indicators of customer involvement and sentiment with your brands, campaigns, and products.
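
As a toy illustration of the metrics above, here is a hypothetical pandas snippet; the column names and figures are invented for the example:

```python
import pandas as pd

# Hypothetical post-level social-listening data.
posts = pd.DataFrame({
    "brand":     ["Acme", "Acme", "Rival", "Rival", "Acme"],
    "channel":   ["twitter", "facebook", "twitter", "twitter", "twitter"],
    "sentiment": [0.8, -0.2, 0.1, 0.5, 0.3],  # -1 (negative) to +1
    "mentions":  [120, 45, 80, 60, 200],
    "shares":    [30, 5, 12, 9, 55],
})

# Buzz, sentiment, and engagement by brand; share of voice from mentions.
by_brand = posts.groupby("brand").agg(
    buzz=("mentions", "sum"),
    avg_sentiment=("sentiment", "mean"),
    engagement=("shares", "sum"),
)
by_brand["share_of_voice"] = by_brand["buzz"] / by_brand["buzz"].sum()
print(by_brand)
```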

When presented individually or in various visually compelling formats, those numbers can tell a wide range of stories. However, what social listening tools rarely present is a statistically validated causal narrative that we can use to predictively recalibrate our social marketing tactics. In the abstract, such a narrative might be structured as follows: “Social listening metric A showed that marketing tactic B created conditions C under which customer D expressed positive sentiment about, actually purchased, or recommended that others purchase product E under circumstances F, and that similar tactics are highly likely to lead them, or customers like them, to do so again under similar circumstances.”