Oracle's next big business is selling your info

There’s a decent chance you’re part of Oracle’s next big business. Not selling products to you, but selling you as a product. That’s the idea behind the Oracle Data Cloud, a massive pool of information about consumers and companies.

The tech titan has put it together by tracking people across the web and buying data from a variety of sources. People who have their data included may not even know that they’ve opted in for that data collection.

There’s no big red button that someone has to click in order to be a part of the company’s data collection machine. Instead, its base of user data comes from a network of third parties. The Data Cloud is primarily fed by three types of sources: publishers like Forbes and Edmunds; retail loyalty programs; and traditional data brokers like Experian and IHS.

All of that adds up to a database of 5 billion consumer profiles, fed by 15 million data sources. Not every profile corresponds to a unique person — people can have multiple profiles — but Oracle has information on billions of people, according to Eric Roza, the vice president of Data Cloud. Using data science techniques, Oracle works to match activity from one browser to others, so companies can make sure the same ads get shown to people on their smartphones, tablets, and computers.
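The article does not say how Oracle links profiles, so the following is only a minimal sketch of one simple approach: grouping profiles that share an identifier. The profile names, the identifier strings, and the `link_profiles` helper are all hypothetical, and matching at this scale would likely rely on statistical techniques rather than exact matches.

```python
from collections import defaultdict


def link_profiles(profiles):
    """Group profile IDs that share at least one identifier.

    `profiles` maps a profile ID to a set of identifiers (hypothetical
    examples: a hashed email address or a loyalty-card number).
    """
    parent = {pid: pid for pid in profiles}

    def find(pid):
        # Walk up to the group's representative, shortening the path as we go.
        while parent[pid] != pid:
            parent[pid] = parent[parent[pid]]
            pid = parent[pid]
        return pid

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # identifier -> first profile that carried it
    for pid, identifiers in profiles.items():
        for identifier in identifiers:
            if identifier in first_seen:
                union(first_seen[identifier], pid)
            else:
                first_seen[identifier] = pid

    groups = defaultdict(set)
    for pid in profiles:
        groups[find(pid)].add(pid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical profiles: three browsers linked by shared identifiers, one not.
profiles = {
    "phone-browser": {"email:ab12"},
    "laptop-browser": {"email:ab12", "loyalty:778"},
    "tablet-browser": {"loyalty:778"},
    "other-laptop": {"email:ff90"},
}
linked = link_profiles(profiles)
```

Here the phone, laptop, and tablet profiles end up in one group because each shares an identifier with at least one of the others, which is how several profiles can collapse into a single person.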

Oracle sees Data Cloud as a key part of its future. The service is being used to help advertisers and publishers better target ads, and it’s attractive to businesses because it’s not tied to a major advertising platform like Google’s or Facebook’s.

The Data Cloud also forms the foundation of machine learning features inside other Oracle software. One of the challenges for companies doing machine learning is getting data sets that are large enough to build accurate models, and Data Cloud can help solve that problem.

But the benefits mostly accrue to Oracle’s business customers, who stand to make more money as a result of using Data Cloud-enhanced services. The boon to consumers whose data are being used is less defined.

Oracle isn’t alone in this sort of tracking. There are dozens of companies that exist for the sole purpose of collecting consumer data and then reselling that to other businesses. Google, Facebook, Microsoft, and other tech titans have made big money from accumulating customer data and using it to sell ads.

But what makes the Data Cloud different from something like Google’s ad business is that consumers might not know their behavior is being stored for resale, or how broadly it’s shared. Just because someone visits a page on Forbes doesn’t mean they’d expect that information to influence a marketing campaign on a radically different website, but that’s what the Data Cloud enables.

Partners feeding data into Oracle’s Data Cloud must agree they have user permission to collect information. But acquiring that permission is as simple as burying a few sentences deep in a privacy policy. While some policies might call out Oracle Data Cloud by name, most don’t.

“Typically, because these things are quite common practice now, there’s a more generalized statement [like] some version of ‘we use this data to inform our own advertising, and select third-party partners,’” Roza said.

Users can opt out from the data collection in a variety of ways, according to Roza. Oracle allows people to install a special cookie in each of the browsers they use to prevent tracking. Deleting the cookie or using a new browser would erase that protection, however. Some publishers may allow customers to opt out of data sharing, and advertising industry groups also support opting out.
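To see why deleting the cookie or switching browsers undoes the opt-out, here is a minimal sketch of how a cookie-based opt-out generally works. The cookie name and both functions are hypothetical and are not Oracle's actual implementation.

```python
OPT_OUT_COOKIE = "tracking_opt_out"  # hypothetical cookie name


def should_track(cookies):
    """Return False only when the browser presents the opt-out cookie."""
    return cookies.get(OPT_OUT_COOKIE) != "1"


def record_visit(log, cookies, page):
    """Append the page to the log unless this browser has opted out."""
    if should_track(cookies):
        log.append(page)
    return log


log = []

# A browser that carries the opt-out cookie is skipped.
opted_out_browser = {OPT_OUT_COOKIE: "1"}
record_visit(log, opted_out_browser, "/cars/review")

# After the cookie is deleted (or in a different browser), the preference
# is no longer visible to the tracker, so the visit is recorded again.
cleared_browser = {}
record_visit(log, cleared_browser, "/cars/review")
```

The preference lives in the browser rather than with the person, so each browser has to present the cookie on its own.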

But actually knowing whether or not you’re included in the Data Cloud is the first part of the battle. And that’s not the easiest thing to figure out. Meanwhile, Oracle is continuing to pour money into the business and tout it to customers. The company has spent billions on acquisitions to build the Data Cloud, which was created through bringing companies like BlueKai, Datalogix, and Moat into the fold.

Source: InfoWorld Big Data

Google's new cloud service eases data preparation for machine learning

One of the challenges that data scientists face when running machine learning workloads is processing information before it’s ready for use. Google unveiled a new cloud service Thursday aimed at easing that pain.

Google Cloud Dataprep will automatically detect data schemas, joins, and anomalies like missing or duplicate values, without requiring coding. After that, it will help users build a set of rules for processing the information. Those rules are then built in Apache Beam format and can be imported into products like Google’s Cloud Dataflow, which processes information as it is loaded into services like the BigQuery data warehouse.
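For a sense of what detecting missing or duplicate values involves, here is a minimal hand-written sketch using only the Python standard library. It is not Dataprep's method or API; the `profile_rows` helper and the sample data are invented for illustration.

```python
import csv
import io
from collections import Counter


def profile_rows(rows):
    """Count missing values per column and fully duplicated rows."""
    missing = Counter()
    seen = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] += 1
        seen[tuple(row.items())] += 1
    # Every repeat beyond the first copy of a row counts as a duplicate.
    duplicates = sum(count - 1 for count in seen.values() if count > 1)
    return {"missing": dict(missing), "duplicate_rows": duplicates}


# Invented sample: one blank city, one blank temperature, one repeated row.
SAMPLE = """id,city,temp_c
1,Denver,21
2,,19
1,Denver,21
3,Austin,
"""

report = profile_rows(csv.DictReader(io.StringIO(SAMPLE)))
```

Writing checks like this by hand for every new data set is the kind of work the service is meant to take over.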

While Cloud Dataprep is built to prepare data for machine learning, the system also uses machine learning itself to try to determine which rules will be most useful for customers. As of Thursday, it’s available in private beta.

BigQuery is receiving a number of enhancements as well, including a new Commercial Datasets program that’s now available in public beta. It will let users take information from AccuWeather, Dow Jones, Xignite, HouseCanary, and Remine and directly feed it into BigQuery for further processing.

BigQuery can also now query data stored in Cloud Bigtable, Google’s managed NoSQL database offering for low-latency data. That means users can write one SQL query that can tap into information from Bigtable and BigQuery. In the past, they’d have to write a program to search Bigtable.

Advertising customers will be able to send data from Google AdWords, DoubleClick Campaign Manager, DoubleClick for Publishers, and YouTube to BigQuery for further use in analytics and other big data applications. That feature may help encourage the company’s fleet of advertising customers to try Google’s cloud as it faces down Amazon and Microsoft.

Speaking of database news, the company announced that its Cloud SQL managed database offering now offers beta support for PostgreSQL in addition to MySQL.

All of the news was announced as part of Google Cloud Next, the company’s user conference for businesses and enterprises taking place in San Francisco. The announcements come alongside other news about the company’s cloud platform, including changes to pricing and support for custom runtimes in App Engine.

Source: InfoWorld Big Data

Tech luminaries team up on $27M AI ethics fund

Artificial intelligence technology is becoming an increasingly large part of our daily lives. While those developments have led to cool new features, they’ve also presented a host of potential problems, like automation displacing human jobs, and algorithms providing biased results.

Now, a team of philanthropists and tech luminaries has put together a fund that’s aimed at bringing more humanity into the AI development process. It’s called the Ethics and Governance of Artificial Intelligence Fund, and it will focus on advancing AI in the public interest.

A fund like this one matters as issues arise during AI development. The IEEE highlighted a host of potential issues with artificial intelligence systems in a recent report, and the fund appears aimed at backing solutions to several of those problems.

Its areas of focus include research into the best way to communicate the complexity of AI technology, how to design ethical intelligent systems, and ensuring that a range of constituencies is represented in the development of these new AI technologies.

The fund was kicked off with help from Omidyar Network, the investment firm created by eBay founder Pierre Omidyar; the John S. and James L. Knight Foundation; LinkedIn founder Reid Hoffman; The William and Flora Hewlett Foundation; and Jim Pallotta, founder of the Raptor Group.

“As a technologist, I’m impressed by the incredible speed at which artificial intelligence technologies are developing,” Omidyar said in a press release. “As a philanthropist and humanitarian, I’m eager to ensure that ethical considerations and the human impacts of these technologies are not overlooked.”

Hoffman, a former PayPal executive, has been active in efforts to develop AI in the public interest and has also backed OpenAI, a research organization that aims to help create AI that is as safe as possible.

The fund will work with educational institutions, including the Berkman Klein Center for Internet and Society at Harvard University and the MIT Media Lab. The fund has US $27 million to spend at this point, and more investors are expected to join in.

Source: InfoWorld Big Data

Microsoft SQL Server 2016 finally gets a release date

Database fans, start your clocks: Microsoft announced Monday that its new version of SQL Server will be out of beta and ready for commercial release on June 1. 

The news means that companies waiting to pick up SQL Server 2016 until its general availability can start planning their adoption.

SQL Server 2016 comes with a suite of new features over its predecessor, including a new Stretch Database function that allows users to store some of their data in a database on-premises and send infrequently used data to Microsoft’s Azure cloud. An application connected to a database using that feature can still see all of the data, regardless of where it is stored.

Another marquee feature is the new Always Encrypted function, which makes it possible for users to encrypt data at the column level both at rest and in memory. That’s still only scratching the surface of the software, which also supports creating mobile business intelligence dashboards and new functionality for big data applications.

Microsoft is making big data really small using DNA

Microsoft has partnered with a San Francisco-based company to encode information on synthetic DNA to test its potential as a new medium for data storage. 

Twist Bioscience will provide Microsoft with 10 million DNA strands for the purpose of encoding digital data. In other words, Microsoft is trying to figure out how the same molecules that make up humans’ genetic code can be used to encode digital information. 

While a commercial product is still years away, initial tests have shown that it’s possible to encode and recover 100 percent of digital data from synthetic DNA, said Doug Carmean, a Microsoft partner architect, in a statement.

Using DNA could allow massive amounts of data to be stored in a tiny physical footprint. Twist claims a gram of DNA could store almost a trillion gigabytes of data.
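The article does not describe the encoding Microsoft and Twist use. As a rough illustration of the basic idea, the sketch below maps every two bits of data to one of the four DNA bases and back again. The mapping and function names are assumptions made for this example; practical schemes also add redundancy and error correction, which are omitted here.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bits first."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Rebuild the original bytes from a strand produced by encode()."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


message = b"big data"
strand = encode(message)
recovered = decode(strand)
```

With this mapping each byte becomes four bases, so an eight-byte message produces a 32-base strand that decodes back to the original bytes.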