Google has now added machine learning (ML) capabilities to its Google BigQuery, the company’s petabyte (PB)-scale cloud database offering. Now dubbed BigQuery ML, the new version lets you use simple Structured Query Language (SQL) statements to build and deploy ML models for predictive analytics.
That’s not just good news for data scientists who use Google. It’s also good for business operators interested in advancing their data analytics capabilities because it adds one more effective competitor to a rather small list of vendors capable of delivering this level of sophistication via the cloud. The other two most well-known names are Amazon’s Relational Database Service and Microsoft’s Azure SQL, and you can find more in our recent cloud database service roundup.
The bane of all data product vendors and buyers has always been the skills gap. That’s been especially true for those interested in ML and predictive analytics, since these disciplines often require knowledge of new technologies and querying languages.
“For every one data scientist, there are hundreds of analysts working with data, and most using SQL,” Sudhir Hasbe, Director of Product Management at Google Cloud, told PCMag. Something had to give if the power of an army of data analysts was to be uncorked from the bottleneck created by too few and too overworked data scientists.
Google’s answer to this dilemma is nothing short of remarkable. While ML is a hot trend and showing up in products of all kinds everywhere, it’s still firmly data scientist territory. Plenty of vendors have made headway into simplifying the technology, but the ugly truth is, you can simplify it by a lot and it’s still too difficult for more than 99 percent of the human population to use. Yet, we need to be able to use it because ML can do more, and do it faster than a group of super-smart humans can.
Google is planting ML inside Google BigQuery so that it resides closer to the data. BigQuery ML should deliver results faster than traditional ML approaches, in part because the analytics can be performed at the source. Now in beta, BigQuery ML enables analysts (and data scientists) to run predictive analytics, such as forecasting sales and creating customer segments, right on top of the data where it is stored. That alone is a respectable and notable upgrade.
However, Google went further than that by adding a capability that enables data analysts to use simple SQL statements to build and deploy ML models. Right now, the options are linear regression and logistic regression models for predictive analysis, as those are the two models most commonly used. Google plans to add more ML options to this capability over time, according to Hasbe. “We need to hear from our customers on which models they want us to add so that we’re providing the most useful ones first,” he said.
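To give a sense of what those SQL statements look like, here is a minimal sketch in Python that only assembles the query text; it does not connect to Google BigQuery. The dataset, table, and column names are hypothetical, and the helper functions are our own illustration rather than part of any Google library. The statements follow the BigQuery ML pattern of `CREATE MODEL` with a `model_type` of `linear_reg` or `logistic_reg`, followed by `ML.PREDICT` to score new rows.

```python
# Illustrative only: builds BigQuery ML statements as strings.
# All dataset, table, and column names below are hypothetical.

SUPPORTED_MODEL_TYPES = ("linear_reg", "logistic_reg")


def create_model_sql(model, model_type, label, source_table, features):
    """Return a CREATE MODEL statement that trains on the given columns."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"unsupported model type: {model_type}")
    # The training query selects the feature columns plus the label column.
    columns = ", ".join(list(features) + [label])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}'])\n"
        f"AS SELECT {columns} FROM `{source_table}`"
    )


def predict_sql(model, source_table, features):
    """Return an ML.PREDICT query that scores rows from source_table."""
    columns = ", ".join(features)
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT {columns} FROM `{source_table}`))"
    )


if __name__ == "__main__":
    features = ["region", "units_sold"]
    print(create_model_sql("mydataset.sales_model", "linear_reg",
                           "revenue", "mydataset.sales_history", features))
    print()
    print(predict_sql("mydataset.sales_model", "mydataset.sales_new", features))
```

An analyst could paste the printed statements into the BigQuery query editor. The point of the sketch is that training and prediction are both ordinary queries, which is why the feature is within reach of anyone who already writes SQL.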
Additional Google BigQuery Upgrades
Topping the substantial list of upgrades after ML are a clustering capability, BigQuery Geographic Information Systems (BigQuery GIS), and a new Google Sheets data connector.
Clustering is also in beta, and enables the creation of clustered tables in a data optimization move that bunches rows with similar cluster keys together. This reduces costs since it improves performance and enables Google BigQuery to charge the user only for the data scanned rather than the entire table or partition.
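As a rough illustration of how a clustered table is declared, the Python sketch below assembles a `CREATE TABLE` statement with a `CLUSTER BY` clause and a query that filters on a cluster key. It only builds text and never contacts Google BigQuery. The table and column names are hypothetical, the partition column is assumed to be a TIMESTAMP, and pairing `CLUSTER BY` with `PARTITION BY` is our choice for the example, not a detail from Google's announcement.

```python
# Illustrative only: builds clustered-table DDL and a query as strings.
# Table and column names are hypothetical.


def clustered_table_sql(table, source_table, partition_column, cluster_columns):
    """Return DDL that creates a partitioned table clustered by the given keys."""
    if not cluster_columns:
        raise ValueError("at least one cluster column is required")
    keys = ", ".join(cluster_columns)
    return (
        f"CREATE TABLE `{table}`\n"
        f"PARTITION BY DATE({partition_column})\n"
        f"CLUSTER BY {keys}\n"
        f"AS SELECT * FROM `{source_table}`"
    )


def filtered_query_sql(table, cluster_column, value):
    """Return a query that filters on a cluster key, so fewer rows are scanned."""
    return f"SELECT * FROM `{table}` WHERE {cluster_column} = '{value}'"


if __name__ == "__main__":
    print(clustered_table_sql("mydataset.orders_clustered", "mydataset.orders",
                              "order_ts", ["customer_id", "product_id"]))
    print()
    print(filtered_query_sql("mydataset.orders_clustered", "customer_id", "C-1001"))
```

Because rows with the same cluster key values are stored together, a query that filters on `customer_id` can skip the blocks that do not contain it, which is where the savings on scanned data come from.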
BigQuery GIS is currently in alpha, and is used for geospatial data analysis. While the Google Cloud team partnered with Google Earth Engine to build BigQuery GIS, you have to bring your own geospatial data to the table. That’s not a problem across several industries, including connected car systems, the Internet of Things (IoT), manufacturing, retail, smart cities, and telematics, not to mention government agencies ranging from the Environmental Protection Agency (EPA) and the National Geospatial-Intelligence Agency to the National Oceanic and Atmospheric Administration (NOAA) and all of the military branches.
BigQuery GIS uses the S2 library, which now has over a billion users through a variety of products such as Google Earth Engine and Google Maps. If you need more geospatial data, then the federal government shares an immense amount of it on GeoPlatform.
A new Google Sheets data connector is likely to delight many data analysts simply because it’s so practical for daily use. You can access Google BigQuery from within the Google Sheets spreadsheet program and use Google Sheets tools such as Explore, which is a combined collaboration, data visualization, and natural language querying tool.
Google BigQuery now has a new user interface (UI) in beta, too. One of the more interesting elements is one-click visualization functionality, which Google Data Studio supports. All told, it’s a great round of upgrades for an already elegant service. These upgrades will be tested in the next round of PCMag’s Database-as-a-Service (DBaaS) solution reviews, after the bugs are worked out, and the products have moved beyond their respective alpha and beta statuses.