Google adds standard SQL support to BigQuery

Google has rolled out a new beta of BigQuery, the data warehousing and analysis service available on Google Cloud Platform, which lets users write queries in standard SQL instead of the service's own SQL-like dialect.
BigQuery’s revamped SQL dialect is intended to replace the existing dialect as the default query language once it matures. The new dialect is compliant with the 2011 SQL standard and adds extensions for querying nested and repeated data. When BigQuery first launched in 2012, it was possible to query data with the SQL-like BQL, which stands for “BigQuery Query Language”. Now Google is bringing full SQL to the service, a significant development given the pervasiveness of ANSI SQL, the lingua franca of data analysts.

Google describes BigQuery as a “fully managed, petabyte scale, low-cost analytics data warehouse”. The service is priced on a pay-as-you-use model, and Google believes the new SQL dialect will help open up the service to legions of new developers.
“If you’re familiar with the SQL standard or have used another standard-compliant SQL engine, you’ll feel right at home with the beta of BigQuery’s revamped SQL dialect,” Google technical lead and manager Dan Delorey and technical product marketing manager Bosco Zubiaga wrote in a blog post.
Here are some of the benefits of the new dialect, taken straight from Google’s post:
More advanced query planning and optimization: BigQuery now provides more robust decorrelation, which allows you to write complex subqueries in any clause of your SQL statement (SELECT, FROM, WHERE and so on).
A richer type system with fully composable types: In addition to the existing data types BigQuery users are used to, we’ve added dates, times, arrays and structs, as well as additional support for timestamps.
Extended JOIN support: BigQuery now supports Theta JOIN, which offers the ability to use inequalities in your join key comparisons, as well as arbitrary expressions as JOIN conditions.
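To make two of these benefits concrete, here is a minimal sketch of a theta JOIN (a join on an inequality rather than on equal keys) combined with a subquery in the WHERE clause. It runs against Python's built-in SQLite engine rather than BigQuery, so it only illustrates the query shape. The `orders` and `tiers` tables, their columns and the function name are hypothetical and do not come from Google's post.

```python
import sqlite3


def demo_theta_join():
    """Run a range (theta) join plus a WHERE subquery on in-memory SQLite."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data: orders, and the price tier each one falls into.
    conn.executescript(
        """
        CREATE TABLE orders (order_id INTEGER, amount REAL);
        CREATE TABLE tiers (tier TEXT, min_amount REAL, max_amount REAL);
        INSERT INTO orders VALUES (1, 15.0), (2, 120.0), (3, 640.0);
        INSERT INTO tiers VALUES
            ('small', 0, 100), ('medium', 100, 500), ('large', 500, 1000000);
        """
    )
    rows = conn.execute(
        """
        SELECT o.order_id, t.tier
        FROM orders AS o
        JOIN tiers AS t
          -- Theta join: the condition uses inequalities, not key equality.
          ON o.amount >= t.min_amount AND o.amount < t.max_amount
        -- Subquery in WHERE: keep orders above the average of sub-500 orders.
        WHERE o.amount > (SELECT AVG(amount) FROM orders WHERE amount < 500)
        ORDER BY o.order_id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo_theta_join())
```

The same JOIN condition and subquery are ordinary standard SQL, which is the kind of statement the new dialect is meant to accept. BigQuery-specific syntax and behavior may differ from SQLite's.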

Google’s blog post took pains to say that the new dialect is a beta release, and needs time to mature before it can be used in real-world cases.
“While we think the updated dialect is a wonderful addition, there’s no requirement that users switch, and for production use cases, we recommend users remain on the legacy SQL dialect,” the authors said. “After we have a few more miles on the new dialect, we plan to launch it to general availability and recommend it as the default language for all projects.”
Google provided no time frame for when that might happen, but it did reveal several other new features that developers can try out. The highlight is enhanced identity and access management (IAM), which is also in beta.
“Now that BigQuery supports Standard SQL, you’ll have more and more teams in your company requesting access to BigQuery projects,” Delorey and Zubiaga said. “Earlier this year we announced Cloud IAM for Cloud Platform. We’re now making IAM available for BigQuery as well, in beta. This feature is currently being rolled out, so if you don’t see it today you can expect it enabled on your projects this month — BigQuery roles will be made available in Google Cloud’s ‘IAM & Admin’ control panel.”
A second new feature is time-based table partitioning, which, according to Google, “makes it easy and cost-effective for you to manage your data and write queries that span multiple days, months or years. You can now create tables with time-based partitions – you load the data, and BigQuery will automatically put it in the right partition.”
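The idea behind time-based partitioning can be sketched in a few lines of Python. This is a conceptual illustration only, not how BigQuery implements the feature: it assumes each row carries a hypothetical `ts` timestamp field and that partitions are daily, and both function names are made up for the example.

```python
from collections import defaultdict
from datetime import date, datetime


def load_into_partitions(rows):
    """Route each row to a daily partition keyed by the date of its timestamp."""
    partitions = defaultdict(list)
    for row in rows:
        # Assumption: every row has a datetime under the key "ts".
        partitions[row["ts"].date()].append(row)
    return partitions


def query_range(partitions, start, end):
    """Return rows from partitions in [start, end], skipping all other days."""
    return [
        row
        for day in sorted(partitions)
        if start <= day <= end
        for row in partitions[day]
    ]


if __name__ == "__main__":
    rows = [
        {"ts": datetime(2016, 6, 1, 9, 30), "event": "a"},
        {"ts": datetime(2016, 6, 2, 14, 0), "event": "b"},
        {"ts": datetime(2016, 6, 5, 8, 15), "event": "c"},
    ]
    parts = load_into_partitions(rows)
    for row in query_range(parts, date(2016, 6, 1), date(2016, 6, 2)):
        print(row["event"])
```

The point of the design is that a query covering a date range only needs to touch the partitions for those days, which is where the cost savings Google describes would come from.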
Google says that taken together, the new features make the service more compatible with traditional Big Data workflows.
