This is a summary of the announcements discussed during Google Data Cloud Summit 2022. Hopefully it helps you find all the updates in one place.
- Vertex AI Workbench (GA) brings together Google Cloud’s data and ML systems in a single interface, so that teams have a common toolset across data analytics, data science, and machine learning.
  - Native integrations across BigQuery, Spark, Dataproc, and Dataplex.
  - Data scientists can build, train, and deploy ML models 5X faster than with traditional notebooks.
  - Fully managed compute with admin controls (serverless, scalable, enterprise-ready JupyterLab).
- MLOps tooling for AI
- Introducing Vertex AI Model Registry, a central repository to manage and govern the lifecycle of your ML models. It is designed to work with any type of model and deployment target, including BigQuery ML, and makes it easy to manage and deploy models.
- BigQuery ML (BQML) is added to Vertex AI Model Registry (Public Preview). With this integration, BQML users can leverage Vertex AI for MLOps. Specifically, users can:
  - Register BQML models with Model Registry
  - Deploy BQML models directly from Vertex AI Model Registry to Vertex AI deployment endpoints
- Powerful AI compute without movement of data – Serverless Spark; BQML
- Matching Engine – Vertex AI Matching Engine provides an industry-leading, high-scale, low-latency vector-similarity matching service (also known as approximate nearest neighbor), along with industry-leading algorithms to train semantic embeddings for similarity-matching use cases.
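To make the idea of vector-similarity matching concrete, here is a minimal sketch in plain Python. It is not the Matching Engine API: it runs an exact brute-force search over a tiny, made-up set of embeddings, whereas an approximate nearest neighbor service trades a little accuracy for speed at much larger scale. The item names, vectors, and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=2):
    """Exact (brute-force) search: score every item, keep the k most similar."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional embeddings; real embeddings have hundreds of dimensions.
EMBEDDINGS = {
    "red_shoes": [0.9, 0.1, 0.0],
    "crimson_sneakers": [0.8, 0.2, 0.1],
    "blue_jacket": [0.0, 0.2, 0.9],
}

top = nearest_neighbors([1.0, 0.0, 0.0], EMBEDDINGS, k=2)
for item_id, score in top:
    print(item_id, round(score, 3))
```

The brute-force scan compares the query against every stored vector, which is fine for three items but is exactly the cost that an approximate nearest neighbor index is meant to avoid.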
Data Analytics and BI
- BigLake (Preview)
  - A storage engine that unifies data lakes and warehouses by providing consistent fine-grained security and performance acceleration across multi-cloud object stores. This extends BigQuery capabilities to open formats and OSS engines.
  - Provides an API interface that lets customers query data across Google Cloud and OSS engines, unlocking new use cases without building new infrastructure.
- Analytics Hub (Preview): a fully managed service built on BigQuery that allows data sharing partners to efficiently and securely exchange valuable data and analytics assets across any organizational boundary.
- BigQuery Storage Read API (GA): allows OSS engines to read data from BigQuery tables.
- BigQuery BI Engine SQL Interface (GA): BigQuery BI Engine is a fast, in-memory analysis service that lets users analyze data stored in BigQuery with sub-second query response times and high concurrency. It is compatible with any BI tool that uses BigQuery.
- Log Analytics capabilities in BigQuery (Preview) – Log Analytics gives you the analytical power of BigQuery directly in Cloud Logging, with a new user interface that’s optimized for analyzing log data. With Log Analytics, you can use SQL to perform advanced log analysis.
- Native JSON (Preview) – By using the JSON data type, you can ingest semi-structured JSON into BigQuery without providing a schema for the JSON data upfront. This lets you store and query data that doesn’t always adhere to fixed schemas and data types.
- Search (Preview) – Fully managed search indexes let you pinpoint specific text in data of any size across multiple columns and semi-structured datasets using SQL
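The Native JSON item above is about storing records whose shape varies from row to row and still being able to query them. The following is a minimal Python sketch of that idea only; it does not use BigQuery or its JSON functions. The sample rows and the `json_value` helper are hypothetical, and the helper simply returns `None` when a path is missing instead of failing on a schema mismatch.

```python
import json

# Hypothetical semi-structured rows: no two share exactly the same fields.
RAW_ROWS = [
    '{"user": {"id": 1, "plan": "pro"}, "tags": ["a", "b"]}',
    '{"user": {"id": 2}}',
    '{"event": "login"}',
]


def json_value(doc, path):
    """Walk a list of keys into a parsed JSON object; return None if any key is absent."""
    current = doc
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Extract one field from every row without declaring a schema up front.
plans = [json_value(json.loads(row), ["user", "plan"]) for row in RAW_ROWS]
print(plans)
```

Rows that lack the requested path yield `None` rather than an error, which mirrors the point of the announcement: the data does not have to adhere to a fixed schema to be stored and queried.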
Spark on Google Cloud
- Industry-first serverless Spark for all workloads (GA): auto-scale Spark without any manual infrastructure provisioning or tuning. This lets customers shift their focus from managing clusters to managing workloads.
- Pervasive Spark experience (Preview): connect, analyze, and execute Spark jobs from BigQuery, Vertex AI, or Dataplex in 2 clicks, without any custom integrations, using the best of Google-native and open source tools.
- Flexibility of consumption (GA): one size does not fit all. Choose between serverless, Google Kubernetes Engine (GKE), and compute clusters for your Spark applications.
- Dataflow Prime (GA): gain real-time insights and take action with an industry-leading serverless architecture. This update enables key features such as Vertical Autoscaling, Right Fitting, and Smart Diagnostics.
Security and Data Governance
- Dataplex (GA) – unified platform for data management and data governance including features like Catalog, Quality control, Classification, Security, and Lifecycle management.
- Data lineage (Preview) – Record, visualize, and understand the relationships between data assets based on the flow of data, mapping out how data assets are sourced and transformed. This will be part of Dataplex.
- Build a secure data warehouse with the new security blueprint
- Data Studio has been moved into the Google Cloud portfolio (from Google Ads) and is now part of the Looker team. It will remain free; the move strategically links it to Google Cloud’s overall BI strategy, under which it will be further developed.
- Looker + Data Studio integration (GA): enables Google Data Studio users to read Looker’s semantic model.
- Connected Sheets for Looker (Preview) – enables Google Sheets to read Looker’s semantic model (for Looker Enterprise customers).
- Better data blending in Data Studio – H1 2022
- New filter shelf in Data Studio – H1 2022
- Looker for Google Slides – GA – H2 2022
- Google Maps in Looker – GA – H2 2022
- Ask Looker – Preview Q2 2022
- Anomaly and Insight Detection – Preview H2 2022
- Improved Looker mobile experience – GA H2 2022
- Looker for Tableau and Power BI – Preview H2 2022
- Looker Marketplace Developer Program – Preview Q1 2022
- Looker integration with data workflow orchestration tools like Apache Airflow or Cloud Composer – GA Q1 2022
- VPC Service Controls – GA Q2 2022
Data Analytics Accelerators
- There were many new updates on Data Analytics Accelerators; the program includes 4 pillars:
- Cloud SQL Insights for MySQL (Coming Soon) – this will provide capabilities similar to those already available in Cloud SQL for PostgreSQL.
- Cloud Spanner change streams (Coming Soon) – this allows you to replicate database changes in real time and enables easy access and integration with your Analytics Platform.
- Modernize your Oracle workloads to PostgreSQL with Database Migration Service (Preview)
- Migrate from Apache HBase to Cloud Bigtable with Live Migrations (GA)
- Google Cloud Cortex framework: Data foundation for SAP & beyond