Data Warehousing with BigQuery: Storage Design, Query Optimization, and Administration (DWBQ-SDQA) – Outline
Detailed Course Outline
Module 1 - BigQuery Architecture Fundamentals
Topics
- Introduction
- BigQuery Core Infrastructure
- BigQuery Storage
- BigQuery Query Processing
- BigQuery Data Shuffling
Objectives
- Explain the benefits of columnar storage.
- Understand how BigQuery processes data.
- Explore the basics of BigQuery’s shuffling service to improve query efficiency.
Activities
Module 2 - Storage and Schema Optimizations
Topics
- BigQuery Storage
- Partitioning and Clustering
- Nested and Repeated Fields
- ARRAY and STRUCT syntax
- Best Practices
Objectives
- Compare the performance of different schemas (snowflake, denormalized, and nested and repeated fields).
- Partition and cluster data for better performance
- Improve schema design using nested and repeated fields.
- Describe additional best practices such as table and partition expiration
Activities
Module 3 - Ingesting Data
Topics
- Data Ingestion Options
- Batch Ingestion
- Streaming Ingestion
- Legacy Streaming API
- BigQuery Storage Write API
- Query Materialization
- Query External Data Sources
- Data Transfer Service
Objectives
- Ingest batch and streaming data.
- Query external data sources.
- Schedule data transfers.
- Understand how to use Storage Write API.
Activities
Module 4 - Changing Data
Topics
- Managing Change in Data Warehouses
- Handling Slowly Changing Dimensions (SCD)
- DML statements
- DML Best Practices and Common Issues
Objectives
- Write DML statements.
- Address common DML performance problems and bottlenecks.
- Identify slowly changing dimensions (SCD) in your data and make updates.
Module 5 - Improving Read Performance
Topics
- BigQuery’s Cache
- Materialized Views
- BI Engine
- High Throughput Reads
- BigQuery Storage Read API
Objectives
- Explore BigQuery’s cache.
- Create materialized views.
- Work with BI Engine to accelerate your SQL queries.
- Use the Storage Read API for fast access to BigQuery-managed storage.
- Explain the caveats of using external data sources.
Activities
Module 6 - Optimizing and Troubleshooting Queries
Topics
- Simple Query Execution
- SELECTs and Aggregation
- JOINs and Skewed JOINs
- Filtering and Ordering
- Best Practices for Functions
Objectives
- Interpret BigQuery execution details and the query plan.
- Optimize query performance by using suggested methods for SQL statements and clauses.
- Demonstrate best practices for functions in business use cases.
Activities
Module 7 - Workload Management and Pricing
Topics
- BigQuery Slots
- Pricing Models and Estimates
- Slot Reservations
- Controlling Costs
Objectives
- Define a BigQuery slot.
- Explain pricing models and pricing estimations (BigQuery UI, bq dry_run, jobs API).
- Understand slot reservations, commitments, and assignments.
- Identify best practices to control costs.
Activities
Module 8 - Logging and Monitoring
Topics
- Cloud Monitoring
- BigQuery Admin Panel
- Cloud Audit Logs
- INFORMATION_SCHEMA
- Query Path and Common Errors
Objectives
- Use Cloud Monitoring to view BigQuery metrics.
- Explore the BigQuery admin panel.
- Use Cloud Audit logs.
- Work with INFORMATION_SCHEMA tables to get insights for your BigQuery entities.
Activities
Module 9 - Security in BigQuery
Topics
- Secure Resources with IAM
- Authorized Views
- Secure Data with Classification
- Encryption
- Data Discovery and Governance
Objectives
- Explore data discovery using Data Catalog.
- Discuss data governance using DLP API and Data Catalog.
- Create IAM policies (e.g., authorized views) to secure resources.
- Secure data with classifications (e.g., row-level policies).
- Understand how BigQuery uses encryption.
Activities
Module 10 - Automating Workloads
Topics
- Scheduling Queries
- Scripting
- Stored Procedures
- Integration with Big Data Products
Objectives
- Schedule queries.
- Use scripting and stored procedures to build custom transformations.
- Describe how to integrate BigQuery workloads with other Google Cloud big data products.
Activities
Module 11 - Machine Learning in BigQuery
Topics
- Introduction to BigQuery ML
- How to Make Predictions with BigQuery ML
- How to Build and Deploy a Recommendation System with BigQuery ML
- How to Build and Deploy a Demand Forecasting Solution with BigQuery ML
- Time-Series Models with BigQuery ML
- BigQuery ML Explainability
Objectives
- Describe some of the different applications of BigQuery ML.
- Build and deploy several categories of machine learning models with BigQuery ML.
- Use AutoML Tables to solve high-value business problems.
Activities