Overview
- Access engine for data in BigQuery (data warehouse) and Cloud Storage (data lake)
- Single interface/semantics irrespective of storage/format
- Expects GCS data to be structured, e.g. CSV, JSON, Parquet, ORC, Avro
Process
- Create a table in BigQuery that points to GCS as the underlying data source
- Consumers can now query GCS data through the BigQuery interface; the storage is abstracted from the consumer
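
A minimal sketch of the table-creation step, assuming the GoogleSQL `CREATE EXTERNAL TABLE ... WITH CONNECTION` form is used and a Cloud resource connection already exists. The project, dataset, connection, and bucket names are hypothetical placeholders, and `build_biglake_ddl` / `create_biglake_table` are illustrative helpers, not part of any library.

```python
def build_biglake_ddl(table: str, connection: str, fmt: str, uris: list[str]) -> str:
    """Return a CREATE EXTERNAL TABLE statement that points a table at GCS files."""
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"WITH CONNECTION `{connection}`\n"
        f"OPTIONS (format = '{fmt}', uris = [{uri_list}])"
    )


def create_biglake_table(ddl: str) -> None:
    """Run the DDL. Needs the google-cloud-bigquery package and valid credentials."""
    # Imported lazily so the DDL builder above works without the client library.
    from google.cloud import bigquery

    bigquery.Client().query(ddl).result()  # block until the DDL job finishes


# Hypothetical names, for illustration only.
ddl = build_biglake_ddl(
    table="my-project.lake.sales",
    connection="my-project.us.gcs-conn",
    fmt="PARQUET",
    uris=["gs://my-bucket/sales/*.parquet"],
)
print(ddl)
```

Once such a table exists, consumers would query `my-project.lake.sales` like any other BigQuery table, without referring to the bucket.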
Alternatives
- BigQuery External Tables
- Differences:
  - BigLake allows fine-grained access control at row/column level
  - External tables: the user performing the query needs authorization to the underlying data, e.g. the GCS bucket
  - With BigLake, access is delegated: BigLake itself holds the authorization to the underlying data, so users only need access to the table
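
To illustrate the row-level control mentioned above, here is a hedged sketch that builds a GoogleSQL `CREATE ROW ACCESS POLICY` statement. The policy name, table, grantee, and filter are hypothetical, and `build_row_policy_ddl` is an illustrative helper rather than a library function.

```python
def build_row_policy_ddl(policy: str, table: str, grantees: list[str], filter_expr: str) -> str:
    """Return a CREATE ROW ACCESS POLICY statement restricting which rows grantees see."""
    grantee_list = ", ".join(f"'{g}'" for g in grantees)
    return (
        f"CREATE ROW ACCESS POLICY {policy}\n"
        f"ON `{table}`\n"
        f"GRANT TO ({grantee_list})\n"
        f"FILTER USING ({filter_expr})"
    )


# Hypothetical example: analysts in one group only see rows for the EU region.
policy_ddl = build_row_policy_ddl(
    policy="eu_only",
    table="my-project.lake.sales",
    grantees=["group:eu-analysts@example.com"],
    filter_expr="region = 'EU'",
)
print(policy_ddl)
```

The grantees here need access to the table only, not to the GCS bucket behind it.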