BigQuery Jobs

Create, monitor, and manage BigQuery jobs. Used for load jobs from GCS.

Operations

Operation	Description
`create`	Create a load job.
`get`	Get job status.
`cancel`	Cancel a running job.
`delete`	Delete a job.

Configuration

- gcp_bigquery_job:
    name: load_from_gcs
    operation: create
    credentials_path: /etc/gcp/service-account.json
    project_id: my-project
    source_uris:
      - "gs://my-bucket/data/*.parquet"
    destination_table:
      project_id: my-project
      dataset_id: raw
      table_id: events
    source_format: parquet
    write_disposition: write_append
    autodetect: true

Fields

Field	Type	Default	Description
`name`	string	required	Task name.
`operation`	string	required	`create`, `get`, `cancel`, `delete`.
`credentials_path`	string	required	GCP service account credentials.
`project_id`	string	required	GCP project ID.
`location`	string		BigQuery location.
`source_uris`	list/template		GCS source URIs (for create).
`destination_table`	object		Target table (`project_id`, `dataset_id`, `table_id`).
`source_format`	string	`newline_delimited_json`	`parquet`, `csv`, `newline_delimited_json`, `avro`.
`write_disposition`	string	`write_append`	`write_append`, `write_truncate`, `write_empty`.
`create_disposition`	string	`create_if_needed`	`create_if_needed`, `create_never`.
`autodetect`	bool		Auto-detect schema from source.
`schema`	list		Explicit schema (list of field definitions).
`max_bad_records`	int		Max bad records before job fails.
`job_id`	string		Job ID (for get, cancel, delete). Supports templating.
`poll_interval`	duration	`5s`	Status check interval.
`max_poll_duration`	duration	`30m`	Max time to wait for completion.
`labels`	map		Job labels.
`depends_on`	list		Upstream task names.
`retry`	object		Retry configuration.