Salesforce Bulk API

The salesforce_bulkapi_query_job task runs Salesforce Bulk API query jobs for extracting large datasets from Salesforce.

Operations

Operation    Description
create       Create a bulk query job.
get          Get job status.
get_results  Download query results as Arrow RecordBatch.
abort        Abort a running job.
delete       Delete a job.

Configuration

- salesforce_bulkapi_query_job:
    name: export_accounts
    operation: create
    credentials_path: /etc/salesforce/credentials.json
    query: "SELECT Id, Name, Industry FROM Account"

Fields

Field             Type             Default   Description
name              string           required  Task name.
operation         string           required  create, get, get_results, abort, delete.
credentials_path  string           required  Path to Salesforce credentials.
query             string/resource  -         SOQL query (for create).
query_operation   string           query     query or query_all (includes deleted/archived).
content_type      string           csv       Output format.
column_delimiter  string           comma     CSV delimiter: comma, tab, semicolon, pipe.
line_ending       string           lf        Line ending: lf or crlf.
job_id            string           -         Job ID (for get, get_results, abort, delete). Supports templating.
batch_size        int              10000     Rows per Arrow RecordBatch.
has_header        bool             true      First row is header.
depends_on        list             -         Upstream task names.
retry             object           -         Retry configuration.
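
As a rough illustration of how the optional fields combine, the sketch below creates a job that also returns deleted and archived records and requests tab-delimited output. It uses only fields and values listed in the table above. The task name export_all_contacts and the SOQL query are made up for the example, and the defaults are written out only to make them visible.

```yaml
- salesforce_bulkapi_query_job:
    name: export_all_contacts                 # hypothetical task name
    operation: create
    credentials_path: /etc/salesforce/credentials.json
    query: "SELECT Id, Email FROM Contact"    # illustrative query
    query_operation: query_all                # include deleted/archived records
    content_type: csv                         # default, shown for clarity
    column_delimiter: tab                     # one of: comma, tab, semicolon, pipe
    line_ending: lf                           # default, shown for clarity
```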

Example: Create job and get results

flow:
  name: salesforce_export
  tasks:
    - generate:
        name: trigger
        cron: "0 2 * * *"

    - salesforce_bulkapi_query_job:
        name: create_job
        operation: create
        credentials_path: /etc/salesforce/credentials.json
        query: "SELECT Id, Name, Industry FROM Account WHERE LastModifiedDate = TODAY"

    - salesforce_bulkapi_query_job:
        name: get_results
        operation: get_results
        credentials_path: /etc/salesforce/credentials.json
        job_id: "{{event.data.id}}"
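
The get, abort, and delete operations take a job_id in the same way as get_results. The fragment below is a sketch rather than a tested flow. It assumes the incoming event exposes the job ID as event.data.id, as in the example above, and the task names check_status and cleanup_job are hypothetical. The two entries are independent alternatives, not a chained sequence.

```yaml
# Check the status of an existing job.
- salesforce_bulkapi_query_job:
    name: check_status                # hypothetical task name
    operation: get
    credentials_path: /etc/salesforce/credentials.json
    job_id: "{{event.data.id}}"       # assumes the event carries the job ID here

# Delete a job that is no longer needed.
- salesforce_bulkapi_query_job:
    name: cleanup_job                 # hypothetical task name
    operation: delete
    credentials_path: /etc/salesforce/credentials.json
    job_id: "{{event.data.id}}"       # same assumption as above
```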