Orchestration

EDP needs orchestration for connector jobs, scheduled ingestion, transformation runs, data quality checks, publication workflows, dashboard refresh dependencies, and recovery after failures.

The recommended starting point is Apache Airflow paired with dbt.

Primary Recommendation

Use Apache Airflow as the default orchestration tool for early EDP implementations.

Airflow is a strong fit for scheduled, batch-oriented workflows with clear dependencies. It works well for recurring API pulls, raw data landing, ODS refreshes, Data Vault loads, Data Mart builds, quality checks, and operational alerting.

Pair Airflow with dbt for transformation logic, model testing, lineage, and documentation. In this pattern, Airflow coordinates when work happens, while dbt owns how analytical models are built and validated.

Suggested Tool Mapping

Use Case	Recommended Tool
Primary EDP pipeline orchestration	Apache Airflow
Data transformations, tests, and lineage	dbt
Data asset modeling and lineage-heavy orchestration	Dagster
Lightweight Python-first workflows	Prefect
Long-running durable business workflows	Temporal
Kubernetes-native scheduled jobs	Kubernetes CronJob or Argo Workflows

Airflow Responsibilities

Use Airflow for:

Connector job scheduling
Source API extraction workflows
Raw data landing
ODS refreshes
Data Vault loads
Data Mart builds
dbt command orchestration
Data quality checks
Dashboard refresh dependencies
Failure alerting and retry handling
Operational audit records for pipeline runs

dbt Responsibilities

Use dbt for:

SQL transformations
ODS, Data Vault, and Data Mart model definitions
Model tests and data quality assertions
Documentation for models and columns
Lineage between source data and published data products
Repeatable builds across development, test, and production environments

When to Consider Dagster

Consider Dagster when the platform becomes strongly asset-centric. Dagster is compelling when the main mental model is maintaining reliable data assets such as tables, marts, reports, machine learning features, or published datasets.

Dagster may be a better fit when lineage, freshness, asset checks, and data product ownership become more important than traditional scheduled job chains.

When to Consider Prefect

Consider Prefect for smaller Python-first workflows that need less scheduler ceremony or more dynamic runtime behavior than a traditional DAG. It can be useful for lightweight automation and developer-friendly operational scripts.

When to Consider Temporal

Temporal is best reserved for durable business workflows rather than normal data pipelines.

Use Temporal when workflows are long-running, stateful, event-driven, and need strong recovery guarantees. Examples might include multi-step approval processes, remediation workflows, provisioning lifecycles, or operational applications where the workflow itself is part of the business process.

Kubernetes-Native Options

Kubernetes CronJobs can handle simple scheduled container tasks. Argo Workflows can be useful for Kubernetes-native workflow execution.

These tools are helpful for infrastructure-level jobs, but they should not replace the main EDP orchestration layer unless the platform deliberately chooses a Kubernetes-native workflow model.

Starting Architecture

Start with:

Airflow for orchestration
dbt for transformations and model tests
PostgreSQL for Airflow metadata and EDP persistence
Containerized workers for connector and transformation workloads

As the platform matures, consider Kubernetes-based execution for scalable workers and better workload isolation.

Selection Guidance

Start with Airflow plus dbt when the main need is reliable scheduled data movement and transformation.

Choose Dagster when the platform is organized primarily around data assets, freshness, and lineage.

Choose Prefect when workflows are mostly Python functions and developer speed matters more than formal DAG governance.

Choose Temporal when the workflow is a durable operational process, not just a data pipeline.

Orchestration ​

Primary Recommendation ​

Suggested Tool Mapping ​

Airflow Responsibilities ​

dbt Responsibilities ​

When to Consider Dagster ​

When to Consider Prefect ​

When to Consider Temporal ​

Kubernetes-Native Options ​

Starting Architecture ​

Selection Guidance ​