Frequently asked questions#

This page collects common questions about using Astronomer Cosmos.

Can I run a combination of dbt Core and dbt Cloud in the same Apache Airflow® deployment?#

Yes. dbt Core (via Cosmos) and dbt Cloud (now dbt Platform) can run side by side in the same Apache Airflow® deployment without conflict.

  • dbt Core is orchestrated with Cosmos, typically via DbtDag or DbtTaskGroup instances.

  • dbt Cloud is orchestrated by Airflow’s official apache-airflow-providers-dbt-cloud provider (for example, DbtCloudRunJobOperator and DbtCloudJobRunSensor), which triggers and monitors jobs defined in dbt Cloud.

This gives you full flexibility:

  • Different DAGs: Cosmos-rendered DAGs for dbt Core projects and plain Airflow DAGs using dbt Cloud operators for dbt Platform projects, all scheduled in the same Airflow deployment.

  • The same DAG: mix and match — for example, a dbt Cloud job triggers first via DbtCloudRunJobOperator, then a Cosmos DbtTaskGroup runs a downstream dbt Core project against the resulting tables (or vice versa).

  • Data-aware handoffs: a Cosmos task can emit an Airflow Dataset (Airflow 2) or Asset (Airflow 3) on completion that triggers a DAG running dbt Cloud jobs, or vice versa. The underlying concept is the same; the name changed from Dataset to Asset in Airflow 3.
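As a sketch of the "same DAG" pattern above — the connection ID, job ID, project path, and profile configuration are placeholders you would replace with your own:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator
from cosmos import DbtTaskGroup, ProfileConfig, ProjectConfig

with DAG(
    dag_id="dbt_cloud_then_core",
    schedule="@daily",
    start_date=datetime(2026, 1, 1),
    catchup=False,
):
    # Trigger a job defined in dbt Cloud and wait for it to finish.
    cloud_job = DbtCloudRunJobOperator(
        task_id="run_dbt_cloud_job",
        dbt_cloud_conn_id="dbt_cloud_default",  # placeholder connection ID
        job_id=12345,  # placeholder dbt Cloud job ID
        wait_for_termination=True,
    )

    # Run a downstream dbt Core project with Cosmos against the tables
    # the dbt Cloud job just produced.
    core_project = DbtTaskGroup(
        group_id="run_dbt_core_project",
        project_config=ProjectConfig("/usr/local/airflow/dags/dbt/downstream_project"),
        profile_config=ProfileConfig(...),  # your dbt Core profile mapping
    )

    cloud_job >> core_project
```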

Are there any Airflow 3 features that pair particularly well with Cosmos and dbt?#

Yes — two highlights:

  • dbt docs plugin (rebuilt for Airflow 3): Cosmos uses Airflow 3’s overhauled FastAPI plugin model and supports rendering docs for multiple dbt projects in the same Airflow UI. Requires Airflow ≥ 3.1. See Hosting Docs.

  • Data-aware scheduling (Datasets / Assets): Cosmos automatically emits an Airflow Dataset (Airflow 2) or Asset (Airflow 3) for each dbt model it runs, so downstream DAGs can be triggered when the model is updated — no time-based polling or cross-DAG sensors required. See Scheduling.
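As a minimal sketch of the second point, a downstream DAG scheduled on a Cosmos-emitted Asset. The URI shown is illustrative — the exact URI Cosmos emits depends on your connection details and model name (see Scheduling):

```python
from airflow.sdk import DAG, Asset  # Airflow 3; on Airflow 2 use airflow.datasets.Dataset

# Illustrative URI for a model that a Cosmos task updates.
orders_model = Asset("postgres://db-host:5432/analytics.public.orders")

with DAG(
    dag_id="downstream_of_orders",
    schedule=[orders_model],  # runs whenever the Cosmos task updates the asset
):
    ...  # downstream tasks go here
```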

How can I reuse dbt artifacts across Cosmos tasks?#

The recommended pattern is to build the dbt artifacts once at deployment time and ship them with your deployment, rather than regenerating them on every task run.

Run the following commands ahead of time — for example, by baking the results into your container image, or via astro dbt deploy on Astro:

dbt deps
dbt ls

These commands produce the artifacts that Cosmos can reuse:

  • manifest.json — pass it to Cosmos via ProjectConfig(manifest_path=...) with LoadMode.DBT_MANIFEST to skip dbt ls at DAG parse time. See Parsing Methods.

  • partial_parse.msgpack — Cosmos automatically picks this up from the dbt project’s target directory to speed up both DAG parsing and task execution. See Partial parsing.

  • dbt_packages/ — pre-installing dbt packages avoids running dbt deps on each task.
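A minimal sketch of a DAG that reuses a pre-built manifest, combining ProjectConfig(manifest_path=...) with LoadMode.DBT_MANIFEST as described above (paths are placeholders):

```python
from cosmos import DbtDag, LoadMode, ProfileConfig, ProjectConfig, RenderConfig

jaffle_shop = DbtDag(
    project_config=ProjectConfig(
        dbt_project_path="/usr/local/airflow/dags/dbt/jaffle_shop",
        # Pre-built at deploy time, so Cosmos skips `dbt ls` at parse time.
        manifest_path="/usr/local/airflow/dags/dbt/jaffle_shop/target/manifest.json",
    ),
    render_config=RenderConfig(load_method=LoadMode.DBT_MANIFEST),
    profile_config=ProfileConfig(...),  # your dbt profile mapping
    dag_id="jaffle_shop",
)
```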

How do I decide which dbt models to group in a DbtDag or DbtTaskGroup?#

There isn’t a one-size-fits-all answer — the right split depends on what your team optimises for. A few useful axes to consider:

  • Project organisation / folder structure. If your dbt project is already organised by domain (marts/finance, marts/marketing, etc.), the lowest-friction option is to mirror that in Cosmos. RenderConfig(group_nodes_by_folder=True) automatically creates one TaskGroup per folder. This is a strong default when the project structure already reflects how the team thinks. See Render Config.

  • Tags and selectors. When the folder layout doesn’t match ownership or scheduling needs, tag-based selection (for example, select=["tag:hourly"] or select=["tag:finance"]) gives you finer control. Creating multiple Cosmos DAGs or TaskGroups, each scoped to a selector, lets different schedules and ownership boundaries coexist cleanly. See Selecting & Excluding.

  • Schedule and freshness requirements. Models that need to run hourly shouldn’t be tied to a daily DAG just because they live in the same folder. When cadence varies, splitting by schedule is often the clearest signal — even if it introduces some duplication in lineage.

  • Ownership and on-call. If different teams own different parts of the project, aligning DAG boundaries with those ownership lines simplifies failure routing, retries, and SLAs. Cosmos task callbacks can then map directly to the owning team’s alerting.

  • Criticality / SLAs. Isolating mission-critical models into their own DAGs (with stricter retries, alerting, and tag:prod_critical selectors) helps protect production reliability from noisier or experimental workloads.

  • Resource profile. Grouping heavy or long-running models together lets you assign dedicated pools, queues, or larger Kubernetes pods (in KUBERNETES or WATCHER_KUBERNETES modes) without over-provisioning the rest of the project.

  • Cross-project dependencies. If you’re working with multiple dbt projects, Cosmos supports this natively. Treat each project as its own DAG or TaskGroup and define explicit dependencies between them, rather than forcing everything into a single mono-DAG. See Multi-project setups.
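To illustrate the tag-based split above, two Cosmos DAGs scoped to different selectors and schedules over the same project — the tags, schedules, and paths are placeholders:

```python
from datetime import datetime

from cosmos import DbtDag, ProfileConfig, ProjectConfig, RenderConfig

common = dict(
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/my_project"),
    profile_config=ProfileConfig(...),  # your dbt profile mapping
    start_date=datetime(2026, 1, 1),
    catchup=False,
)

# Hourly models, selected by tag, on their own schedule.
hourly = DbtDag(
    render_config=RenderConfig(select=["tag:hourly"]),
    schedule="@hourly",
    dag_id="dbt_hourly",
    **common,
)

# Finance marts owned by another team, running daily.
finance = DbtDag(
    render_config=RenderConfig(select=["tag:finance"]),
    schedule="@daily",
    dag_id="dbt_finance",
    **common,
)
```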

How can I fetch the artifacts generated by Cosmos tasks and run custom logic on top of them?#

Use Cosmos callbacks. A callback is a function Cosmos runs as part of task execution, before the dbt target folder is cleaned up — so it has direct access to the artifacts dbt produced (manifest.json, run_results.json, catalog.json, sources.json, compiled SQL, etc.).

Common things you can do from a callback:

  • Read run_results.json to extract failing nodes, timings, or row counts.

  • Upload artifacts to object storage (S3, GCS, Azure WASB) — Cosmos ships built-in helpers for this in cosmos/io.py.

  • Log or archive compiled SQL for audit or debugging.

  • Trigger follow-up logic such as Snowflake queries, alerts, or downstream notifications.

See Callbacks for the full callback API, built-in helpers, and end-to-end examples.
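As a sketch, a callback that summarises failures from run_results.json. It assumes Cosmos passes the path of the temporary dbt project directory as the first argument (see Callbacks for the exact signature); the helper itself is plain Python:

```python
import json
from pathlib import Path


def report_failures(project_dir: str, **kwargs) -> list[str]:
    """Collect the unique IDs of dbt nodes that failed or errored.

    Cosmos invokes callbacks before the target folder is cleaned up,
    so run_results.json is still present under <project_dir>/target.
    """
    run_results = Path(project_dir) / "target" / "run_results.json"
    if not run_results.exists():
        return []
    results = json.loads(run_results.read_text())
    failed = [
        r["unique_id"]
        for r in results.get("results", [])
        if r.get("status") in ("error", "fail")
    ]
    if failed:
        print(f"dbt reported failures: {failed}")
    return failed


# Attach it to every Cosmos task via operator_args, for example:
# DbtDag(..., operator_args={"callback": report_failures})
```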

Can my dbt command use Airflow parameters or variables computed at run time?#

Yes. Although Airflow does not render templated fields during DAG parsing, Cosmos resolves them at task execution time. To opt in, pass the templated values via operator_args.

The fields that support Airflow templating via DbtDag(operator_args=...) or DbtTaskGroup(operator_args=...) are env, vars, full_refresh, and dbt_cmd_flags. select, selector, and exclude are also templatable when passed via operator_args (or directly to a standalone operator instance), with the caveat below.

Note

Templating select, selector, or exclude via operator_args affects only the dbt command each task runs at execution time — it does not change which Airflow tasks Cosmos creates. The task graph is built during DAG parsing from RenderConfig, whose own select / selector / exclude fields are not templatable for the same reason: node selection must complete before Airflow renders templates. In practice, every dbt node still becomes an Airflow task; the templated selector simply narrows what each task tells dbt to process at run time.

For the full list of template fields and caveats, see Operator arguments.

Example: passing Airflow date-aware Jinja templates as dbt vars via DbtDag and operator_args:

from datetime import datetime

from cosmos import DbtDag, ExecutionConfig, ProfileConfig, ProjectConfig

jaffle_shop_dated = DbtDag(
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop"),
    profile_config=ProfileConfig(...),
    execution_config=ExecutionConfig(...),
    operator_args={
        "vars": {
            "run_start_date": "{{ data_interval_start | ds }}",
            "run_end_date": "{{ data_interval_end | ds }}",
        },
    },
    schedule="@daily",
    start_date=datetime(2026, 1, 1),
    catchup=False,
    dag_id="jaffle_shop_dated",
)

At task execution time, Airflow renders the templates and Cosmos forwards the resolved values to dbt as --vars, so each run uses the corresponding execution window.

Note

Templated vars and env are not used when Cosmos parses the DAG with LoadMode.DBT_LS. If the values need to influence DAG rendering (for example, to drive node selection), set them on ProjectConfig.dbt_vars instead.
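For instance, a static value that must be visible while Cosmos parses the project goes on ProjectConfig.dbt_vars (the variable name and path here are illustrative):

```python
from cosmos import ProjectConfig

project_config = ProjectConfig(
    dbt_project_path="/usr/local/airflow/dags/dbt/jaffle_shop",
    dbt_vars={"target_region": "eu"},  # resolved at parse time, not templated
)
```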

Note

This FAQ is a work in progress. If your question is not answered here, please open an issue in the GitHub repository or ask in the #airflow-dbt channel of the Apache Airflow® Slack.