Frequently asked questions#
This page collects common questions about using Astronomer Cosmos.
Can I run a combination of dbt Core and dbt Cloud in the same Apache Airflow® deployment?#
Yes. dbt Core (via Cosmos) and dbt Cloud (now dbt Platform) can run side by side in the same Apache Airflow® deployment without conflict.
dbt Core is orchestrated with Cosmos, typically via DbtDag or DbtTaskGroup instances.
dbt Cloud is orchestrated by Airflow's official apache-airflow-providers-dbt-cloud provider (for example, DbtCloudRunJobOperator and DbtCloudJobRunSensor), which triggers and monitors jobs defined in dbt Cloud.
This gives you full flexibility:
Different DAGs: Cosmos-rendered DAGs for dbt Core projects and plain Airflow DAGs using dbt Cloud operators for dbt Platform projects, all scheduled in the same Airflow deployment.
The same DAG: mix and match — for example, a dbt Cloud job triggers first via DbtCloudRunJobOperator, then a Cosmos DbtTaskGroup runs a downstream dbt Core project against the resulting tables (or vice versa).
Data-aware handoffs: a Cosmos task can emit an Airflow Dataset (Airflow 2) or Asset (Airflow 3) on completion that triggers a dbt Cloud DAG, or vice versa. The underlying concept is the same; the name changed from Dataset to Asset in Airflow 3.
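As an illustrative sketch of the mixed-DAG pattern, a single DAG can trigger a dbt Cloud job and then run a downstream dbt Core project with Cosmos. The connection ID, job ID, and project path below are placeholders; fill in the ProfileConfig for your warehouse:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator
from cosmos import DbtTaskGroup, ProfileConfig, ProjectConfig

with DAG(
    dag_id="mixed_dbt",
    schedule="@daily",
    start_date=datetime(2026, 1, 1),
    catchup=False,
):
    # Trigger a job defined in dbt Cloud (dbt Platform) and wait for it.
    cloud_job = DbtCloudRunJobOperator(
        task_id="run_dbt_cloud_job",
        dbt_cloud_conn_id="dbt_cloud_default",  # illustrative connection ID
        job_id=12345,                           # illustrative job ID
        wait_for_termination=True,
    )

    # Run a downstream dbt Core project with Cosmos against the tables
    # the dbt Cloud job produced.
    core_models = DbtTaskGroup(
        group_id="run_dbt_core_project",
        project_config=ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop"),
        profile_config=ProfileConfig(...),  # fill in for your warehouse
    )

    cloud_job >> core_models
```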
Are there any Airflow 3 features that pair particularly well with Cosmos and dbt?#
Yes — two highlights:
dbt docs plugin (rebuilt for Airflow 3): Cosmos uses Airflow 3’s overhauled FastAPI plugin model and supports rendering docs for multiple dbt projects in the same Airflow UI. Requires Airflow ≥ 3.1. See Hosting Docs.
Data-aware scheduling (Datasets / Assets): Cosmos automatically emits an Airflow Dataset (Airflow 2) or Asset (Airflow 3) for each dbt model it runs, so downstream DAGs can be triggered when the model is updated — no time-based polling or cross-DAG sensors required. See Scheduling.
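For example, on Airflow 2 a downstream DAG can subscribe to the Dataset Cosmos emits for a model. The URI below is illustrative — Cosmos derives the actual URI from the target connection and the model's database, schema, and name:

```python
from datetime import datetime

from airflow import DAG
from airflow.datasets import Dataset

# Illustrative URI; check your Cosmos task's outlets for the exact value.
orders = Dataset("postgres://dbhost:5432/mydb/public/orders")

with DAG(
    dag_id="downstream_of_orders",
    schedule=[orders],  # runs whenever a Cosmos task updates the model
    start_date=datetime(2026, 1, 1),
    catchup=False,
):
    ...
```

On Airflow 3 the pattern is the same with Asset in place of Dataset.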
How can I reuse dbt artifacts across Cosmos tasks?#
The recommended pattern is to build the dbt artifacts once at deployment time and ship them alongside Cosmos, rather than regenerating them on every task run.
Run the following commands ahead of time — for example, by baking the
result into your container image, or via astro dbt deploy on Astro:
dbt deps
dbt ls
These commands produce the artifacts that Cosmos can reuse:
manifest.json — pass it to Cosmos via ProjectConfig(manifest_path=...) with LoadMode.DBT_MANIFEST to skip dbt ls at DAG parse time. See Parsing Methods.
partial_parse.msgpack — Cosmos automatically picks this up from the dbt project's target directory to speed up both DAG parsing and task execution. See Partial parsing.
dbt_packages/ — pre-installing dbt packages avoids running dbt deps on each task.
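As a sketch, the bake-at-build-time pattern could look like the following container image fragment. The base image, paths, and project name are illustrative, and dbt ls assumes a profiles.yml is available at build time:

```dockerfile
# Illustrative Dockerfile fragment: generate dbt artifacts at image build
# time so Cosmos can reuse them at DAG parse and task run time.
FROM quay.io/astronomer/astro-runtime:12.0.0

COPY dags/dbt/jaffle_shop /usr/local/airflow/dags/dbt/jaffle_shop
WORKDIR /usr/local/airflow/dags/dbt/jaffle_shop

# dbt deps populates dbt_packages/; dbt ls writes target/manifest.json
# and target/partial_parse.msgpack as side effects.
RUN dbt deps && dbt ls
```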
How do I decide which dbt models to group in a DbtDag or DbtTaskGroup?#
There isn’t a one-size-fits-all answer — the right split depends on what your team optimises for. A few useful axes to consider:
Project organisation / folder structure. If your dbt project is already organised by domain (marts/finance, marts/marketing, etc.), the lowest-friction option is to mirror that in Cosmos. RenderConfig(group_nodes_by_folder=True) automatically creates one TaskGroup per folder. This is a strong default when the project structure already reflects how the team thinks. See Render Config.
Tags and selectors. When the folder layout doesn't match ownership or scheduling needs, tag-based selection (for example, select=["tag:hourly"] or select=["tag:finance"]) gives you finer control. Creating multiple Cosmos DAGs or TaskGroups, each scoped to a selector, lets different schedules and ownership boundaries coexist cleanly. See Selecting & Excluding.
Schedule and freshness requirements. Models that need to run hourly shouldn't be tied to a daily DAG just because they live in the same folder. When cadence varies, splitting by schedule is often the clearest signal — even if it introduces some duplication in lineage.
Ownership and on-call. If different teams own different parts of the project, aligning DAG boundaries with those ownership lines simplifies failure routing, retries, and SLAs. Cosmos task callbacks can then map directly to the owning team’s alerting.
Criticality / SLAs. Isolating mission-critical models into their own DAGs (with stricter retries, alerting, and tag:prod_critical selectors) helps protect production reliability from noisier or experimental workloads.
Resource profile. Grouping heavy or long-running models together lets you assign dedicated pools, queues, or larger Kubernetes pods (in KUBERNETES or WATCHER_KUBERNETES modes) without over-provisioning the rest of the project.
Cross-project dependencies. If you're working with multiple dbt projects, Cosmos supports this natively. Treat each project as its own DAG or TaskGroup and define explicit dependencies between them, rather than forcing everything into a single mono-DAG. See Multi-project setups.
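As an illustration of tag-based splitting, two Cosmos DAGs can scope the same project to different selectors and schedules. The tags, schedules, and path are illustrative:

```python
from datetime import datetime

from cosmos import DbtDag, ProfileConfig, ProjectConfig, RenderConfig

project = ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop")
profile = ProfileConfig(...)  # fill in for your warehouse

# Hourly DAG: only models tagged "hourly".
hourly = DbtDag(
    dag_id="dbt_hourly",
    project_config=project,
    profile_config=profile,
    render_config=RenderConfig(select=["tag:hourly"]),
    schedule="@hourly",
    start_date=datetime(2026, 1, 1),
    catchup=False,
)

# Daily DAG: the finance marts, owned by a different team.
daily_finance = DbtDag(
    dag_id="dbt_finance_daily",
    project_config=project,
    profile_config=profile,
    render_config=RenderConfig(select=["tag:finance"]),
    schedule="@daily",
    start_date=datetime(2026, 1, 1),
    catchup=False,
)
```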
How can I fetch the artifacts generated by Cosmos tasks and run custom logic on top of them?#
Use Cosmos callbacks. A callback is a function Cosmos runs as part of
task execution, before the dbt target folder is cleaned up — so it has
direct access to the artifacts dbt produced (manifest.json,
run_results.json, catalog.json, sources.json, compiled SQL,
etc.).
Common things you can do from a callback:
Read
run_results.jsonto extract failing nodes, timings, or row counts.Upload artifacts to object storage (S3, GCS, Azure WASB) — Cosmos ships built-in helpers for this in
cosmos/io.py.Log or archive compiled SQL for audit or debugging.
Trigger follow-up logic such as Snowflake queries, alerts, or downstream notifications.
See Callbacks for the full callback API, built-in helpers, and end-to-end examples.
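As an illustrative sketch, a callback that extracts failing nodes from run_results.json could look like the following. The function name is hypothetical, and the signature (project directory path as the first argument) follows the Cosmos callback convention shown in its docs; adapt it to your version:

```python
import json
from pathlib import Path


def report_failures(project_dir: str, **kwargs) -> list[str]:
    """Collect the unique_ids of dbt nodes that failed in the last run.

    Cosmos invokes callbacks before the dbt target folder is cleaned up,
    so run_results.json is still available under the project directory.
    """
    run_results = Path(project_dir) / "target" / "run_results.json"
    if not run_results.exists():
        return []
    results = json.loads(run_results.read_text())
    failed = [
        r["unique_id"]
        for r in results.get("results", [])
        if r.get("status") in ("error", "fail")
    ]
    for unique_id in failed:
        print(f"dbt node failed: {unique_id}")
    return failed
```

Such a function could then be registered on a Cosmos DAG, for example via operator_args={"callback": report_failures}.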
Can my dbt command use Airflow parameters or variables computed at run time?#
Yes. Although Airflow does not render templated fields during DAG parsing,
Cosmos resolves them at task execution time. To opt in, pass the
templated values via operator_args.
The fields that support Airflow templating via
DbtDag(operator_args=...) or DbtTaskGroup(operator_args=...) are
env, vars, full_refresh, and dbt_cmd_flags. select,
selector, and exclude are also templatable when passed via
operator_args (or directly to a standalone operator instance), with
the caveat below.
Note
Templating select, selector, or exclude via
operator_args affects only the dbt command each task runs at
execution time — it does not change which Airflow tasks Cosmos
creates. The task graph is built during DAG parsing from
RenderConfig, whose own select / selector / exclude
fields are not templatable for the same reason: node selection
must complete before Airflow renders templates. In practice, every
dbt node still becomes an Airflow task; the templated selector simply
narrows what each task tells dbt to process at run time.
For the full list of template fields and caveats, see Operator arguments.
Example: passing Airflow date-aware Jinja templates as dbt vars via
DbtDag and operator_args:
from datetime import datetime
from cosmos import DbtDag, ExecutionConfig, ProfileConfig, ProjectConfig
jaffle_shop_dated = DbtDag(
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop"),
    profile_config=ProfileConfig(...),
    execution_config=ExecutionConfig(...),
    operator_args={
        "vars": {
            "run_start_date": "{{ data_interval_start | ds }}",
            "run_end_date": "{{ data_interval_end | ds }}",
        },
    },
    schedule="@daily",
    start_date=datetime(2026, 1, 1),
    catchup=False,
    dag_id="jaffle_shop_dated",
)
At task execution time, Airflow renders the templates and Cosmos forwards
the resolved values to dbt as --vars, so each run uses the
corresponding execution window.
Note
Templated vars and env are not used when Cosmos parses the DAG
with LoadMode.DBT_LS. If the values need to influence DAG rendering
(for example, to drive node selection), set them on
ProjectConfig.dbt_vars instead.
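As a minimal sketch, a value that must be available at parse time (the variable name here is illustrative) would be set statically on ProjectConfig:

```python
from cosmos import ProjectConfig

project_config = ProjectConfig(
    "/usr/local/airflow/dags/dbt/jaffle_shop",
    # Resolved at DAG parse time, so it can influence node selection
    # when Cosmos parses the project with LoadMode.DBT_LS.
    dbt_vars={"environment": "prod"},
)
```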
Note
This FAQ is a work in progress. If your question is not answered here,
please open an issue in the
GitHub repository
or ask in the #airflow-dbt channel of the Apache Airflow® Slack.