.. _faq:

Frequently asked questions
==========================

This page collects common questions about using Astronomer Cosmos.
Can I run a combination of dbt Core and dbt Cloud in the same Apache Airflow® deployment?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes. dbt Core (via Cosmos) and dbt Cloud (now dbt Platform) can run side by
side in the same `Apache Airflow® <https://airflow.apache.org/>`_ deployment
without conflict.

- **dbt Core** is orchestrated with Cosmos, typically via ``DbtDag`` or
  ``DbtTaskGroup`` instances.
- **dbt Cloud** is orchestrated by Airflow's official
  `apache-airflow-providers-dbt-cloud <https://airflow.apache.org/docs/apache-airflow-providers-dbt-cloud/stable/>`_
  provider (for example, ``DbtCloudRunJobOperator`` and
  ``DbtCloudJobRunSensor``), which triggers and monitors jobs defined in
  dbt Cloud.
This gives you full flexibility:

- **Different DAGs**: Cosmos-rendered DAGs for dbt Core projects and plain
  Airflow DAGs using dbt Cloud operators for dbt Platform projects, all
  scheduled in the same Airflow deployment.
- **The same DAG**: mix and match — for example, a dbt Cloud job triggers
  first via ``DbtCloudRunJobOperator``, then a Cosmos ``DbtTaskGroup`` runs
  a downstream dbt Core project against the resulting tables (or
  vice-versa).
- **Data-aware handoffs**: a Cosmos task can emit an Airflow Dataset
  (Airflow 2) or Asset (Airflow 3) on completion that triggers a dbt Cloud
  DAG, or vice-versa. The underlying concept is the same; the name changed
  from Dataset to Asset in Airflow 3.
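For instance, the "same DAG" pattern can be sketched as follows. The
connection ID, job ID, and project path below are placeholders, and
``ProfileConfig(...)`` stands in for your warehouse profile:

.. code-block:: python

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator
    from cosmos import DbtTaskGroup, ProfileConfig, ProjectConfig

    with DAG(
        dag_id="cloud_then_core",
        schedule="@daily",
        start_date=datetime(2026, 1, 1),
        catchup=False,
    ) as dag:
        # Trigger the dbt Cloud (dbt Platform) job and wait for it to finish.
        cloud_job = DbtCloudRunJobOperator(
            task_id="run_cloud_job",
            dbt_cloud_conn_id="dbt_cloud_default",  # placeholder connection ID
            job_id=12345,  # placeholder dbt Cloud job ID
            wait_for_termination=True,
        )

        # Then run a downstream dbt Core project with Cosmos.
        core_models = DbtTaskGroup(
            group_id="run_core_project",
            project_config=ProjectConfig("/usr/local/airflow/dags/dbt/downstream_project"),
            profile_config=ProfileConfig(...),  # your warehouse profile
        )

        cloud_job >> core_models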
Are there any Airflow 3 features that pair particularly well with Cosmos and dbt?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes — two highlights:

- **dbt docs plugin (rebuilt for Airflow 3)**: Cosmos uses Airflow 3's
  overhauled FastAPI plugin model and supports rendering docs for multiple
  dbt projects in the same Airflow UI. Requires Airflow ≥ 3.1. See
  :doc:`guides/dbt_docs/hosting-docs`.
- **Data-aware scheduling (Datasets / Assets)**: Cosmos automatically
  emits an Airflow Dataset (Airflow 2) or Asset (Airflow 3) for each dbt
  model it runs, so downstream DAGs can be triggered when the model is
  updated — no time-based polling or cross-DAG sensors required. See
  :doc:`guides/run_dbt/customization/scheduling`.
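As a sketch of data-aware scheduling on Airflow 3, a downstream DAG can
subscribe to the Asset Cosmos emits for a model. The Asset URI below is a
placeholder; the real value is derived from your profile's connection
details and the model name, as described in the scheduling guide:

.. code-block:: python

    # Airflow 3 imports; on Airflow 2, use airflow.datasets.Dataset instead.
    from airflow.sdk import Asset, dag, task

    # Placeholder URI for the Asset a Cosmos task emits for one dbt model.
    orders_asset = Asset("postgres://db-host:5432/jaffle_shop.public.orders")


    @dag(schedule=[orders_asset])
    def refresh_dashboards():
        @task
        def rebuild():
            ...  # runs whenever the orders model is updated

        rebuild()


    refresh_dashboards()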
How can I reuse dbt artifacts across Cosmos tasks?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The recommended pattern is to build the dbt artifacts **once** at
deployment time and ship them alongside Cosmos, rather than regenerating
them on every task run.

Run the following commands ahead of time — for example, by baking the
result into your container image, or via ``astro dbt deploy`` on Astro:

.. code-block:: bash

    dbt deps
    dbt ls

These commands produce the artifacts that Cosmos can reuse:

- ``manifest.json`` — pass it to Cosmos via
  ``ProjectConfig(manifest_path=...)`` with ``LoadMode.DBT_MANIFEST`` to
  skip ``dbt ls`` at DAG parse time. See
  :doc:`guides/translate_dbt_to_airflow/parsing-methods`.
- ``partial_parse.msgpack`` — Cosmos automatically picks this up from the
  dbt project's ``target`` directory to speed up both DAG parsing and task
  execution. See :doc:`guides/run_dbt/customization/partial-parsing`.
- ``dbt_packages/`` — pre-installing dbt packages avoids running
  ``dbt deps`` on each task.
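Putting the pieces together, a DAG that reuses a pre-built
``manifest.json`` might look like the following sketch. The paths are
illustrative, and ``ProfileConfig(...)`` stands in for your profile:

.. code-block:: python

    from cosmos import DbtDag, LoadMode, ProfileConfig, ProjectConfig, RenderConfig

    jaffle_shop = DbtDag(
        project_config=ProjectConfig(
            dbt_project_path="/usr/local/airflow/dags/dbt/jaffle_shop",
            manifest_path="/usr/local/airflow/dags/dbt/jaffle_shop/target/manifest.json",
        ),
        profile_config=ProfileConfig(...),
        # Build the task graph from the shipped manifest, skipping dbt ls.
        render_config=RenderConfig(load_method=LoadMode.DBT_MANIFEST),
        dag_id="jaffle_shop_from_manifest",
    )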
How do I decide which dbt models to group in a ``DbtDag`` or ``DbtTaskGroup``?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There isn't a one-size-fits-all answer — the right split depends on what
your team optimises for. A few useful axes to consider:

- **Project organisation / folder structure.** If your dbt project is
  already organised by domain (``marts/finance``, ``marts/marketing``,
  etc.), the lowest-friction option is to mirror that in Cosmos.
  ``RenderConfig(group_nodes_by_folder=True)`` automatically creates one
  TaskGroup per folder. This is a strong default when the project
  structure already reflects how the team thinks. See
  :doc:`guides/translate_dbt_to_airflow/render-config`.
- **Tags and selectors.** When the folder layout doesn't match ownership
  or scheduling needs, tag-based selection (for example,
  ``select=["tag:hourly"]`` or ``select=["tag:finance"]``) gives you finer
  control. Creating multiple Cosmos DAGs or TaskGroups, each scoped to a
  selector, lets different schedules and ownership boundaries coexist
  cleanly. See :doc:`guides/translate_dbt_to_airflow/selecting-excluding`.
- **Schedule and freshness requirements.** Models that need to run hourly
  shouldn't be tied to a daily DAG just because they live in the same
  folder. When cadence varies, splitting by schedule is often the clearest
  signal — even if it introduces some duplication in lineage.
- **Ownership and on-call.** If different teams own different parts of
  the project, aligning DAG boundaries with those ownership lines
  simplifies failure routing, retries, and SLAs. Cosmos
  :doc:`task callbacks <guides/run_dbt/callbacks/callbacks>` can then map
  directly to the owning team's alerting.
- **Criticality / SLAs.** Isolating mission-critical models into their
  own DAGs (with stricter retries, alerting, and ``tag:prod_critical``
  selectors) helps protect production reliability from noisier or
  experimental workloads.
- **Resource profile.** Grouping heavy or long-running models together
  lets you assign dedicated pools, queues, or larger Kubernetes pods (in
  ``KUBERNETES`` or ``WATCHER_KUBERNETES`` modes) without over-provisioning
  the rest of the project.
- **Cross-project dependencies.** If you're working with multiple dbt
  projects, Cosmos supports this natively. Treat each project as its own
  DAG or TaskGroup and define explicit dependencies between them, rather
  than forcing everything into a single mono-DAG. See
  :doc:`guides/multi_project/multi-project`.
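As one illustration of splitting by schedule and tags, two Cosmos DAGs can
be scoped to different selectors over the same project. The tag name and
project path are placeholders, and ``ProfileConfig(...)`` stands in for
your profile:

.. code-block:: python

    from cosmos import DbtDag, ProfileConfig, ProjectConfig, RenderConfig

    # Only the models tagged "hourly", on an hourly cadence.
    hourly_models = DbtDag(
        project_config=ProjectConfig("/usr/local/airflow/dags/dbt/analytics"),
        profile_config=ProfileConfig(...),
        render_config=RenderConfig(select=["tag:hourly"]),
        schedule="@hourly",
        dag_id="analytics_hourly",
    )

    # Everything else, on a daily cadence.
    daily_models = DbtDag(
        project_config=ProjectConfig("/usr/local/airflow/dags/dbt/analytics"),
        profile_config=ProfileConfig(...),
        render_config=RenderConfig(exclude=["tag:hourly"]),
        schedule="@daily",
        dag_id="analytics_daily",
    )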
How can I fetch the artifacts generated by Cosmos tasks and run custom logic on top of them?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use Cosmos **callbacks**. A callback is a function Cosmos runs as part of
task execution, before the dbt ``target`` folder is cleaned up — so it has
direct access to the artifacts dbt produced (``manifest.json``,
``run_results.json``, ``catalog.json``, ``sources.json``, compiled SQL,
etc.).

Common things you can do from a callback:

- Read ``run_results.json`` to extract failing nodes, timings, or row counts.
- Upload artifacts to object storage (S3, GCS, Azure WASB) — Cosmos ships
  built-in helpers for this in ``cosmos/io.py``.
- Log or archive compiled SQL for audit or debugging.
- Trigger follow-up logic such as Snowflake queries, alerts, or downstream
  notifications.

See :doc:`guides/run_dbt/callbacks/callbacks` for the full callback API,
built-in helpers, and end-to-end examples.
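As a sketch, a callback that pulls failing nodes out of
``run_results.json`` might look like this. The function name is
illustrative, and it assumes Cosmos passes the dbt project directory as the
first argument; check the callback guide for the exact signature to use:

.. code-block:: python

    import json
    from pathlib import Path


    def report_failures(project_dir, **kwargs):
        """Collect the unique_ids of dbt nodes that errored or failed.

        ``project_dir`` is the dbt project directory the callback receives
        before Cosmos cleans up the ``target`` folder.
        """
        run_results_path = Path(project_dir) / "target" / "run_results.json"
        run_results = json.loads(run_results_path.read_text())
        failed = [
            result["unique_id"]
            for result in run_results.get("results", [])
            if result.get("status") in ("error", "fail")
        ]
        if failed:
            print(f"dbt reported failures: {failed}")
        return failed

From here you could page the owning team, write the list to object storage,
or feed it into downstream logic.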
Can my dbt command use Airflow parameters or variables computed at run time?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes. Although Airflow does not render templated fields during DAG parsing,
Cosmos resolves them **at task execution time**. To opt in, pass the
templated values via ``operator_args``.

The fields that support Airflow templating via
``DbtDag(operator_args=...)`` or ``DbtTaskGroup(operator_args=...)`` are
``env``, ``vars``, ``full_refresh``, and ``dbt_cmd_flags``. ``select``,
``selector``, and ``exclude`` are also templatable when passed via
``operator_args`` (or directly to a standalone operator instance), with
the caveat below.

.. note::

    Templating ``select``, ``selector``, or ``exclude`` via
    ``operator_args`` affects only the **dbt command each task runs at
    execution time** — it does not change which Airflow tasks Cosmos
    creates. The task graph is built during DAG parsing from
    ``RenderConfig``, whose own ``select`` / ``selector`` / ``exclude``
    fields are **not** templatable for the same reason: node selection
    must complete before Airflow renders templates. In practice, every
    dbt node still becomes an Airflow task; the templated selector simply
    narrows what each task tells dbt to process at run time.

For the full list of template fields and caveats, see
:doc:`guides/run_dbt/operators/operator-args`.
Example: passing Airflow date-aware Jinja templates as dbt ``vars`` via
``DbtDag`` and ``operator_args``:

.. code-block:: python

    from datetime import datetime

    from cosmos import DbtDag, ExecutionConfig, ProfileConfig, ProjectConfig

    jaffle_shop_dated = DbtDag(
        project_config=ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop"),
        profile_config=ProfileConfig(...),
        execution_config=ExecutionConfig(...),
        operator_args={
            "vars": {
                "run_start_date": "{{ data_interval_start | ds }}",
                "run_end_date": "{{ data_interval_end | ds }}",
            },
        },
        schedule="@daily",
        start_date=datetime(2026, 1, 1),
        catchup=False,
        dag_id="jaffle_shop_dated",
    )

At task execution time, Airflow renders the templates and Cosmos forwards
the resolved values to dbt as ``--vars``, so each run uses the
corresponding execution window.
.. note::

    Templated ``vars`` and ``env`` are not used when Cosmos parses the DAG
    with ``LoadMode.DBT_LS``. If the values need to influence DAG rendering
    (for example, to drive node selection), set them on
    ``ProjectConfig.dbt_vars`` instead.
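A minimal sketch of that alternative, with an illustrative variable name
and path, and ``ProfileConfig(...)`` standing in for your profile:

.. code-block:: python

    from datetime import datetime

    from cosmos import DbtDag, ProfileConfig, ProjectConfig

    jaffle_shop_static = DbtDag(
        project_config=ProjectConfig(
            "/usr/local/airflow/dags/dbt/jaffle_shop",
            # Resolved at DAG parse time, so dbt ls and node selection can
            # see it (unlike templated operator_args values).
            dbt_vars={"source_schema": "raw"},
        ),
        profile_config=ProfileConfig(...),
        schedule="@daily",
        start_date=datetime(2026, 1, 1),
        dag_id="jaffle_shop_static",
    )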
.. note::

    This FAQ is a work in progress. If your question is not answered here,
    please open an issue in the
    `GitHub repository <https://github.com/astronomer/astronomer-cosmos>`_
    or ask in the ``#airflow-dbt`` channel of the Apache Airflow® Slack.