Welcome to Astronomer Cosmos! Whether you’re an experienced data practitioner or just getting started, Cosmos makes it simple to manage and orchestrate your dbt workflows using Apache Airflow®, saving you time and effort. By automatically turning dbt workflows into Airflow DAGs, Cosmos allows you to focus on building high-quality data models without the hassle of managing complex integrations.

To get started right away, please check out our Quickstart Guides. You can also explore more examples in /dev/dags or in the cosmos-demo repo.

To learn more about Cosmos, please read on.

What Is Astronomer Cosmos?

Astronomer Cosmos is an open-source library that bridges Apache Airflow and dbt, allowing you to easily transform your dbt projects into Airflow DAGs and manage everything seamlessly. With Cosmos, you can write your data transformations using dbt and then schedule and orchestrate them with Airflow, making the entire process smooth and straightforward.

Why Cosmos? Integrating dbt and Airflow can be complex, but Cosmos simplifies it by seamlessly connecting these powerful tools—letting you focus on what matters most: delivering impactful data models and results without getting bogged down by technical challenges.

Why Should You Use Cosmos?

Cosmos makes orchestrating dbt workflows:

  • Effortless: Transform your dbt projects into Airflow DAGs without writing extra code—Cosmos handles the heavy lifting.

  • Reliable: Rely on Airflow’s robust scheduling and monitoring features to ensure your dbt workflows run smoothly and efficiently.

  • Scalable: Easily scale your workflows to match growing data demands, thanks to Airflow’s distributed capabilities.

Whether you’re handling intricate data tasks or looking to streamline your processes, Cosmos helps you orchestrate dbt with Airflow effortlessly, saving you time and letting you focus on what truly matters—creating impactful insights.

Example Usage: Jaffle Shop Project

Let’s walk through a practical example to see how Cosmos converts a dbt workflow into an Airflow DAG.

The jaffle_shop project is a sample dbt project that simulates an e-commerce store’s data. The project includes a series of dbt models that transform raw data into structured tables, such as sales, customers, and products.

Below, you can see what the original dbt workflow looks like in a lineage graph. This graph helps illustrate the relationships between different models:

_images/jaffle_shop_dbt_graph.png

Cosmos can take this dbt workflow and convert it into an Airflow DAG, allowing you to leverage Airflow’s scheduling and orchestration capabilities.
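To make the idea of turning a lineage graph into ordered tasks concrete, here is a minimal, standard-library sketch. It is not how Cosmos is implemented; it only illustrates that model dependencies determine which tasks can run first and which must wait. The model names and their dependencies are illustrative assumptions, not read from a real dbt manifest.

```python
from graphlib import TopologicalSorter

# Hypothetical dbt lineage: each key is a model, each value is the set of
# models it depends on. Names and edges are illustrative assumptions.
lineage = {
    "stg_customers": set(),
    "stg_orders": set(),
    "stg_payments": set(),
    "customers": {"stg_customers", "stg_orders", "stg_payments"},
    "orders": {"stg_orders", "stg_payments"},
}


def execution_batches(graph):
    """Group models into batches; models in one batch have no unmet dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # models whose upstreams are all done
        batches.append(ready)
        sorter.done(*ready)  # mark them finished so downstream models unlock
    return batches


batches = execution_batches(lineage)
for index, batch in enumerate(batches, start=1):
    print(f"batch {index}: {', '.join(batch)}")
```

Each batch corresponds loosely to a set of Airflow tasks that could run in parallel, with later batches waiting on earlier ones.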

To convert this dbt workflow into an Airflow DAG, create a new DAG definition file, import DbtDag from the Cosmos library, and fill in a few parameters, such as the dbt project directory path and the profile name:

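from datetime import datetime

from cosmos import DbtDag, ProjectConfig

# DBT_ROOT_PATH (a pathlib.Path) and profile_config (a cosmos ProfileConfig)
# are assumed to be defined earlier in the DAG file.
basic_cosmos_dag = DbtDag(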
    # dbt/cosmos-specific parameters
    project_config=ProjectConfig(
        DBT_ROOT_PATH / "jaffle_shop",
    ),
    profile_config=profile_config,
    operator_args={
        "install_deps": True,  # install any necessary dependencies before running any dbt command
        "full_refresh": True,  # used only in dbt commands that support this flag
    },
    # normal dag parameters
    schedule_interval="@daily",
    start_date=datetime(2023, 1, 1),
    catchup=False,
    dag_id="basic_cosmos_dag",
    default_args={"retries": 2},
)

This code snippet will generate an Airflow DAG that looks like this:

https://raw.githubusercontent.com/astronomer/astronomer-cosmos/main/docs/_static/jaffle_shop_dag.png
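The DAG definition above references DBT_ROOT_PATH without showing how it is set. One possible way to resolve it, sketched here using only the standard library, is to read an environment variable and fall back to a default folder. The helper name resolve_dbt_root and the default location dags/dbt are assumptions made for illustration; adapt them to your own project layout.

```python
import os
from pathlib import Path


def resolve_dbt_root(default: Path, env_var: str = "DBT_ROOT_PATH") -> Path:
    """Return the dbt root from the environment, falling back to a default path."""
    return Path(os.getenv(env_var, str(default)))


# Assumed layout: dbt projects live in a "dbt" folder next to the DAG files.
DBT_ROOT_PATH = resolve_dbt_root(Path("dags") / "dbt")

# This is the path that would be handed to ProjectConfig in the DAG definition.
project_dir = DBT_ROOT_PATH / "jaffle_shop"
print(project_dir)
```

Reading the path from the environment keeps the DAG file unchanged between a local checkout and a deployed environment where the dbt project may live elsewhere.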

DbtDag is a custom DAG generator that converts dbt projects into Airflow DAGs. It accepts Cosmos-specific arguments such as fail_fast, which fails the DAG immediately if dbt fails to process a resource, and cancel_query_on_kill, which cancels any running queries if the task is killed externally or manually set to failed in Airflow. DbtDag also accepts standard DAG arguments such as max_active_tasks, max_active_runs, and default_args.
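As a rough sketch of how these two groups of arguments might be organized, the snippet below collects them into plain dictionaries that can be unpacked into a DbtDag call. It assumes the Cosmos-specific flags are forwarded through operator_args, as install_deps and full_refresh are in the earlier example; check the Cosmos documentation for your version before relying on that. The DAG id and the numeric limits are hypothetical values.

```python
from datetime import datetime

# Cosmos-specific arguments named in the text above.
# Assumption: they are forwarded to each dbt task through `operator_args`.
operator_args = {
    "fail_fast": True,
    "cancel_query_on_kill": True,
}

# Standard Airflow DAG arguments named in the text above.
dag_args = {
    "dag_id": "tuned_cosmos_dag",  # hypothetical DAG id
    "start_date": datetime(2023, 1, 1),
    "max_active_tasks": 4,  # hypothetical limit
    "max_active_runs": 1,  # hypothetical limit
    "default_args": {"retries": 2},
}

# Merge both groups into one keyword-argument dictionary.
dbt_dag_kwargs = {"operator_args": operator_args, **dag_args}
print(sorted(dbt_dag_kwargs))

# With Cosmos installed and project_config / profile_config defined elsewhere:
# DbtDag(project_config=project_config, profile_config=profile_config, **dbt_dag_kwargs)
```

Keeping the two groups separate makes it easier to see which settings affect dbt execution and which affect Airflow scheduling.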

With Cosmos, transitioning from a dbt workflow to a proper Airflow DAG is seamless, giving you the best of both tools for managing and scaling your data workflows.

Changelog

We follow Semantic Versioning for releases. Refer to CHANGELOG.rst for the latest changes.

Join the Community

Have questions, need help, or interested in contributing? We welcome all contributions and feedback!

  • Join the community on Slack! You can find us in the #airflow-dbt channel of the Airflow Slack workspace. If you don’t have an account, click here to sign up.

  • Report bugs, request features, or ask questions by creating an issue in the GitHub repository.

  • Want to contribute new features, bug fixes or documentation enhancements? Please refer to our Contributing Guide.

  • Check out this link to learn more about our current contributors.

Note that contributors and maintainers are expected to abide by the Contributor Code of Conduct.

License

Apache License 2.0

Privacy Notice

This project follows Astronomer’s Privacy Policy.