Cosmos Contributing Guide#

All contributions, bug reports, bug fixes, documentation improvements and enhancements are welcome.

As a contributor or maintainer of this project, you are expected to abide by the Contributor Code of Conduct.

Learn more about the contributors’ roles in Contributor roles.

Overview#

To contribute to the Cosmos project:

  1. Please create a GitHub Issue describing your contribution.

  2. Fork the repository and clone your fork locally.

  3. Open a feature branch off of the main branch in your fork.

  4. Make your changes, push the branch to your fork, and open a Pull Request from your feature branch into the main branch of the upstream repository.

  5. Link your issue to the pull request.

  6. Once development is complete on your feature branch, request a review; the pull request will be merged once approved.
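Concretely, steps 2–4 above might look like the following; the fork URL and branch name are placeholders to replace with your own:

```shell
# Step 2: clone your fork (replace <your-username>) and enter it
git clone https://github.com/<your-username>/astronomer-cosmos.git
cd astronomer-cosmos/

# Step 3: open a feature branch off main
git switch -c my-feature

# Step 4: edit files, then commit and push the branch to your fork
git commit -am "Describe your change"
git push -u origin my-feature
# ...and open a Pull Request from my-feature into the upstream main branch
```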

Setting up the Cosmos development environment#

Set up local development on the host machine#

This guide sets up Cosmos development directly on the host machine. First, clone the astronomer-cosmos repo and enter the repo directory:

git clone https://github.com/astronomer/astronomer-cosmos.git
cd astronomer-cosmos/

Then install Apache Airflow and astronomer-cosmos in a Python virtual environment:

python3 -m venv env && source env/bin/activate
pip3 install "apache-airflow[cncf.kubernetes,openlineage]"
pip3 install -e ".[dbt-postgres,dbt-databricks]"
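As an optional sanity check, you can confirm that both packages import from the virtualenv (assuming both install their version attributes as shown; adjust if your environment differs):

```shell
python -c "import airflow; print(airflow.__version__)"
python -c "import cosmos; print(cosmos.__version__)"
```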

Set Airflow home to the dev/ directory and disable loading example DAGs:

export AIRFLOW_HOME=$(pwd)/dev/
export AIRFLOW__CORE__LOAD_EXAMPLES=false

Then, run Airflow in standalone mode. The command below will create a new user (if one does not exist) and run the necessary Airflow components (webserver, scheduler and triggerer):

By default, Airflow uses SQLite as its database. You can override this by setting the AIRFLOW__DATABASE__SQL_ALCHEMY_CONN environment variable to a SQL connection string.
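For example, to point Airflow at a local Postgres database instead of SQLite (the connection details below are illustrative; adjust them to your setup):

```shell
# Illustrative connection string; adjust user, password, host, and
# database to match your own Postgres instance.
export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN="postgresql+psycopg2://postgres:postgres@localhost:5432/postgres"
```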

airflow standalone

Once Airflow is up, you can access the Airflow UI at http://localhost:8080.

Note: whenever you want to start the development server, you need to activate the virtualenv and set the environment variables.
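A typical return-to-development session from the repo root therefore looks like this (assuming the env virtualenv created above):

```shell
source env/bin/activate
export AIRFLOW_HOME=$(pwd)/dev/
export AIRFLOW__CORE__LOAD_EXAMPLES=false
airflow standalone
```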

Using Docker Compose for local development#

Alternatively, you can build the development environment with Docker Compose.

To launch a local sandbox with Docker Compose, first clone the astronomer-cosmos repo and enter the repo directory:

git clone https://github.com/astronomer/astronomer-cosmos.git
cd astronomer-cosmos/

To prevent permission errors on Linux, you must create the dags, logs, and plugins folders and change their owner to the astro user (user ID 50000). To do this, run the following commands:

mkdir -p dev/dags dev/logs dev/plugins
sudo chown 50000:50000 -R dev/dags dev/logs dev/plugins

Then, run the Docker Compose command:

docker compose -f dev/docker-compose.yaml up -d --build

Once the sandbox is up, you can access the Airflow UI at http://localhost:8080.
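When you are finished, you can stop the sandbox and remove its containers with docker compose down; adding the -v flag also removes the volumes, so the next start begins from a clean state:

```shell
docker compose -f dev/docker-compose.yaml down
```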

Working with Hatch#

Hatch is a unified command-line tool for managing Python dependencies and environment isolation. In Cosmos, we use it for building, distributing, running tests, and building documentation.

If you don’t already have Hatch installed, install it before proceeding. As an example, on macOS, you can do so with:

brew install hatch

The pyproject.toml file defines a matrix of supported Python, Airflow, and dbt-core versions against which the tests can be run.

Testing the application with Hatch#

After following the steps described in Working with Hatch, you are ready to run Cosmos tests locally. For instance, to run the tests using Python 3.11, Apache Airflow® 2.10 and dbt-core 1.9, use the following:

hatch run tests.py3.11-2.10-1.9:test-cov

It is also possible to run the tests using all the matrix combinations, by using:

hatch run tests:test-cov
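If you are unsure which matrix combinations are available, Hatch can list every generated environment and its scripts:

```shell
hatch env show
```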

The integration tests rely on Postgres. You can run Postgres with Docker, for example:

docker run --name postgres -p 5432:5432 -p 5433:5433 -e POSTGRES_PASSWORD=postgres postgres

To run the integration tests for the first time, use:

export AIRFLOW_HOME=$(pwd)
export AIRFLOW_CONN_AIRFLOW_DB=postgres://postgres:postgres@0.0.0.0:5432/postgres
export DATABRICKS_HOST=''
export DATABRICKS_TOKEN=''
export DATABRICKS_WAREHOUSE_ID=''
export DATABRICKS_CLUSTER_ID=''
export POSTGRES_PORT=5432
export POSTGRES_SCHEMA=public
export POSTGRES_DB=postgres
export POSTGRES_PASSWORD=postgres
export POSTGRES_USER=postgres
export POSTGRES_HOST=localhost
hatch run tests.py3.11-2.10-1.9:test-integration-setup
hatch run tests.py3.11-2.10-1.9:test-integration

For subsequent runs against the same Python, Airflow, and dbt-core versions, you can skip the setup step and run:

hatch run tests.py3.11-2.10-1.9:test-integration

Writing Docs#

After following the steps described in Working with Hatch, you are ready to build and serve the documentation locally.

You can run the docs locally by running the following:

hatch run docs:serve

Building#

After following the steps described in Working with Hatch, you are ready to build the project.

To build the project, run:

hatch build

Releasing#

Note

This section is intended for Cosmos maintainers only.

We use GitHub Actions to create and deploy new releases. To create a new release, first bump the version using:

hatch version minor

Hatch will automatically update the version for you. Then, create a new release on GitHub with the new version. The release will be automatically deployed to PyPI.
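If you just want to inspect the current version without changing it, run hatch version with no arguments:

```shell
hatch version
```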

Note

You can update the version in a few different ways. Check out the Hatch docs to learn more.

pre-commit#

We use pre-commit to run a number of checks on the code before committing. If you don't already have pre-commit, install it first (for example, with pip install pre-commit). Then set up the git hooks by running:

pre-commit install

To run the checks manually, run:

pre-commit run --all-files