Kubernetes Execution Mode#

The following tutorial illustrates how to run the Cosmos dbt Kubernetes Operator using a local K8s cluster. It assumes the following:

  • Postgres is run in the Kubernetes (K8s) cluster as a container

  • Airflow is run locally, and it triggers a K8s Pod which runs dbt

Requirements#

To test the DbtKubernetesOperators locally, we encourage you to install the following:

  • Local Airflow (either standalone or using Astro CLI)

  • Kind to run K8s locally

  • Helm to install Postgres in K8s

  • Docker to create the dbt container image, which allows Airflow to launch a K8s pod that runs dbt

At the moment, the user is expected to:

  • Add the dbt project files to the Docker image

  • Add the dbt profile, which contains the information for dbt to access the database, to the Docker image

  • Handle secrets

Additional KubernetesPodOperator parameters can be passed through the operator_args parameter; Cosmos forwards them to the underlying Dbt Kubernetes operators.

For instance,

    from cosmos import DbtTaskGroup, ExecutionConfig, ProfileConfig, ProjectConfig, RenderConfig
    from cosmos.constants import ExecutionMode
    from cosmos.profiles import PostgresUserPasswordProfileMapping

    # AIRFLOW_PROJECT_DIR, K8S_PROJECT_DIR, DBT_IMAGE and the two Secret objects
    # are defined elsewhere in the DAG file (see the sketch below for the secrets).
    run_models = DbtTaskGroup(
        project_config=ProjectConfig(),
        profile_config=ProfileConfig(
            profile_name="postgres_profile",
            target_name="dev",
            profile_mapping=PostgresUserPasswordProfileMapping(
                conn_id="postgres_default",
                profile_args={
                    "schema": "public",
                },
            ),
        ),
        # Path to the dbt project as seen by the scheduler, used to render the DAG...
        render_config=RenderConfig(dbt_project_path=AIRFLOW_PROJECT_DIR),
        # ...and the path as seen inside the container image, used at execution time.
        execution_config=ExecutionConfig(execution_mode=ExecutionMode.KUBERNETES, dbt_project_path=K8S_PROJECT_DIR),
        # Everything below is forwarded to the KubernetesPodOperator-based operators.
        operator_args={
            "image": DBT_IMAGE,
            "get_logs": True,
            "is_delete_operator_pod": False,
            "secrets": [postgres_password_secret, postgres_host_secret],
        },
    )
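
The postgres_password_secret and postgres_host_secret referenced above are Kubernetes Secret objects from Airflow's cncf.kubernetes provider. A minimal sketch of how they can be defined, assuming the postgres-secrets K8s secret created in the step-by-step instructions below:

    from airflow.providers.cncf.kubernetes.secret import Secret

    # Expose keys of the "postgres-secrets" K8s secret (created further below
    # with kubectl) as environment variables inside the dbt pod.
    postgres_password_secret = Secret(
        deploy_type="env",
        deploy_target="POSTGRES_PASSWORD",
        secret="postgres-secrets",
        key="password",
    )
    postgres_host_secret = Secret(
        deploy_type="env",
        deploy_target="POSTGRES_HOST",
        secret="postgres-secrets",
        key="host",
    )

The dbt profile baked into the image can then read these environment variables to connect to Postgres.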

Step-by-step instructions#

Using the installed Kind, you can set up a local Kubernetes cluster

kind create cluster

Deploy a Postgres pod to Kind using Helm

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install postgres bitnami/postgresql

Retrieve the Postgres password and set it as an environment variable

export POSTGRES_PASSWORD=$(kubectl get secret --namespace default postgres-postgresql -o jsonpath="{.data.postgres-password}" | base64 -d)

Check that the environment variable was set and that it is not empty

echo $POSTGRES_PASSWORD

Expose Postgres to the host running Docker/Kind

kubectl port-forward --namespace default postgres-postgresql-0 5432:5432

Check that you’re able to connect to the exposed pod

PGPASSWORD="$POSTGRES_PASSWORD" psql --host 127.0.0.1 -U postgres -d postgres -p 5432

postgres=# \dt
postgres=# \q
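
The same connectivity check can also be done from Python. A minimal sketch using psycopg2 (pulled in later in this tutorial by the dbt-postgres extra; install it separately if you want to run this now), assuming the port-forward is still active and POSTGRES_PASSWORD is exported:

    import os

    import psycopg2

    # Connect through the port-forward opened in the previous step.
    conn = psycopg2.connect(
        host="127.0.0.1",
        port=5432,
        user="postgres",
        password=os.environ["POSTGRES_PASSWORD"],
        dbname="postgres",
    )
    with conn.cursor() as cur:
        cur.execute("SELECT version();")
        print(cur.fetchone()[0])
    conn.close()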

Create a K8s secret which contains the credentials to access Postgres

kubectl create secret generic postgres-secrets --from-literal=host=postgres-postgresql.default.svc.cluster.local --from-literal=password=$POSTGRES_PASSWORD
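
To verify the secret from Python rather than with kubectl, here is a small sketch using the kubernetes client (installed alongside Airflow's cncf.kubernetes provider); note that the values are stored base64-encoded:

    import base64

    from kubernetes import client, config

    # Use the kubeconfig written by "kind create cluster".
    config.load_kube_config()
    secret = client.CoreV1Api().read_namespaced_secret("postgres-secrets", "default")
    print(base64.b64decode(secret.data["host"]).decode())  # the in-cluster Postgres hostname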

Clone the example repo that contains the Airflow DAG and dbt project files

git clone https://github.com/astronomer/cosmos-example.git
cd cosmos-example/

Build a Docker image containing the dbt project files and the dbt profile, using the provided Dockerfile; this is the image that will be run in K8s.

docker build -t dbt-jaffle-shop:1.0.0 -f Dockerfile.postgres_profile_docker_k8s .

Note

If you are running on an Apple M1 machine, you may need to set the following environment variables before running the docker build command, in case it fails.

export DOCKER_BUILDKIT=0
export COMPOSE_DOCKER_CLI_BUILD=0
export DOCKER_DEFAULT_PLATFORM=linux/amd64

Read through the Dockerfile to understand what it does, so that you can use it as a reference in your project:

  • The dbt profile file is added to the image

  • The dags directory containing the dbt project jaffle_shop is added to the image

  • The dbt_project.yml is replaced with postgres_profile_dbt_project.yml, which contains a profile key pointing to postgres_profile, since profile creation is not currently handled for the K8s operators the way it is in local mode

Make the built image available in the Kind K8s cluster

kind load docker-image dbt-jaffle-shop:1.0.0

Create a Python virtual environment and install the latest version of Astronomer Cosmos, which contains the K8s operators

python -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install "astronomer-cosmos[dbt-postgres]"

Copy the dags directory from the cosmos-example repo to your Airflow home (the directory pointed to by the AIRFLOW_HOME environment variable)

cp -r dags $AIRFLOW_HOME/

Run Airflow

airflow standalone

Note

You might need to run airflow standalone with sudo if your Airflow user is not able to access the Docker socket or to pull images in the Kind cluster.

Log in to Airflow through a web browser at http://localhost:8080/, using the user admin and the password stored in the standalone_admin_password.txt file.

Enable and trigger a run of the jaffle_shop_k8s DAG. You should then see a successful DAG run, as in the screenshot below.

Screenshot of a successful jaffle_shop_k8s DAG run: https://github.com/astronomer/astronomer-cosmos/raw/main/docs/_static/jaffle_shop_k8s_dag_run.png