Troubleshooting Performance#

This page helps you diagnose common performance issues when running Cosmos.

Measuring DAG parse time#

Cosmos logs the time it takes to parse each dbt project at the INFO level. Search your Airflow scheduler or DAG processor logs for messages like:

Cosmos performance (my_dbt_dag) - [worker-1|12345]: It took 0.068s to parse the dbt project for DAG using LoadMode.DBT_LS_CACHE

This tells you:

  • Which DAG was parsed (my_dbt_dag)

  • Which node and process performed the parse

  • How long parsing took (0.068s)

  • Which LoadMode was used (LoadMode.DBT_LS_CACHE)

If the parse time is high, see Optimize DAG Parsing for strategies to reduce it.

Airflow DAG processor configuration#

The Airflow DAG processor controls how often and how quickly DAG files are parsed. Understanding these settings helps you diagnose issues where DAGs are slow to appear or update.

What it controls

Airflow 3 setting

Default

Airflow 2 setting

How often new DAG files are detected

[dag_processor] refresh_interval

5 min

[scheduler] dag_dir_list_interval

Minimum interval between reparses of the same file

[dag_processor] min_file_process_interval

30s

[scheduler] min_file_process_interval

Number of concurrent parsing processes

[dag_processor] parsing_processes

2

[scheduler] max_threads

Maximum time to parse a single DAG file

[dag_processor] dag_file_processor_timeout

50s

[core] dag_file_processor_timeout

Maximum time to import a single Python file

[dag_processor] dagbag_import_timeout

30s

[core] dagbag_import_timeout

Note

In Airflow 3, DAG processor settings moved from the [core] and [scheduler] sections to the [dag_processor] section. Environment variable overrides follow the same pattern: AIRFLOW__DAG_PROCESSOR__DAGBAG_IMPORT_TIMEOUT instead of AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT.

DAGs not appearing#

If your DbtDag or DbtTaskGroup does not appear in Airflow, the most likely cause is that DAG parsing exceeds the dagbag_import_timeout.

Diagnosis steps:

  1. Check the Cosmos parse time log (see above). If the time exceeds the dagbag_import_timeout (default 30s), the DAG file will be silently dropped.

  2. Check the Airflow DAG processor logs for timeout or import errors related to your DAG file.

Solutions (in order of preference):

  1. Reduce the parse time by following the recommendations in Optimize DAG Parsing. The most impactful change is switching to LoadMode.DBT_MANIFEST.

  2. Increase the timeout only if reducing parse time is not sufficient:

    # Airflow 3
    export AIRFLOW__DAG_PROCESSOR__DAGBAG_IMPORT_TIMEOUT=120
    
    # Airflow 2
    export AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=120
    

OOM errors during task execution#

If tasks are being killed by the OS or reaching a zombie state, they are likely running out of memory.

Diagnosis steps:

  1. Enable Cosmos debug mode to measure peak memory usage per task:

    export AIRFLOW__COSMOS__ENABLE_DEBUG_MODE=True
    

    After tasks run, check the cosmos_debug_max_memory_mb XCom value to identify memory-hungry tasks.

  2. Check if dbt and Airflow are in separate Python environments. Running dbt as a subprocess (InvocationMode.SUBPROCESS) roughly doubles memory usage compared to running it as a library (InvocationMode.DBT_RUNNER).

Solutions:

  • Enable memory-optimized imports to reduce the Cosmos memory footprint:

    export AIRFLOW__COSMOS__ENABLE_MEMORY_OPTIMISED_IMPORTS=True
    

    Note: when enabled, use full module paths for imports (e.g., from cosmos.airflow.dag import DbtDag instead of from cosmos import DbtDag). See Memory Optimization Options for Astronomer Cosmos for details.

  • Use ExecutionMode.WATCHER to run dbt in a single process with threading instead of spawning a separate process per model. See Watcher execution mode (Experimental).

  • Route memory-intensive tasks to workers with more resources using Airflow pools or the watcher_dbt_execution_queue configuration.

For a comprehensive list of memory optimization options, see Memory Optimization Options for Astronomer Cosmos.