
Configuring Your Workflows

DAG Factory allows you to define workflows in a structured, configuration-driven way using YAML files. You can define multiple workflows within a single YAML file based on your requirements.

Key Elements of Workflow Configuration

  • dag_id: Unique identifier for your DAG. In the examples on this page, it is the top-level YAML key (for example, basic_example_dag).
  • default_args: Common arguments for all tasks.
  • schedule: Specifies the execution schedule.
  • tasks: Defines the Airflow tasks in your workflow.
  • task_groups: Defines Airflow task groups to organize and group related tasks.
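
As a rough illustration of how these elements fit together when one file holds several workflows, the sketch below uses a plain Python dictionary as a stand-in for a parsed YAML file. The second DAG (second_example_dag), its tasks, and the helper summarize_workflows are hypothetical and not part of DAG Factory; the sketch also assumes every top-level key names one DAG.

```python
# Hypothetical parsed form of a single YAML file that defines two workflows.
# Assumption: each top-level key is the dag_id of one DAG.
config = {
    "basic_example_dag": {
        "default_args": {"owner": "custom_owner"},
        "schedule": "0 3 * * *",
        "tasks": [
            {
                "task_id": "task_1",
                "operator": "airflow.operators.bash.BashOperator",
                "bash_command": "echo 1",
            },
        ],
    },
    "second_example_dag": {  # illustrative name, not from the docs
        "schedule": "@daily",
        "task_groups": [{"group_name": "example_task_group"}],
        "tasks": [
            {
                "task_id": "task_1",
                "operator": "airflow.operators.bash.BashOperator",
                "bash_command": "echo 1",
                "task_group_name": "example_task_group",
            },
        ],
    },
}

# The per-DAG key elements listed above (dag_id is the top-level key itself).
KEY_ELEMENTS = ("default_args", "schedule", "tasks", "task_groups")


def summarize_workflows(parsed_config):
    """Map each dag_id to the key elements that its entry defines."""
    return {
        dag_id: [key for key in KEY_ELEMENTS if key in dag_config]
        for dag_id, dag_config in parsed_config.items()
    }


summary = summarize_workflows(config)
for dag_id, present in summary.items():
    print(f"{dag_id}: {', '.join(present)}")
```

A summary like this is only a reading aid; it does not validate the configuration the way DAG Factory itself would.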

Example DAG Configuration

Task and Task Group Configuration Formats

DAG Factory supports two formats for defining tasks and task_groups: a list format and a dictionary format.

List Format (Recommended)

The list format is the recommended and more readable approach. In this format, tasks are defined as a list where each task includes a task_id field, and task groups are also defined as a list where each group includes a group_name field:

Version Support

List format support was introduced in version 1.0.0.

example_dag_factory.yml
basic_example_dag:
  default_args:
    owner: "custom_owner"
  description: "this is an example dag"
  schedule: "0 3 * * *"
  render_template_as_native_obj: True
  catchup: false
  task_groups:
    - group_name: "example_task_group"
      tooltip: "this is an example task group"
      dependencies: [task_1]
  tasks:
    - task_id: "task_1"
      operator: airflow.operators.bash.BashOperator
      bash_command: "echo 1"
    - task_id: "task_2"
      operator: airflow.operators.bash.BashOperator
      bash_command: "echo 2"
      dependencies: [task_1]
    - task_id: "task_3"
      operator: airflow.operators.bash.BashOperator
      bash_command: "echo 3"
      dependencies: [task_1]
      task_group_name: "example_task_group"

Dictionary Format (Legacy)

The dictionary format is also supported for backward compatibility. In this format, tasks are defined as a dictionary where the key is the task ID, and task groups are also defined as a dictionary where the key is the group name:

example_dag_factory_tasks_taskgroups_as_dict_format.yml
basic_example_dag_dict_format:
  default_args:
    owner: "custom_owner"
  description: "this is an example dag using dictionary format for tasks and task_groups"
  schedule: "0 3 * * *"
  render_template_as_native_obj: True
  catchup: false
  task_groups:
    example_task_group:
      tooltip: "this is an example task group"
      dependencies: [task_1]
  tasks:
    task_1:
      operator: airflow.operators.bash.BashOperator
      bash_command: "echo 1"
    task_2:
      operator: airflow.operators.bash.BashOperator
      bash_command: "echo 2"
      dependencies: [task_1]
    task_3:
      operator: airflow.operators.bash.BashOperator
      bash_command: "echo 3"
      dependencies: [task_1]
      task_group_name: example_task_group

Format Recommendation

While both formats are supported, we recommend using the list format as it is more readable and easier to maintain.
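
Since the two formats differ only in where the task ID and group name live, moving a legacy configuration to the list format is largely mechanical. The following is a minimal sketch of that conversion on an already parsed configuration; dict_to_list_format is a hypothetical helper written for this page, not a DAG Factory function.

```python
def dict_to_list_format(dag_config):
    """Return a copy of one DAG's config with tasks and task_groups as lists.

    Dictionary keys become the task_id / group_name fields used by the
    list format. Entries that are already lists are left untouched.
    """
    converted = dict(dag_config)  # shallow copy; the input is not modified

    tasks = dag_config.get("tasks")
    if isinstance(tasks, dict):
        converted["tasks"] = [
            {"task_id": task_id, **(task or {})}
            for task_id, task in tasks.items()
        ]

    groups = dag_config.get("task_groups")
    if isinstance(groups, dict):
        converted["task_groups"] = [
            {"group_name": name, **(group or {})}
            for name, group in groups.items()
        ]

    return converted


# Parsed form of part of the dictionary-format example above.
legacy = {
    "task_groups": {
        "example_task_group": {
            "tooltip": "this is an example task group",
            "dependencies": ["task_1"],
        },
    },
    "tasks": {
        "task_1": {
            "operator": "airflow.operators.bash.BashOperator",
            "bash_command": "echo 1",
        },
        "task_3": {
            "operator": "airflow.operators.bash.BashOperator",
            "bash_command": "echo 3",
            "dependencies": ["task_1"],
            "task_group_name": "example_task_group",
        },
    },
}

converted = dict_to_list_format(legacy)
print([task["task_id"] for task in converted["tasks"]])
```

The sketch relies on Python dictionaries preserving insertion order, so tasks keep the order in which they appear in the original file.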

Reserved Keys

DAG Factory reserves certain YAML keys for internal processing. These keys may appear in your YAML files, but only in their intended roles; do not redefine them or use them for any other purpose:

  • __type__
  • __args__
  • __join__
  • __and__
  • __or__

Using these keys outside their intended internal roles may lead to unexpected behavior.
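
One way to review where these keys occur is to walk the parsed configuration and report every match. The sketch below does that with a hypothetical helper, find_reserved_keys, which is not part of DAG Factory. The sample configuration, including the placement of __type__ under params and its placeholder value, is invented purely to exercise the scan and says nothing about how the key is meant to be used.

```python
RESERVED_KEYS = {"__type__", "__args__", "__join__", "__and__", "__or__"}


def find_reserved_keys(node, path=""):
    """Yield the dotted path of every reserved key in a parsed config."""
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else str(key)
            if key in RESERVED_KEYS:
                yield child_path
            # Keep descending: reserved keys may be nested at any depth.
            yield from find_reserved_keys(value, child_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_reserved_keys(item, f"{path}[{index}]")


# Invented sample; the placement of "__type__" here is illustrative only.
sample = {
    "my_dag": {
        "tasks": [
            {
                "task_id": "task_1",
                "params": {"__type__": "placeholder"},
            },
        ],
    },
}

for location in find_reserved_keys(sample):
    print(location)
```

A match is not necessarily a problem, because the keys are valid in their intended roles; the list is just a starting point for a manual review.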

Check out more configuration params