Hook¶
- class RayHook(*args: Any, **kwargs: Any)¶
Bases:
KubernetesHook
Airflow Hook for interacting with Ray Job Submission Client and Kubernetes Cluster.
This hook provides methods to interact with Ray clusters, submit and manage Ray jobs, and handle Kubernetes resources related to Ray deployments.
- Parameters:
conn_id – The connection ID to use when fetching connection info.
- DEFAULT_NAMESPACE = 'default'¶
- conn_name_attr = 'conn_id'¶
- conn_type = 'ray'¶
- create_daemon_set(name: str, body: dict[str, Any]) V1DaemonSet | None ¶
Create a DaemonSet resource in Kubernetes.
- Parameters:
name – The name of the DaemonSet.
body – The body of the DaemonSet for the create action.
- Returns:
The created DaemonSet resource if successful, None otherwise.
- default_conn_name = 'ray_default'¶
- delete_daemon_set(name: str) V1Status | None ¶
Delete a DaemonSet resource in Kubernetes.
- Parameters:
name – The name of the DaemonSet.
- Returns:
The status of the delete operation if successful, None otherwise.
- delete_ray_cluster(ray_cluster_yaml: str, gpu_device_plugin_yaml: str) None ¶
Execute the operator to delete the Ray cluster.
- Parameters:
ray_cluster_yaml – Path to the YAML file defining the Ray cluster.
gpu_device_plugin_yaml – Path or URL to the GPU device plugin YAML.
- Raises:
AirflowException – If there’s an error deleting the Ray cluster.
- delete_ray_job(dashboard_url: str | None, job_id: str) Any ¶
Deletes a job from the Ray cluster.
- Parameters:
dashboard_url – The URL of the Ray dashboard.
job_id – The ID of the job to delete.
- Returns:
The result of the delete operation.
- classmethod get_connection_form_widgets() dict[str, Any] ¶
Return connection widgets to add to connection form.
- Returns:
A dictionary of connection form widgets.
- get_daemon_set(name: str) V1DaemonSet | None ¶
Retrieve a DaemonSet resource in Kubernetes.
- Parameters:
name – The name of the DaemonSet.
- Returns:
The DaemonSet resource if found, None otherwise.
- get_ray_job_logs(dashboard_url: str | None, job_id: str) str ¶
Retrieves the logs of a submitted job.
- Parameters:
dashboard_url – The URL of the Ray dashboard.
job_id – The ID of the job.
- Returns:
Logs of the job.
- get_ray_job_status(dashboard_url: str | None, job_id: str) JobStatus ¶
Gets the status of a submitted job.
- Parameters:
dashboard_url – The URL of the Ray dashboard.
job_id – The ID of the job.
- Returns:
Status of the job.
- async get_ray_tail_logs(dashboard_url: str | None, job_id: str) AsyncIterator[str] ¶
Tails the logs of a submitted job asynchronously.
- Parameters:
dashboard_url – The URL of the Ray dashboard.
job_id – The ID of the job.
- Returns:
An async iterator of log lines.
- classmethod get_ui_field_behaviour() dict[str, Any] ¶
Return custom field behaviour for the connection form.
- Returns:
A dictionary specifying custom field behaviour.
- hook_name = 'Ray'¶
- install_kuberay_operator(version: str = '1.0.0', env: dict[str, str] | None = None) tuple[str | None, str | None] ¶
Install KubeRay operator using Helm.
- Parameters:
version – The version of KubeRay operator to install.
env – Optional dictionary of environment variables.
- Returns:
A tuple containing the installation output and error (if any).
- load_yaml_content(path_or_link: str) Any ¶
Load YAML content from a file path or URL.
- Parameters:
path_or_link – The file path or URL of the YAML content.
- Returns:
The loaded YAML content.
- ray_client(dashboard_url: str | None = None) JobSubmissionClient ¶
Establishes a connection to the Ray Job Submission Client.
- Parameters:
dashboard_url – The URL of the Ray dashboard.
- Returns:
An instance of JobSubmissionClient.
- Raises:
AirflowException – If the connection fails.
- setup_ray_cluster(context: airflow.utils.context.Context, ray_cluster_yaml: str, kuberay_version: str, gpu_device_plugin_yaml: str, update_if_exists: bool) None ¶
Execute the operator to set up the Ray cluster.
- Parameters:
context – The Airflow task context.
ray_cluster_yaml – Path to the YAML file defining the Ray cluster.
kuberay_version – Version of KubeRay to install.
gpu_device_plugin_yaml – Path or URL to the GPU device plugin YAML.
update_if_exists – Whether to update the cluster if it already exists.
- Raises:
AirflowException – If there’s an error setting up the Ray cluster.
- submit_ray_job(dashboard_url: str, entrypoint: str, runtime_env: dict[str, Any] | None = None, **job_config: Any) str ¶
Submits a job to the Ray cluster.
- Parameters:
dashboard_url – The URL of the Ray dashboard.
entrypoint – The command or script to run.
runtime_env – The runtime environment for the job.
job_config – Additional job configuration parameters.
- Returns:
Job ID of the submitted job.
- uninstall_kuberay_operator() tuple[str | None, str | None] ¶
Uninstall KubeRay operator using Helm.
- Returns:
A tuple containing the uninstallation output and error (if any).