anyscale_provider.operators package
Submodules
anyscale_provider.operators.anyscale module
- class RolloutAnyscaleService(*args: Any, **kwargs: Any)
Bases:
BaseOperator
Rolls out a service on Anyscale from Apache Airflow.
This operator handles the deployment of services on Anyscale, including the necessary configurations and options. It rolls the service out according to the specified parameters and manages the deployment lifecycle. A usage sketch follows the parameter list.
- Parameters:
conn_id – Required. The connection ID for Anyscale.
name – Required. Unique name of the service.
image_uri – Optional. URI of an existing image. Exclusive with containerfile.
containerfile – Optional. The file path to a containerfile that will be built into an image before running the workload. Exclusive with image_uri.
compute_config – Optional. The name of an existing registered compute config or an inlined ComputeConfig object.
working_dir – Optional. Directory that will be used as the working directory for the application. If a local directory is provided, it will be uploaded to cloud storage automatically. When running inside a workspace, this defaults to the current working directory (‘.’).
excludes – Optional. A list of file path globs that will be excluded when uploading local files for working_dir.
requirements – Optional. A list of requirements or a path to a requirements.txt file for the workload. When running inside a workspace, this defaults to the workspace-tracked requirements.
env_vars – Optional. A dictionary of environment variables that will be set for the workload.
py_modules – Optional. A list of local directories that will be uploaded and added to the Python path.
cloud – Optional. The Anyscale Cloud to run this workload on. If not provided, the organization default will be used (or, if running in a workspace, the cloud of the workspace).
project – Optional. The Anyscale project to run this workload in. If not provided, the organization default will be used (or, if running in a workspace, the project of the workspace).
applications – Required. List of Ray Serve applications to run. At least one application must be specified. For details, see the Ray Serve config file format documentation: https://docs.ray.io/en/latest/serve/production-guide/config.html.
query_auth_token_enabled – Optional. Whether or not queries to this service are gated behind an authentication token. If True, an auth token is generated the first time the service is deployed. You can find the token in the UI or by fetching the status of the service.
http_options – Optional. HTTP options that will be passed to Ray Serve. See https://docs.ray.io/en/latest/serve/production-guide/config.html for supported options.
grpc_options – Optional. gRPC options that will be passed to Ray Serve. See https://docs.ray.io/en/latest/serve/production-guide/config.html for supported options.
logging_config – Optional. Logging options that will be passed to Ray Serve. See https://docs.ray.io/en/latest/serve/production-guide/config.html for supported options.
ray_gcs_external_storage_config – Optional. Configuration options for external storage for the Ray Global Control Store (GCS).
in_place – Optional. Flag for in-place updates. Defaults to False.
canary_percent – Optional[float]. Percentage of traffic to route to the canary version during the rollout. Defaults to None.
max_surge_percent – Optional[float]. Maximum percentage of extra (surge) capacity that may be provisioned during the rollout. Defaults to None.
service_rollout_timeout_seconds – Optional[int]. Duration after which the trigger tracking the service rollout times out. Defaults to 600 seconds.
poll_interval – Optional[int]. Interval to poll the service status. Defaults to 60 seconds.
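For orientation, here is a minimal DAG sketch using this operator. All values (connection ID, service name, image URI, compute config, and the Ray Serve application definition) are illustrative placeholders, not provider defaults; each applications entry follows the Ray Serve config file format linked above.

```python
from datetime import datetime

from airflow import DAG
from anyscale_provider.operators.anyscale import RolloutAnyscaleService

with DAG(
    dag_id="anyscale_service_rollout",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # All values below are illustrative placeholders.
    rollout = RolloutAnyscaleService(
        task_id="rollout_service",
        conn_id="anyscale_default",
        name="my-service",
        image_uri="my-registry/my-image:latest",
        compute_config="my-compute-config:1",
        # Each entry follows the Ray Serve config file format.
        applications=[{"import_path": "serve_app:app"}],
        service_rollout_timeout_seconds=600,
        poll_interval=60,
    )
```

Because the operator defers to a trigger, the task releases its worker slot while the rollout is polled in the background.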
- execute(context: airflow.utils.context.Context) → None
Execute the service rollout to Anyscale.
This method deploys the service to Anyscale with the provided configuration and parameters. It defers execution to a trigger while the rollout is still in progress.
- Parameters:
context – The Airflow context.
- Returns:
The service ID if the rollout is successfully initiated, or None if the execution is deferred.
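The deferral mentioned above is Airflow's standard deferrable-operator pattern. The sketch below illustrates the general shape of that pattern only; it uses a generic TimeDeltaTrigger and an assumed event payload, not this provider's actual trigger class or event schema.

```python
from datetime import timedelta

from airflow.exceptions import AirflowException
from airflow.models import BaseOperator
from airflow.triggers.temporal import TimeDeltaTrigger


class DeferringSketch(BaseOperator):
    """Illustrates Airflow's deferrable pattern; not provider code."""

    def execute(self, context):
        # Start the remote work, then free the worker slot by handing
        # control to the triggerer process until the trigger fires.
        self.defer(
            trigger=TimeDeltaTrigger(timedelta(seconds=60)),
            method_name="execute_complete",
        )

    def execute_complete(self, context, event=None):
        # 'event' is the payload emitted by the trigger. The 'status'
        # key below is an assumption for illustration only.
        if isinstance(event, dict) and event.get("status") == "failed":
            raise AirflowException(f"Rollout failed: {event}")
```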
- execute_complete(context: airflow.utils.context.Context, event: Any) → None
Complete the execution of the service rollout based on the trigger event.
This method is called when the trigger fires and provides the final status of the service rollout. It raises an exception if the rollout failed.
- Parameters:
context – The Airflow context.
event – The event data from the trigger.
- Returns:
None
- hook() → AnyscaleHook
Return an instance of the AnyscaleHook.
- on_kill() → None
Terminate the Anyscale service rollout if the task is killed.
This method will be called when the task is killed, and it sends a termination request for the currently running service rollout.
- template_fields = ('conn_id', 'name', 'image_uri', 'containerfile', 'compute_config', 'working_dir', 'excludes', 'requirements', 'env_vars', 'py_modules', 'cloud', 'project', 'applications', 'query_auth_token_enabled', 'http_options', 'grpc_options', 'logging_config', 'ray_gcs_external_storage_config', 'in_place', 'canary_percent', 'max_surge_percent', 'service_rollout_timeout_seconds', 'poll_interval')
- class SubmitAnyscaleJob(*args: Any, **kwargs: Any)
Bases:
BaseOperator
Submits a job to Anyscale from Apache Airflow.
This operator handles the submission and management of jobs on Anyscale. It is initialized with the parameters that define and configure the job, and provides mechanisms for job submission, status tracking, and handling of job outcomes. A usage sketch follows the parameter list.
- Parameters:
conn_id – Required. The connection ID for Anyscale.
entrypoint – Required. Command that will be run to execute the job, e.g., python main.py.
name – Optional. Name of the job. Multiple jobs can be submitted with the same name.
image_uri – Optional. URI of an existing image. Exclusive with containerfile.
containerfile – Optional. The file path to a containerfile that will be built into an image before running the workload. Exclusive with image_uri.
compute_config – Optional. The name of an existing registered compute config or an inlined ComputeConfig object.
working_dir – Optional. Directory that will be used as the working directory for the application. If a local directory is provided, it will be uploaded to cloud storage automatically. When running inside a workspace, this defaults to the current working directory (‘.’).
excludes – Optional. A list of file path globs that will be excluded when uploading local files for working_dir.
requirements – Optional. A list of requirements or a path to a requirements.txt file for the workload. When running inside a workspace, this defaults to the workspace-tracked requirements.
env_vars – Optional. A dictionary of environment variables that will be set for the workload.
py_modules – Optional. A list of local directories that will be uploaded and added to the Python path.
cloud – Optional. The Anyscale Cloud to run this workload on. If not provided, the organization default will be used (or, if running in a workspace, the cloud of the workspace).
project – Optional. The Anyscale project to run this workload in. If not provided, the organization default will be used (or, if running in a workspace, the project of the workspace).
max_retries – Optional. Maximum number of times the job will be retried before being marked failed. Defaults to 1.
fetch_logs – Optional. Whether to fetch the job logs from Anyscale when the job reaches a terminal state.
wait_for_completion – Optional. Whether the task should wait for the job to reach a terminal state before completing.
job_timeout_seconds – Optional[int]. Duration after which the trigger tracking the job times out.
poll_interval – Optional[int]. Interval to poll the job status.
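A minimal DAG sketch for this operator is shown below. The connection ID, entrypoint, image URI, and working directory are illustrative placeholders; adjust them to your Anyscale setup.

```python
from datetime import datetime

from airflow import DAG
from anyscale_provider.operators.anyscale import SubmitAnyscaleJob

with DAG(
    dag_id="anyscale_job_submit",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # All values below are illustrative placeholders.
    submit_job = SubmitAnyscaleJob(
        task_id="submit_job",
        conn_id="anyscale_default",
        name="my-job",
        entrypoint="python main.py",
        image_uri="my-registry/my-image:latest",
        working_dir="./ray_scripts",
        max_retries=1,
    )
```

Every field listed in template_fields below accepts Jinja templating, so any of these values can be rendered at runtime.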
- execute(context: airflow.utils.context.Context) → str
Execute the job submission to Anyscale.
This method submits the job to Anyscale and handles its initial status. It defers execution to a trigger if the job is still starting or running (see the deferral sketch under RolloutAnyscaleService.execute above).
- Parameters:
context – The Airflow context.
- Returns:
The job ID if the job is successfully submitted and completed, or None if the job is deferred.
- execute_complete(context: airflow.utils.context.Context, event: Any) → str
Complete the execution of the job based on the trigger event.
This method is called when the trigger fires and provides the final status of the job. It raises an exception if the job failed.
- Parameters:
context – The Airflow context.
event – The event data from the trigger.
- Returns:
None
- hook() → AnyscaleHook
Return an instance of the AnyscaleHook.
- on_kill() → None
Terminate the Anyscale job if the task is killed.
This method will be called when the task is killed, and it sends a termination request for the currently running job.
- template_fields = ('conn_id', 'entrypoint', 'name', 'image_uri', 'containerfile', 'compute_config', 'working_dir', 'excludes', 'requirements', 'env_vars', 'py_modules', 'cloud', 'project', 'max_retries', 'fetch_logs', 'wait_for_completion', 'job_timeout_seconds', 'poll_interval')