List of available Google Cloud Pipeline Components (operators) for the pipeline DSL

https://cloud.google.com/vertex-ai/docs/pipelines/gcpc-list

Lightweight Python components

https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/

example notebook

https://github.com/kubeflow/pipelines/blob/master/samples/core/lightweight_component/lightweight_component.ipynb

https://medium.com/@gkkarobia/kubeflow-pipelines-part-1-lightweight-components-a4a3c8cb3f2d
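A minimal sketch of a lightweight component, assuming the KFP SDK v2 (`kfp` 2.x), where a plain Python function is wrapped with `dsl.component`. Older v1 releases used `kfp.components.create_component_from_func` instead. The function `add`, the helper `as_component`, and the base image are illustrative names, not taken from the linked notebook.

```python
def add(a: float, b: float) -> float:
    """Return the sum of two numbers.

    A lightweight component body must be self-contained: any imports it
    needs go inside the function, because only this source is shipped
    to the container.
    """
    return a + b


def as_component(func, base_image: str = "python:3.11"):
    """Wrap a plain function as a KFP component (assumes kfp v2 is installed)."""
    from kfp import dsl  # imported lazily so the plain function stays usable without kfp

    return dsl.component(base_image=base_image)(func)


# The function is still ordinary Python and can be checked locally
# before it is turned into a pipeline step.
local_result = add(2.0, 3.0)
```

Calling `as_component(add)` would return a component that can be invoked inside a `@dsl.pipeline` function as `add_op(a=..., b=...)`.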

See the full Google Cloud Pipeline Components reference for all provided operators (Dataflow, Dataproc, BigQuery, etc.):

https://cloud.google.com/vertex-ai/docs/pipelines/dataflow-component

how to define a CustomJob pipeline component

https://cloud.google.com/vertex-ai/docs/training/create-custom-job

exact worker specs can be submitted in a pipeline through the custom training job operators:

https://cloud.google.com/vertex-ai/docs/pipelines/customjob-component

they are passed as worker pool specs:

https://cloud.google.com/vertex-ai/docs/reference/rest/v1/CustomJobSpec#workerpoolspec
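As a rough illustration, a worker pool spec is a nested dictionary describing the machine, the replica count, and the container to run. The sketch below builds one in plain Python. Field names follow the WorkerPoolSpec reference in snake_case, as the Python client writes them (the REST API itself uses camelCase); the helper `make_worker_pool_spec`, the image URI, and the module path are hypothetical.

```python
from typing import Any, Dict, List, Optional


def make_worker_pool_spec(
    image_uri: str,
    machine_type: str = "n1-standard-4",
    replica_count: int = 1,
    command: Optional[List[str]] = None,
    args: Optional[List[str]] = None,
    accelerator_type: Optional[str] = None,
    accelerator_count: int = 0,
) -> Dict[str, Any]:
    """Build one worker pool spec dictionary for a custom training job."""
    machine_spec: Dict[str, Any] = {"machine_type": machine_type}
    # Accelerator fields are only added when an accelerator is requested.
    if accelerator_type:
        machine_spec["accelerator_type"] = accelerator_type
        machine_spec["accelerator_count"] = accelerator_count
    return {
        "machine_spec": machine_spec,
        "replica_count": replica_count,
        "container_spec": {
            "image_uri": image_uri,
            "command": list(command or []),
            "args": list(args or []),
        },
    }


# Hypothetical image and entry point; replace with a real training container.
worker_pool_specs = [
    make_worker_pool_spec(
        image_uri="gcr.io/my-project/trainer:latest",
        command=["python", "-m", "trainer.task"],
        args=["--epochs", "3"],
    )
]
```

The resulting list is what would be passed as the `worker_pool_specs` argument of the custom training job component (for example `CustomTrainingJobOp` in recent `google-cloud-pipeline-components` releases, which is an assumption about the installed version).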

but to just configure memory/cpu/machine selector constraints for any kubeflow component, one can call set_cpu_limit/set_memory_limit/add_node_selector_constraint on the component's task

https://cloud.google.com/vertex-ai/docs/pipelines/machine-types
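A small sketch of how those setters might be applied, assuming the KFP SDK v2. Only `set_cpu_limit` and `set_memory_limit` are used here, since both take string values; the arguments of `add_node_selector_constraint` and the accelerator setters differ between SDK versions, so check the page above for the installed one. The names `apply_limits`, `build_pipeline`, `train`, and the pipeline name are illustrative.

```python
def apply_limits(task, cpu: str = "4", memory: str = "16G"):
    """Apply CPU and memory limits to a pipeline task and return it."""
    task.set_cpu_limit(cpu)        # number of vCPUs, as a string
    task.set_memory_limit(memory)  # e.g. "16G"
    return task


def build_pipeline():
    """Define a one-step pipeline with resource limits (requires kfp v2)."""
    from kfp import dsl  # imported lazily; only needed when building the pipeline

    @dsl.component
    def train(epochs: int) -> str:
        return f"trained for {epochs} epochs"

    @dsl.pipeline(name="resource-limits-demo")
    def pipeline(epochs: int = 3):
        # The limits are set on the task returned by calling the component.
        apply_limits(train(epochs=epochs), cpu="4", memory="16G")

    return pipeline
```

On Vertex AI Pipelines the requested limits are used to pick a machine type for the step, so this is the lighter-weight alternative to writing full worker pool specs.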