An example policy-learning workload built with Ray Train.
Learn how to build and run an end-to-end diffusion-policy training workload for the `Pendulum-v1` control task using a real offline dataset. The workflow runs from data generation and preprocessing with Ray Data to distributed training on an Anyscale cluster with Ray Train V2. You’ll migrate a local PyTorch + Gymnasium workflow to a scalable, fault-tolerant Ray pipeline with minimal code changes.