Tabular Data Workload Example (Ray Train)

1. Workload

In this Workload module, you’ll learn how to scale a tabular XGBoost pipeline for forest-cover classification from local training to a distributed Ray Train V2 job on an Anyscale cluster. You’ll ingest the UCI Cover Type dataset, persist train/validation Parquet splits to shared storage, and train and evaluate the model using Ray Datasets and distributed execution.
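As a rough preview of the pattern described above, here is a minimal sketch, not the course's actual code. It assumes the train/validation Parquet splits already exist under a directory you pass in, that the label column is named `label` and holds zero-based class ids (both hypothetical), and that Ray and XGBoost are installed. The helper names `xgb_params`, `train_loop_per_worker`, and `main` are made up for illustration, the hyperparameters are untuned, and the sketch converts each shard to pandas for brevity rather than using the Arrow-based loop the lessons cover. It also does not show how Ray Train V2 is enabled.

```python
NUM_CLASSES = 7  # the UCI Cover Type target has seven forest-cover classes


def xgb_params(num_classes: int = NUM_CLASSES) -> dict:
    """Multiclass XGBoost settings; the values are illustrative, not tuned."""
    return {
        "objective": "multi:softprob",
        "num_class": num_classes,
        "eval_metric": "mlogloss",
        "tree_method": "hist",
        "max_depth": 6,
        "eta": 0.3,
    }


def train_loop_per_worker(config: dict) -> None:
    # Imports sit inside the function so they are resolved on each worker.
    import ray.train
    import xgboost
    from ray.train.xgboost import RayTrainReportCallback

    label = config["label"]  # hypothetical label column name

    def to_dmatrix(split: str) -> "xgboost.DMatrix":
        # Each worker sees only its own shard of the named dataset.
        shard = ray.train.get_dataset_shard(split)
        df = shard.materialize().to_pandas()  # simple, but not memory-frugal
        return xgboost.DMatrix(df.drop(columns=[label]), label=df[label])

    dtrain, dval = to_dmatrix("train"), to_dmatrix("val")
    xgboost.train(
        config["params"],
        dtrain,
        num_boost_round=config["num_boost_round"],
        evals=[(dtrain, "train"), (dval, "val")],
        # Reports evaluation metrics and checkpoints back to Ray Train.
        callbacks=[RayTrainReportCallback()],
    )


def main(data_dir: str, num_workers: int = 2) -> None:
    import ray
    from ray.train import ScalingConfig
    from ray.train.xgboost import XGBoostTrainer

    # Assumed layout: <data_dir>/train and <data_dir>/val hold Parquet files.
    train_ds = ray.data.read_parquet(f"{data_dir}/train")
    val_ds = ray.data.read_parquet(f"{data_dir}/val")

    trainer = XGBoostTrainer(
        train_loop_per_worker,
        train_loop_config={
            "label": "label",
            "params": xgb_params(),
            "num_boost_round": 50,
        },
        scaling_config=ScalingConfig(num_workers=num_workers),
        datasets={"train": train_ds, "val": val_ds},
    )
    result = trainer.fit()
    print(result.metrics)
```

Calling `main` with the directory that holds your splits would launch the job on whatever Ray cluster the script is connected to. The worker count of 2 and the 50 boosting rounds are placeholders to adjust for your cluster and data.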

Tabular workload pattern with Ray Train
Imports
Define the Ray Train worker loop (Arrow-based, memory-efficient)
Confusion matrix visualization
Continue training from the latest checkpoint

+2 more lessons
