Course

End-to-End Multimodal AI Pipeline Overview

Learn how the end-to-end multimodal AI pipeline works, including how each component fits together and what the project provides. You’ll also get hands-on experience running and exploring the implementation on Anyscale or from the GitHub repository.

About this course

25 lessons

4 modules

1. Overview

Get an end-to-end overview of the multimodal AI pipeline and how its components fit together. You’ll learn what the project provides and how to run and explore the implementation on Anyscale or from the GitHub repository.

Multi-modal AI pipeline

2. Batch Inference

Learn how to use Ray Data to ingest a large image dataset from cloud storage, enrich each record with labels, and run scalable batch inference by generating CLIP embeddings with `map_batches`. You’ll see how to structure preprocessing and model execution for efficient, streaming, distributed batch pipelines.

Batch inference
Data ingestion
Batch embeddings
Ray Data
Data storage
Monitoring and Debugging
Production jobs
Similar images

+5 more lessons
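
The batch pattern this module describes (enrich each record, then run the model on groups of records) can be sketched in plain Python. This is a stand-in under stated assumptions, not Ray Data itself: `iter_batches`, `add_label`, `embed_batch`, the file paths, and the toy two-number "embedding" are all hypothetical and invented for illustration, and no CLIP model is loaded. In Ray Data, batching and distribution are handled by `map_batches`, which applies a batch-level function to dictionary-of-columns batches similar in shape to the ones below.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator


def iter_batches(records: Iterable[dict], batch_size: int) -> Iterator[Dict[str, list]]:
    """Lazily group row dicts into column-oriented batches."""
    it = iter(records)
    while True:
        rows = list(islice(it, batch_size))
        if not rows:
            return
        # Turn a list of rows into a dict of columns.
        yield {key: [row[key] for row in rows] for key in rows[0]}


def add_label(record: dict) -> dict:
    """Hypothetical enrichment step: use the parent folder name as the label."""
    return {**record, "label": record["path"].split("/")[-2]}


def embed_batch(batch: Dict[str, list]) -> Dict[str, list]:
    """Placeholder for model inference; a real pipeline would call CLIP here."""
    batch["embedding"] = [
        [float(len(path)), float(len(label))]
        for path, label in zip(batch["path"], batch["label"])
    ]
    return batch


# Hypothetical image paths standing in for a dataset in cloud storage.
paths = ["images/cat/001.jpg", "images/dog/002.jpg", "images/cat/003.jpg"]

# A generator keeps the pipeline streaming: rows are enriched only as batches are pulled.
records = (add_label({"path": p}) for p in paths)
results = [embed_batch(batch) for batch in iter_batches(records, batch_size=2)]
```

Keeping the per-row enrichment separate from the per-batch model call mirrors the split between preprocessing and model execution described above: the cheap step runs row by row, while the expensive step sees whole batches.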

3. Distributed Training

Learn how to scale image model training across multiple workers using Ray Train, including setting up the runtime, ingesting and preprocessing datasets with Ray Data, and converting classes to numeric labels. By the end, you’ll have a distributed-ready training pipeline with reusable preprocessing (including optional embedding computation) for efficient large-scale training.

Distributed training
Preprocess
Model
Batching
Model registry
Training
Ray Train
Production Job
Evaluation

+6 more lessons
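
Converting classes to numeric labels, as mentioned in this module, can be sketched as follows. This is a minimal plain-Python illustration, assuming each row carries a `class` field; the function names, the field name, and the sample class names are hypothetical and are not taken from the course code.

```python
from typing import Dict, List


def build_class_to_label(classes: List[str]) -> Dict[str, int]:
    """Give each distinct class name an integer id; sorting keeps the ids stable across runs."""
    return {name: idx for idx, name in enumerate(sorted(set(classes)))}


def encode(rows: List[dict], class_to_label: Dict[str, int]) -> List[dict]:
    """Return copies of the rows with a numeric `label` column added."""
    return [{**row, "label": class_to_label[row["class"]]} for row in rows]


# Hypothetical sample rows.
rows = [{"class": "pug"}, {"class": "beagle"}, {"class": "pug"}, {"class": "collie"}]

class_to_label = build_class_to_label([row["class"] for row in rows])
encoded = encode(rows, class_to_label)

# The inverse mapping turns predicted ids back into readable class names.
label_to_class = {label: name for name, label in class_to_label.items()}
```

Building the mapping once and passing it into the preprocessing step is one way to keep that step reusable: the same mapping can then be applied to training, validation, and serving inputs.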

4. Online Serving

Learn how to deploy a trained image classification model as a scalable online API using Ray Serve and FastAPI, including configuring GPU resources and replica scaling. You’ll also integrate MLflow to load the best model artifacts and send real-time prediction requests via an HTTP `/predict` endpoint.

Online serving
Deployments
Application
Ray Serve
Observability
Production services
CI/CD

+4 more lessons
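
A real-time prediction request to the `/predict` endpoint could be built along these lines. This is a sketch under assumptions: the base URL, the JSON body with a `url` field, and the `build_predict_request` helper are hypothetical, since the actual request schema is defined by the course's service code. The block only constructs the request, so it runs without a live service.

```python
import json
from urllib.request import Request


def build_predict_request(base_url: str, image_url: str) -> Request:
    """Build a POST request for the /predict endpoint (payload shape is an assumption)."""
    payload = json.dumps({"url": image_url}).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local address and image URL.
req = build_predict_request("http://localhost:8000/", "https://example.com/dog.jpg")

# To send it against a running service:
#   from urllib.request import urlopen
#   with urlopen(req) as resp:
#       print(json.load(resp))
```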