Examples¶
Runnable scripts in ramjetio/examples/ on GitHub.
simple.py¶
Minimal end-to-end example: ramjetio.init() + a tiny training loop. Best starting point.
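A minimal sketch of the pattern simple.py follows, assuming ramjetio.init() takes no required arguments and that CachedDataset (used by other examples on this page) wraps any map-style PyTorch dataset; the exact import path and signature here are assumptions:

import torch
from torch.utils.data import DataLoader, TensorDataset
import ramjetio
from ramjetio import CachedDataset  # import path is an assumption

ramjetio.init()  # join the peer ring before building datasets

# Wrap any map-style dataset; the first pass fills the cache.
base = TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
loader = DataLoader(CachedDataset(base), batch_size=32)

model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
for x, y in loader:
    opt.zero_grad()
    torch.nn.functional.mse_loss(model(x), y).backward()
    opt.step()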
torchrun_ddp.py¶
Multi-node distributed training with torchrun. RAMJET auto-detects ranks; no extra wiring needed.
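A typical launch; torchrun sets RANK / WORLD_SIZE / LOCAL_RANK, which RAMJET reads. The GPU counts, node count, and address below are placeholders:

torchrun --nproc_per_node=4 examples/torchrun_ddp.py
# multi-node, e.g. node 0 of 2:
torchrun --nnodes=2 --node_rank=0 --master_addr=10.0.0.1 --master_port=29500 \
    --nproc_per_node=4 examples/torchrun_ddp.py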
accelerate_example.py¶
HuggingFace Accelerate integration; drop-in compatible with existing Accelerate scripts.
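Launching follows the standard Accelerate flow; nothing RAMJET-specific is needed on the command line:

accelerate config   # one-time interactive setup
accelerate launch examples/accelerate_example.py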
deepspeed_example.py¶
DeepSpeed launcher with RAMJET. Useful for large-model training where memory is the constraint.
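A typical launch; the config file name is a placeholder, and whether the script accepts a --deepspeed_config flag depends on how it parses arguments:

deepspeed examples/deepspeed_example.py --deepspeed_config ds_config.json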
pytorch_imagenet.py¶
ImageFolder + ResNet18 with CachedDataset wrapping. DDP-aware (single-process if RANK is unset).
python examples/pytorch_imagenet.py --data-dir /mnt/imagenet --epochs 3
# or multi-GPU:
./examples/ddp_torchrun.sh examples/pytorch_imagenet.py --data-dir /mnt/imagenet
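The core wrapping pattern in the script looks roughly like this; the CachedDataset call is the only RAMJET-specific line, and its import path and signature are assumptions:

import ramjetio
from ramjetio import CachedDataset  # import path is an assumption
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

ramjetio.init()
tfm = transforms.Compose([transforms.Resize(256),
                          transforms.CenterCrop(224),
                          transforms.ToTensor()])
base = datasets.ImageFolder('/mnt/imagenet/train', transform=tfm)
# Later epochs (and peer nodes) read samples from the ring instead of disk.
loader = DataLoader(CachedDataset(base), batch_size=64, num_workers=8)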
huggingface_datasets.py¶
datasets.load_dataset(...) wrapped in CachedDataset so subsequent epochs / nodes hit the peer ring instead of the HF Hub.
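Sketched below; the dataset name is a placeholder, and CachedDataset accepting a datasets split directly is an assumption based on the description above:

import ramjetio
from ramjetio import CachedDataset  # import path is an assumption
from datasets import load_dataset

ramjetio.init()
ds = load_dataset('imdb', split='train')  # placeholder dataset name
# Subsequent epochs / nodes read from the peer ring, not the HF Hub.
cached = CachedDataset(ds)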
yolov8.py¶
Ultralytics YOLOv8 training fed by UniversalDataset(format='yolo'). This is the workload from our benchmark: 2× A5000, 5,315 samples × 5 epochs, 6.5× faster end-to-end.
python examples/yolov8.py --model yolov8n.pt --epochs 5
# multi-GPU:
./examples/ddp_torchrun.sh examples/yolov8.py --model yolov8n.pt --epochs 5
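Roughly how the dataset side is set up; only format='yolo' comes from the description above, so treat the import path, the positional dataset path, and the sample layout as assumptions and see yolov8.py for the real wiring into Ultralytics:

import ramjetio
from ramjetio import UniversalDataset  # import path is an assumption

ramjetio.init()
# format='yolo' is from the description above; the path argument is assumed.
dataset = UniversalDataset('datasets/coco128', format='yolo')
image, labels = dataset[0]  # sample layout is an assumption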
ddp_torchrun.sh¶
One-liner torchrun wrapper that auto-detects per-node GPU count and accepts NNODES / NODE_RANK / MASTER_ADDR overrides for multi-node runs.
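Multi-node usage, driven by the overrides the script accepts; the address and node count below are placeholders:

# node 0
NNODES=2 NODE_RANK=0 MASTER_ADDR=10.0.0.1 ./examples/ddp_torchrun.sh examples/simple.py
# node 1
NNODES=2 NODE_RANK=1 MASTER_ADDR=10.0.0.1 ./examples/ddp_torchrun.sh examples/simple.py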
Coming in v0.10¶
video_mp4_streaming.py: sharded video streaming with WAN-tier peering.