Task
Detailed breakdown of individual task performance across different models.
Task Name (11 tasks)               claude-4-6-sonnet  gemini-3.1-pro  glm-4.7  gpt-5.2-codex  gpt-5.2-codex-with-skills
document_analytics_pipeline        88.6s              66.8s           334.3s   120.7s         58.3s
etl_pipeline_orchestrator          173.5s             95.2s           199.7s   99.3s          56.9s
hyperparameter_sweep_orchestrator  112.1s             123.2s          251.5s   82.6s          59.2s
ml_serving_pipeline_checkpoint     149.9s             282.6s          351.5s   471.7s         204.6s
modal_doc_ocr_pipeline             106.3s             96.8s           171.6s   88.8s          91.5s
modal_event_pipeline               91.9s              144.1s          130.1s   109.9s         99.6s
modal_gpu_batch_inference          130.6s             161.5s          201.0s   45.6s          41.0s
modal_gpu_embedding_service        335.2s             600.1s          600.0s   52.4s          67.0s
modal_hp_sweep_checkpoint          129.0s             71.3s           318.9s   59.6s          91.4s
modal_nfs_etl_pipeline             139.5s             100.0s          91.8s    37.9s          47.0s
rate_limited_api_gateway           214.1s             128.8s          166.5s   107.4s         92.6s