Task
Detailed breakdown of individual task performance across different models.
| Task Name (10 tasks) | gemini-3.1-pro | glm-4.7 | gpt-5.2-codex |
| --- | --- | --- | --- |
| agent_treasury_orchestrator | 90.5s | 148.2s | 149.6s |
| batch_nft_airdrop_from_list | 37.1s | 43.3s | 33.1s |
| create_crossmint_wallet | 194.3s | 144.7s | 91.6s |
| debug_invalid_api_key_scope | 86.4s | 84.9s | 55.6s |
| fix_broken_sdk_wallet_script | 228.1s | 119.2s | 123.1s |
| mint_nft_to_email | 56.1s | 36.2s | 35.2s |
| nft_receipt_checkout_flow | 64.0s | 54.6s | 46.5s |
| poll_mint_action_status | 36.2s | 47.4s | 46.5s |
| sdk_wallet_sign_message | 280.6s | 253.4s | 49.5s |
| wallet_send_usdc_transfer | 230.5s | 109.0s | 48.1s |
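
The per-task values above can be aggregated per model with a short script. This is a minimal sketch, not part of the leaderboard itself: it assumes the values are per-task durations in seconds (the page does not say what they measure), and the names `TIMINGS`, `total_seconds`, and `fastest_counts` are hypothetical. A shorter time says nothing about whether the task passed, since pass/fail status is not shown here.

```python
from collections import Counter

MODELS = ("gemini-3.1-pro", "glm-4.7", "gpt-5.2-codex")

# Values in seconds, copied from the listing above, in MODELS order.
# Assumption: each value is the duration of one task run.
TIMINGS = {
    "agent_treasury_orchestrator": (90.5, 148.2, 149.6),
    "batch_nft_airdrop_from_list": (37.1, 43.3, 33.1),
    "create_crossmint_wallet": (194.3, 144.7, 91.6),
    "debug_invalid_api_key_scope": (86.4, 84.9, 55.6),
    "fix_broken_sdk_wallet_script": (228.1, 119.2, 123.1),
    "mint_nft_to_email": (56.1, 36.2, 35.2),
    "nft_receipt_checkout_flow": (64.0, 54.6, 46.5),
    "poll_mint_action_status": (36.2, 47.4, 46.5),
    "sdk_wallet_sign_message": (280.6, 253.4, 49.5),
    "wallet_send_usdc_transfer": (230.5, 109.0, 48.1),
}


def total_seconds(timings=TIMINGS):
    """Sum the listed values per model, rounded to one decimal place."""
    totals = {model: 0.0 for model in MODELS}
    for row in timings.values():
        for model, secs in zip(MODELS, row):
            totals[model] += secs
    return {model: round(total, 1) for model, total in totals.items()}


def fastest_counts(timings=TIMINGS):
    """Count, per model, the tasks on which it has the lowest value.

    On an exact tie the model listed first in MODELS is counted.
    """
    wins = Counter()
    for row in timings.values():
        best = min(range(len(MODELS)), key=lambda i: row[i])
        wins[MODELS[best]] += 1
    return dict(wins)


if __name__ == "__main__":
    print(total_seconds())
    print(fastest_counts())
```

Keeping the raw rows in one dictionary means the same data can feed other summaries, such as per-task spread, without re-entering the numbers.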