Task
Detailed breakdown of individual task performance across different models.
Task Name (30 tasks)            claude-4-6-sonnet  gemini-3-flash  glm-4.7   gpt-5.2-codex
flow_approval_step              70.3s              28.6s           125.5s    44.0s
flow_conditional_branch         75.0s              64.7s           91.5s     42.7s
flow_error_handler              35.2s              52.0s           89.6s     49.3s
flow_parallel_branches          59.8s              65.0s           101.4s    63.1s
flow_sequential_steps           37.2s              38.7s           193.3s    43.1s
full_script_workspace           95.6s              57.4s           100.8s    33.7s
python_basic_script             21.8s              22.8s           82.2s     40.1s
python_batch_processor          39.7s              42.8s           103.5s    37.6s
python_class_script             35.8s              38.7s           87.0s     32.2s
python_data_validation          45.4s              52.5s           144.1s    46.4s
python_error_handling           41.6s              38.5s           97.5s     29.4s
python_json_processor           47.3s              49.1s           94.7s     28.7s
python_list_transform           25.0s              32.9s           68.4s     31.9s
python_math_utils               32.0s              34.4s           83.9s     38.6s
python_nested_dict              48.1s              48.0s           195.9s    32.0s
python_optional_params          19.7s              30.8s           108.3s    38.1s
python_return_dict              19.0s              3259.1s         47.0s     34.9s
python_webhook_handler          44.3s              54.9s           142.5s    25.5s
typescript_array_filter         44.2s              41.8s           79.9s     42.3s
typescript_async_transform      24.0s              39.4s           74.1s     59.9s
typescript_basic_script         21.6s              34.4s           69.2s     37.7s
typescript_data_pipeline        40.4s              46.9s           76.5s     54.3s
typescript_date_utils           27.4s              50.6s           48.3s     23.4s
typescript_environment_config   31.8s              37.4s           91.2s     35.4s
typescript_error_handling       25.3s              23.0s           49.8s     39.2s
typescript_multi_helper         25.7s              37.4s           78.2s     44.2s
typescript_object_param         29.2s              123.8s          45.1s     29.9s
typescript_return_object        19.2s              32.4s           64.0s     33.6s
typescript_string_builder       27.7s              35.0s           50.2s     48.9s
typescript_with_defaults        24.5s              27.7s           59.2s     41.5s