Performance results of AI coding models on Autumn tasks, reporting tasks passed, average duration in seconds, and success rate.

| Model | Passed | Avg Duration | Success Rate |
|---|---|---|---|
| #1 glm-4.7 | 7 | 378.0s | 70% |
| #2 gemini-3.1-pro | 6 | 243.8s | 60% |
| #3 claude-4-6-sonnet | 4 | 166.6s | 40% |
| #4 gpt-5.2-codex | 2 | 140.9s | 20% |
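
The Success Rate column appears to be the pass count divided by the number of tasks attempted. Here is a minimal sketch of that arithmetic. It assumes each model attempted 10 tasks, a total the table does not state but which matches every row (7 passed gives 70%, 2 passed gives 20%). The names `TOTAL_TASKS`, `results`, and `success_rate` are made up for illustration.

```python
# Assumption: 10 tasks per model, inferred from the table (7 passed -> 70%).
TOTAL_TASKS = 10

# (model, tasks passed, average duration in seconds), copied from the table.
results = [
    ("glm-4.7", 7, 378.0),
    ("gemini-3.1-pro", 6, 243.8),
    ("claude-4-6-sonnet", 4, 166.6),
    ("gpt-5.2-codex", 2, 140.9),
]


def success_rate(passed: int, total: int = TOTAL_TASKS) -> float:
    """Return the share of tasks passed, as a percentage."""
    return 100.0 * passed / total


for model, passed, avg_seconds in results:
    print(f"{model}: {success_rate(passed):.0f}% success, {avg_seconds:.1f}s average")
```

The ranking follows success rate and not speed: the top-ranked model has the longest average duration, and the fastest model has the lowest pass count.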