Performance of AI coding models on Modal tasks, ranked by success rate, with the number of tasks passed and the average duration per model.
| Model | Passed | Avg Duration | Success Rate |
|---|---|---|---|
| #1 gemini-3.1-pro | 10 | 221.8s | 91% |
| #2 gpt-5.2-codex-with-skills | 9 | 139.9s | 82% |
| #3 claude-4-6-sonnet | 9 | 207.9s | 82% |
| #4 glm-4.7 | 8 | 313.2s | 73% |
| #5 gpt-5.2-codex | 7 | 178.4s | 64% |
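
The table does not state how many tasks were run. The listed success rates are consistent with a suite of 11 tasks (for example, 10 / 11 ≈ 91%), but that total is inferred, not given in the source. A minimal sketch of the arithmetic under that assumption, where `TOTAL_TASKS` and `success_rate` are hypothetical names used only for illustration:

```python
# Passed counts copied from the table above.
passed = {
    "gemini-3.1-pro": 10,
    "gpt-5.2-codex-with-skills": 9,
    "claude-4-6-sonnet": 9,
    "glm-4.7": 8,
    "gpt-5.2-codex": 7,
}

# Assumption: 11 tasks in total. This is inferred from the percentages,
# not stated in the source.
TOTAL_TASKS = 11


def success_rate(passed_count: int, total: int = TOTAL_TASKS) -> int:
    """Return the pass percentage rounded to the nearest whole number."""
    return round(100 * passed_count / total)


# Recompute the Success Rate column from the Passed column.
rates = {model: success_rate(count) for model, count in passed.items()}

for model, rate in rates.items():
    print(f"{model}: {rate}%")
```

If the real suite has a different size, only `TOTAL_TASKS` needs to change. With 11 tasks, each additional passed task moves the success rate by about 9 percentage points, so the gaps between adjacent ranks correspond to one or two tasks.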