Live Benchmarks

Crossmint Benchmark

Performance of AI coding models on Crossmint tasks, measured by success rate and average execution time.

Total tasks: 10

Last run: 4/4/2026

Model Performance

Rank	Model	Passed	Avg Duration	Success Rate
1	glm-4.7 (new)	7	164.1s	70%
2	gemini-3.1-pro	7	166.0s	70%
3	gpt-5.2-codex	6	140.3s	60%
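
The figures in the table can be reproduced with a short sketch. It assumes that success rate is tasks passed divided by the 10 total tasks, and that ties on passed tasks are broken by the shorter average duration. The page does not state either rule; both are inferred from the rows shown. The `Result` class and its field names are hypothetical, and the first model's name is taken to be glm-4.7 with its "new" label dropped.

```python
from dataclasses import dataclass

TOTAL_TASKS = 10  # from "Total tasks: 10"


@dataclass
class Result:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    model: str
    passed: int
    avg_duration_s: float

    @property
    def success_rate(self) -> float:
        # Assumption: success rate = tasks passed / total tasks
        return self.passed / TOTAL_TASKS


# Values copied from the table, deliberately listed out of order
results = [
    Result("gpt-5.2-codex", 6, 140.3),
    Result("gemini-3.1-pro", 7, 166.0),
    Result("glm-4.7", 7, 164.1),
]

# Assumed ordering: more tasks passed first, then shorter average duration
ranked = sorted(results, key=lambda r: (-r.passed, r.avg_duration_s))

for rank, r in enumerate(ranked, start=1):
    print(f"{rank}\t{r.model}\t{r.passed}\t{r.avg_duration_s:.1f}s\t{r.success_rate:.0%}")
```

Under these assumptions the sort reproduces the published order: glm-4.7 and gemini-3.1-pro tie at 7 passed, and glm-4.7 ranks first on its lower average duration. With only 10 tasks, each task moves the success rate by 10 percentage points.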