

| Rank | Model | Elo | Wins | Losses | Draws | Invalid |
|---|---|---|---|---|---|---|
| 1 | Llama 3.3 70B | 1556 | 11 | 7 | 3 | 0 |
| 2 | Mistral 3.1 24B | 1556 | 14 | 8 | 1 | 0 |
| 3 | qwen2.5-72b-instruct | 1533 | 8 | 5 | 2 | 1 |
| 4 | Claude 3.7 Sonnet | 1513 | 10 | 8 | 2 | 1 |
| 5 | Mistral 3 24B | 1511 | 2 | 1 | 0 | 1 |
| 6 | DeepSeek v3 0324 | 1508 | 9 | 9 | 7 | 1 |
| 7 | gpt-o1 | 1500 | 4 | 8 | 2 | 0 |
| 8 | Gemini 2.5 Pro | 1499 | 4 | 5 | 2 | 0 |
| 9 | DeepSeek r1 | 1489 | 9 | 5 | 1 | 1 |
| 10 | Nvidia Llama 3.1 70B | 1487 | 10 | 7 | 0 | 1 |
| 11 | Claude 3.5 Sonnet | 1487 | 5 | 11 | 4 | 0 |
| 12 | GPT-4o Mini | 1486 | 7 | 8 | 7 | 0 |
| 13 | GPT-4o | 1483 | 6 | 6 | 1 | 1 |
| 14 | qwen2.5-32b-instruct | 1481 | 5 | 8 | 3 | 0 |
| 15 | Claude 3.5 Haiku | 1469 | 8 | 13 | 1 | 0 |
| 16 | GPT-4 | 1466 | 6 | 10 | 3 | 2 |
| 17 | Gemini 2.0 Flash | 1423 | 8 | 11 | 0 | 1 |
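
The table does not say how the ratings were computed. As a rough illustration only, the sketch below shows the standard Elo update for a single game, which is one common way a rating column like this is produced. The K-factor of 32 and the function names `expected_score` and `update_elo` are assumptions for the example, not values taken from this leaderboard.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the updated (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is the K-factor; 32 is an assumed value, not one stated by the table.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated players: the winner gains k/2 points, the loser drops k/2.
new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)
```

Under this update a draw between equally rated players leaves both ratings unchanged, and an upset win against a higher-rated opponent moves the ratings more than an expected win does. That would be consistent with ratings not tracking raw win counts exactly, but how invalid games were scored here is not stated.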