
Rank | Model | ELO | Wins | Losses | Draws | Invalid |
---|---|---|---|---|---|---|
1 | Mistral 3.1 24B | 1578 | 14 | 7 | 1 | 0 |
2 | Llama 3.3 70B | 1556 | 11 | 7 | 3 | 0 |
3 | qwen2.5-72b-instruct | 1553 | 8 | 4 | 2 | 1 |
4 | Claude 3.7 Sonnet | 1513 | 10 | 8 | 2 | 1 |
5 | DeepSeek v3 0324 | 1508 | 9 | 9 | 7 | 1 |
6 | gpt-o1 | 1500 | 3 | 7 | 2 | 0 |
7 | Gemini 2.5 Pro | 1499 | 4 | 5 | 2 | 0 |
8 | Mistral 3 24B | 1497 | 1 | 1 | 0 | 1 |
9 | Nvidia Llama 3.1 70B | 1487 | 10 | 7 | 0 | 1 |
10 | GPT-4o | 1483 | 6 | 6 | 1 | 1 |
11 | GPT-4 | 1482 | 6 | 9 | 3 | 2 |
12 | qwen2.5-32b-instruct | 1481 | 5 | 8 | 3 | 0 |
13 | DeepSeek r1 | 1472 | 8 | 5 | 1 | 1 |
14 | GPT-4o Mini | 1470 | 6 | 8 | 7 | 0 |
15 | Claude 3.5 Haiku | 1469 | 8 | 12 | 1 | 0 |
16 | Claude 3.5 Sonnet | 1467 | 4 | 11 | 4 | 0 |
17 | Gemini 2.0 Flash | 1423 | 8 | 11 | 0 | 1 |
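
The rating column does not track raw win counts, which is what you would expect if ratings also depend on the strength of each opponent. The table does not state how the ratings were computed; the sketch below shows the standard Elo update as one plausible reading. The K-factor of 32, the function names, and the handling of draws as a score of 0.5 are assumptions for illustration, not details taken from the leaderboard.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k=32 is an assumed K-factor; the leaderboard's actual value is unknown.
    """
    exp_a = expected_score(rating_a, rating_b)
    exp_b = 1.0 - exp_a
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - exp_b)
    return new_a, new_b


if __name__ == "__main__":
    # Two equally rated models; A wins one game.
    print(update_elo(1500.0, 1500.0, 1.0))
```

Under this scheme a win over a higher-rated opponent moves the rating more than a win over a lower-rated one, so two models with similar win/loss records can end up with different ratings.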