ricardobeat 13 hours ago [-]
It’s interesting how little press Minimax M3 gets, given it outperforms Deepseek V4 Pro, previously the SOTA for open models. Meanwhile GLM has been in the news daily.
Reubend 7 hours ago [-]
It is strange, huh? But the hype cycles around these models often ignore good contenders. Xiaomi's MiMo-V2.5 Pro was doing really well and didn't get much hype either.
besterman23 18 hours ago [-]
I wonder if multiple attempts at the opossum would produce better results.
If we didn’t have the previous example I would interpret this as pretty solid evidence that labs were training on the Pelican “benchmark”.
I just can’t imagine a model dropping so significantly from one version to the next on such a silly task.
ChrisArchitect 14 hours ago [-]
Related:
GLM-5.2 is the new leading open weights model on Artificial Analysis
https://news.ycombinator.com/item?id=48567759