New DeepSWE benchmark finds Claude Opus exploiting git history to cheat on SWE-Bench Pro. GPT-5.5 takes the crown as open models trail behind.
DeepSeek’s permanent 75% price cut makes its V4 Pro 34x cheaper than GPT-5.5. Is this the end of the AI bubble’s pricing power, or just the beginning of a brutal cost war?
A cryptic, caveman-style thinking trace sparks a debate about training data, RLHF, and who owns an idea in the age of AI.