Source: Sputnik News
M3 processes up to 1 million tokens at once - 5x more than its predecessor.
KEY POINTS
M3 achieved a 59% score on SWE-Bench Pro, surpassing GPT-5.5 and Gemini 3.1 Pro.
M3's Sparse Attention architecture reduces compute needs by up to 95% and cuts costs by over 90%.
M3 autonomously raised NVIDIA Hopper chip utilization from 7.6% to 71.3% in one benchmark.
M3 processes up to 1 million tokens at once - 5x more than its predecessor, enabling it to handle massive codebases.
The model scored 59% on SWE-Bench Pro, outperforming OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro in real-world software engineering tests.
Its new Sparse Attention architecture cuts computing requirements to as little as 1/20th of previous levels, reducing costs by over 90% while enhancing speed.
In one benchmark, M3 autonomously optimized software for NVIDIA Hopper chips, boosting hardware utilization from 7.6% to 71.3%.
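The figures above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are made up for this example, and it assumes "5x more than its predecessor" means five times as many tokens, which the article does not spell out.

```python
# Figures quoted in the article; variable names are illustrative, not from any API.
context_tokens = 1_000_000            # M3 context window, in tokens
context_multiple = 5                  # assumed reading of "5x more than its predecessor"
compute_fraction = 1 / 20             # compute needed relative to previous levels
util_before, util_after = 7.6, 71.3   # NVIDIA Hopper utilization, in percent

# Implied predecessor context window under the assumption above.
predecessor_tokens = context_tokens // context_multiple

# Needing 1/20th of the compute is the same as a 95% reduction.
compute_reduction_pct = (1 - compute_fraction) * 100

# Relative improvement in hardware utilization.
utilization_gain = util_after / util_before

print(f"Implied predecessor context: {predecessor_tokens:,} tokens")
print(f"Compute reduction: {compute_reduction_pct:.0f}%")
print(f"Utilization gain: {utilization_gain:.1f}x")
```

Under these assumptions, the "1/20th" and "up to 95%" figures describe the same reduction, and the jump from 7.6% to 71.3% is roughly a ninefold improvement in utilization.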