Cheaper & Smarter: Claude Opus 4.5 is the new BEST model | DailyDevLists

Full Transcript

We are now at this point of the model

release cycle. That's right. Anthropic

just released Opus 4.5. And I've jumped

straight to what you want to hear. Yes,

it is the best on the coding benchmarks.

But even better, the price is now three

times cheaper. It's only going to be $5

for a million input tokens now and $25

for a million output. That is going to

be the big headline of this release.
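For a sense of scale, the quoted prices ($5 per million input tokens, $25 per million output tokens) can be turned into a per-request estimate. This is a minimal sketch, not an official calculator: the function name and the example token counts are made up for illustration, and it ignores anything the video does not mention, such as caching or batch discounts.

```python
# Prices quoted in the video, in US dollars per one million tokens.
INPUT_USD_PER_MILLION = 5.0
OUTPUT_USD_PER_MILLION = 25.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted Opus 4.5 prices."""
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


# Hypothetical request: 200,000 input tokens and 50,000 output tokens.
print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # prints $2.25
```

Output tokens cost five times as much as input tokens at these prices, so the output side tends to dominate the bill for generation-heavy work.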

Opus now might actually become the model

that you're going to use most of the

time instead of saving it for just the

important tasks. The first thing that I

actually tried out on Opus 4.5 here is

the same test that I gave to Gemini 3

Pro when it released, and that's

generating a Minecraft clone from a

single prompt. You can see what Opus 4.5

has come up with here. And I've got to

say, using this, this is the best result

that I've ever gotten back from a model

on this single prompt test. You can see

we're able to move around. Everything is

pretty smooth. The FPS is really nice.

We can break blocks here, and we can

place them and even select our blocks

down in the block selector and fly

around the map if we want to as well.

This feels fully playable based on the

prompt that I gave it. I even managed to

write out subscribe, something you

should definitely do. And I just noticed

it has a day and night cycle as well.

For comparison, this is what Gemini 3

Pro gave me last week when I was testing

it out. You can see we do have a world

that is procedurally generating. I would

say it does look okay for a single

prompt experiment, but we aren't able to

break blocks. You can see the movement

is a little bit chaotic, and we can't

place them either. So, to me, Opus 4.5

is an absolute massive winner on this

single prompt Minecraft test that I like

to run on these models. And overall,

I've just gotten really good results

with it. You can see I asked it in a

single prompt here to build me a Lego

builder website that utilizes Three.js to

allow the user to build various Lego

pieces. And the result I got back was a

completely working Lego builder. You can

see we can pan around here. We can stack

pieces on top of each other. We can

change the color. We can switch this

into remove mode. And we can even choose

different Lego pieces. I'm actually just

amazed that we're at the point where

models can generate this so easily.

Obviously, those prompts were pretty

simple ones, but you'll see in the

benchmarks later, this model can

definitely code. But first, what I think

is more interesting about Opus 4.5 is it

uses dramatically fewer tokens than its

predecessors to reach similar or better

outcomes. So when you combine this with

the new effort parameter that the model

has, you can actually choose between

minimizing your time and spend or

maximizing its capability. In that

testing, they actually found that Opus

4.5 with a medium effort matches Sonnet

4.5's best SWE-bench Verified score, but with

76% fewer output tokens. And even when

you have it on its highest effort level,

it actually beats Sonnet 4.5 and uses 48%

fewer tokens doing so. When you combine

that with the fact that it's now three

times cheaper, I think this model is

going to become a daily driver for a lot

of people. If we take a look at the

benchmark results now, you can see it's

the best at software engineering with

GPT-5.1-Codex-Max coming in second. And

something quite funny is they actually

tested Opus against their own take-home

exam that they give to prospective

employees. And Opus 4.5 scored higher

than any human candidate has ever scored

on this test. Is it over for us? Well,

if we take a look at the ARC-AGI

benchmark, it actually scores second

best on this and it only loses out

slightly to the Deep Think Mode of

Gemini 3 Pro, but you can see it is a

massive leap since Opus 4.1. Taking a

look at the rest of the benchmark suite,

it wins on most of these. It only loses

out on graduate reasoning, visual

reasoning, and multilingual Q&A. But as

you can see, Anthropic really does focus

on that coding part. So, if you're not

using it for coding, it's still going to

be incredibly good, just maybe 1% less

than others. Another funny benchmark

that I actually noticed it was losing on

is the vending machine benchmark. Gemini

3 Pro can apparently still make you just

a little bit more money when it runs a

vending machine. Overall though, I think

the lower price of this model is going

to be incredibly compelling, especially

for organizations, and it just shows

that Opus is maintaining that reputation

of being incredible at coding. Let me

know what you think in the comments down

below. Are you going to switch over to

this model now that it's a little bit

cheaper? While you're there, subscribe.

And as always, see you in the next one.
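The token-efficiency figures from the transcript (76% fewer output tokens at medium effort, 48% fewer at the highest effort, both relative to Sonnet 4.5) can be combined with the $25 per million output price in a rough sketch. The baseline of one million output tokens is an invented number for illustration and the helper names are hypothetical; real savings will depend on the workload.

```python
OUTPUT_USD_PER_MILLION = 25.0  # Opus 4.5 output price quoted in the video

# Percent fewer output tokens than Sonnet 4.5, as quoted in the video.
PERCENT_FEWER_TOKENS = {"medium": 76, "highest": 48}


def tokens_after_reduction(baseline_tokens: int, percent_fewer: int) -> int:
    """Output tokens left after cutting the baseline by percent_fewer."""
    return baseline_tokens * (100 - percent_fewer) // 100


def output_cost_usd(output_tokens: int) -> float:
    """Cost of the output tokens at the quoted Opus 4.5 output price."""
    return output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000


baseline = 1_000_000  # hypothetical Sonnet 4.5 output-token count
for effort, percent in PERCENT_FEWER_TOKENS.items():
    tokens = tokens_after_reduction(baseline, percent)
    print(f"{effort}: {tokens:,} output tokens, about ${output_cost_usd(tokens):.2f}")
# medium: 240,000 output tokens, about $6.00
# highest: 520,000 output tokens, about $13.00
```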

Better Stack

52 days ago

3:51

Claude & Anthropic Ecosystem

Rank #3

Description

Anthropic just dropped Opus 4.5. It’s cheaper, smarter, and an absolute beast at coding. Here’s everything new, the benchmarks, and whether it’s finally time to switch. Let me know what you think!

🔗 Relevant Links
Opus 4.5: https://www.anthropic.com/news/claude-opus-4-5

❤️ More about us
Radically better observability stack: https://betterstack.com/
Written tutorials: https://betterstack.com/community/
Example projects: https://github.com/BetterStackHQ

📱 Socials
Twitter: https://twitter.com/betterstackhq
Instagram: https://www.instagram.com/betterstackhq/
TikTok: https://www.tiktok.com/@betterstack
LinkedIn: https://www.linkedin.com/company/betterstack

📌 Chapters:
0:00 Intro
0:12 Pricing
0:25 Coding
1:49 Token Efficiency Improvements
2:28 Benchmarks
3:23 Thoughts

Video Details

Category: Claude & Anthropic Ecosystem

Featured Date: November 25, 2025

Quality Rank: #3
