We're going to look at the OpenAI Codex releases and see if they're a worthy replacement for Cursor and Claude Code. And not just that, we're going to build a micro-SaaS idea, an outfit try-on widget, using Google's just-released Nano Banana, which is super powerful at image editing.
So, OpenAI are tackling all fronts here and giving us a lot of options. We now have an IDE plugin, we have the Codex CLI, and we have Codex Cloud. So this is a really useful ecosystem and something similar to what Cursor have as well. What OpenAI are promoting here is a kind of unified environment. You might work in the IDE when you're working on human-in-the-loop code. You might involve the CLI if you're wanting to work agentically and have it run on a server, and you can have all of them working together or offload one to the other. This is very similar to what Cursor offers with its IDE, its CLI, and its background agents. So, not unlike Cursor or GitHub Copilot, you can now add Codex to your pull requests in GitHub. And we have some updates to the Codex CLI. So, we're going to take a look at that and see what's changed as well and how it compares in relation to Claude Code. Just
a quick one, guys: if you're looking to become an AI-native developer with the ability to build apps without writing any code, you should check out the Build with AI Switch Dimension course. There are a ton of modules in there, and I just introduced a whole new section on Claude Code to help you get up to speed really quickly. So first up, let's look at the Codex extension. I've got Cursor open here, but this will work in VS Code and any fork of VS Code like Windsurf, etc. So just go up to your extensions here, type in Codex, and then you just click on this to install it, and then it's going to appear here on
your left-hand side. So once it's open there, you can just drag it over to the right; this is usually where I keep my agent. So, click on this button to start a new chat. If we click on this, we get a task history. So, this extension has a task list just like you'd have in Claude Code or Cursor, and that's also been added to the Codex CLI as well. Down here, we can add context of any files we want to add in. And we can now add images as well; you can also add images to the Codex CLI now. And we have this button here for auto context management, so it will decide which recent files it should pull in. So we have three
different modes. We have chat mode. So this is kind of like a planning mode where you just want to talk to the agent about your code, but it doesn't necessarily implement anything, and it requires approval to run any kind of command or make an edit. Then we have agent mode. It's able to jump ahead, think agentically, read your files, make edits, and run commands in the workspace. And then we have full access mode, so it can actually go and search the internet and edit files outside the workspace. It seems to be like the dangerously-skip-permissions mode in Claude Code or the YOLO mode in Cursor. So
this is where it gets a little bit confusing. You can set different levels of reasoning effort: minimal, low, medium, and high. Originally, I thought these reasoning efforts mapped to particular models, like the mini or nano variants, but it doesn't seem to be as simple as that. It's stated in the documentation that it's the amount of reasoning effort that is applied. So, it's hard to get an idea of how these are priced when you're working with them, and there is a little bit of work to be done on OpenAI's side to make clear the pricing and the level of allowance that you get for your $20 or greater per month. It seems to be
they're going to run this very similarly to how Claude Code runs it: you get a certain allowance over a certain amount of time, and then it resets. So, also like Cursor, you have a local mode and then a cloud mode. In Cursor it would be basically just the local IDE agent or the Cursor CLI, and then if you're sending to the cloud it would be the background agents. In this case, Codex web is your background agent to run things in the cloud. So let's see
how it goes. So I'm going to just say: set up a Next.js project with shadcn styling and Lucide icons in this folder. So straight away, I think the UI is nice here. Nice and clean. We've got this working indicator going on here explaining exactly what's going on in the background, and we can expand each one of the thinking steps. I actually really like this about GPT-5. It explains how it's thinking
through different problems. So if you
actually read through this stuff as it's
being generated, you can actually catch
the model making a mistake before it
does. And it also helps you understand
how it arrives at different decisions.
So we can set the permissions now by
saying run this time or run every time.
So I'm going to let it run every time.
We can see it's running through all the
different tasks here. And if I look at
my task history, I can see that we're
currently on this task at the moment.
So, it's finished working and it
successfully completed those steps in a
reasonable amount of time.
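For reference, if you'd rather run that setup yourself, the equivalent commands look roughly like this. This is a sketch based on the current Next.js, shadcn, and Lucide docs; the project name is arbitrary and the flags may drift over time:

```bash
# Scaffold a Next.js app (answer the prompts, or pass flags like these)
npx create-next-app@latest my-store --typescript --eslint --app

cd my-store

# Initialize shadcn styling and add lucide-react for the icons
npx shadcn@latest init
npm install lucide-react

# Start the dev server
npm run dev
```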
Pragmatically, normally what I do is I
just run those commands myself. I can do
it in a tenth of the time, but I just
wanted to get a feel for how this
interface looks. And it is quite nice
here. And you can go and click on
individual pages to take you through
those. And then we can see the files
that are changed here. And this is
really clean and really nice; I really like the UI here. And we have options for looking at what's changed. Okay, cool.
Okay, so I'm just going to spin up a server here just to see if all that works. So, I'm going to run npm run dev in the terminal here. Perfect. So, a simple task, but it worked fine. So, now we have a basic project set up locally. We can also set that up in the cloud via Codex web. So, let's just connect it
here. So, I've selected my GitHub
organization here and I've put in the
repository. I'm just going to click
that. I'm going to switch on code
reviews. I'm going to allow internet
access just while I'm setting up the
environment. Now, bear in mind that you are open to prompt injection when you have internet access switched on, so just be wary of this. And I'm just going to hit
create environment. So now essentially I
have two different places where I can run my commands and run my agents. This is very similar to background agents in Cursor. So for example, I can just click on the dictate icon here: let's add a hero section to our main page in the Next.js app. If I was working on a more established app, I would pick a different branch, but we're just going to work on main here. And then you can actually choose the number of versions that you'd like to have developed. So, we're going to go with two here and check it out. So, you have two different options here: you can hit Ask if this is just a question, or you can actually just go ahead and hit Code. And when I
click on a task, I can see the two
different versions and the different
environments that are spinning up. So,
we can go back and check on that in a
second. Now, if I wanted to interact
with either one of these versions, I can
actually click on this button here and
it will open that for me in Cursor. This is again very similar to how Cursor works with background agents and the IDE. I feel like OpenAI are taking a lot of cues from the Cursor workflow here.
So once the work is complete, we get
this kind of view. Essentially, we can
see what has been added. We've got the
two different versions and a summary of
the work and I can see what was applied.
So, I can imagine if you're wanting to
check out multiple different design
approaches for a web page, you could
have five different versions and choose
the best one. Or if you're looking to
solve a particularly complex problem,
you could have multiple different
versions applied. You're using a lot of
tokens here when you're doing this. But
I think this is a pretty cool
implementation. So, let's say I'm happy
with version one. I can go up and either
create a git apply, copy a git patch, or
just go and create a PR. And if I go and
hit view PR,
you can see that a pull request has been
created and I could go and accept that
then if I was happy. Okay. So now let's
Okay, so now let's check out the updates to the Codex CLI. The Codex CLI is again very similar to the Cursor CLI and, of course, Claude Code. Not quite as advanced, but it seems to be getting there. So just copy this command, npm install -g @openai/codex, and then just go back over to Cursor again. We can open up a new terminal here, and then I can just paste in that command and hit return. And then to run it, we just type in codex. So before we jump into using the
CLI, you might be asking: what's the deal with all these CLI tools? We have a tool like Windsurf or Cursor or now this Codex extension. It's very visual, it's in the IDE; why do we need all these different CLIs? And essentially,
here is my take: what CLIs do is offer a more agentic take on development. You can basically run them in any IDE, in any terminal; you can run them in Docker; you can run them on servers. You can actually set them up via an SDK so that you can run different commands and have them work across multiple different instances and agents. So they're just a whole lot more flexible. Also, some people just like working in a CLI. They tend to be a lot faster and have lots of advantages in terms of extensibility. Now, in terms of UI
and user interface, they're not quite as friendly. So, if you're starting out in development, or you're looking to prototype or build without writing code and are a little bit overwhelmed by the terminal, I would recommend using something like Cursor, or using Codex or Roo or Cline built into an IDE, as a starting point. If you've got a different take on CLIs versus IDEs, I'd love to see that in the comments. Okay, so in the Codex CLI, the first thing you want
to do is hit /init. What that's going to do is create an AGENTS.md file. That's very similar to the CLAUDE.md or GEMINI.md. So AGENTS.md is a new standardization for an agents file, similar to how we would use a README. It acts as context for your codebase and any particular rules and approaches that you take. And this standardization of the file name AGENTS.md means that we can now potentially move between multiple different types of tools. You can see there are multiple different companies that have signed on, and I love to see this level of standardization. What I like about GPT-5
is that you do get a train of thought
being shared. And I've actually found
this really useful in development. And
the UI here is actually pretty nice with
some nice icons added in. If we type
slash here, we can see all the different
commands that we have available. So if I
hit model here, I can choose just like I
can in the extension between minimal,
low, medium, and high. We can set our approval mode, start a new conversation, or compact our context. Now, adding MCPs is a little bit different from how you do it in other CLIs with an mcp.json
file. So, I'm just going to show you
that quickly because I found it hard to
find documentation on this myself. Maybe
it'll help you. So, you'll want to find
your settings directory. So, let's change to your home directory if you're on a Mac: just cd followed by a tilde (~) and it'll bring you straight back there. Then we want to change into the .codex directory. And then if we just do ls to see what's in there, you'll see that we have a config.toml.
So, then you can just open this config.toml by typing nano config.toml, or just go and find that file in Cursor and open it yourself. And you'll see you've got all your config details here.
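For reference, an MCP server entry in that config.toml looks roughly like this. The table name and keys follow the Codex CLI config format as I understand it, and the Bright Data command, args, and env values are placeholders, so check your server's own setup instructions:

```toml
# ~/.codex/config.toml
[mcp_servers.brightdata]
command = "npx"
args = ["-y", "@brightdata/mcp"]
env = { API_TOKEN = "your-bright-data-token" }
```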
And adding an MCP server is as simple as what I have here: you add an entry for the server, in this case Bright Data, and then you've got your command, your arguments, your environment, etc.
Just add them in like that, one on top
of another, and then they should be
listed. Now, you will have to restart your CLI in order for them to show up, but then you should be good to go. So, I
used Codex CLI to build out a project
yesterday, and I have to say it was a
good experience. But that being said, it
is still early and if I was to compare
it side by side with Claude Code, Claude Code is still miles ahead. It has subagents, it has hooks, and many more subtle commands that are really quite useful as you work in a CLI. But that being said, don't forget how big of a company OpenAI is and how fast they move. It'll be very interesting to see what they bring to their CLI over the next couple of weeks. When you consider the pricing of Claude versus GPT-5 and the fact that their models are reaching parity, I think OpenAI are really going to give Anthropic a run
for their money. So, the first thing I
want to do is create a basic e-commerce
page as the foundation for this
prototype we're developing. So, I have
my prompt here if you want to take a
look, pause it, and then I'm going to
hit run here. But, I'm going to switch
up to high reasoning. So, this is the
output we got from GBT5, and it was
clever enough to go and pull in some
Unsplash images just to populate them.
Now, some of them didn't work out, but
actually this works out pretty well.
This is just a linking issue, I think.
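On that linking issue: if the generated page happens to use next/image for the Unsplash photos, one common cause is that the remote host isn't allow-listed in the Next.js config. A hypothetical fix, assuming the images come from images.unsplash.com (if the generated code uses plain img tags, this doesn't apply):

```ts
// next.config.ts — allow next/image to load remote Unsplash URLs
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  images: {
    remotePatterns: [
      { protocol: "https", hostname: "images.unsplash.com" },
    ],
  },
};

export default nextConfig;
```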
So, good start. So, out of interest, I set up the same project with the same configuration and ran exactly the same prompt in Claude Code, just to see what the output would be as a comparison. In terms of output and design, it's very similar, very comparable. If we take a look at what we got from ChatGPT: women's collection filtering, and a quick view appears over each image, with this little heart as well. And then if you look at what we got from Claude, we have the same type of filtering, and a quick view appears as well. But I suppose GPT went that one extra step to bring in the Unsplash images to populate the design. So I kind of think it's a bit of a draw
here in terms of design. Okay. So next up, I'm going to add in a slide drawer, because this is where we want to put our try-on tool: the ability to try on whatever outfit is in there. I'm going to drop that down to medium, I think, and just run that. So this is what we got from ChatGPT. It gave us the same kind of to-do list and the output here, and it was actually really quite quick. So let's take a look at the changes. If I jump over here and click try on now, we get this nice overlaid slide-in from the left. And looking at the Claude version, we have an overlay here for try-on, but then it completely blacks out the background here. I think I prefer the GPT-5 version. And yeah, this looks okay. It's not too bad. And let's close out of that. Then I'm left stuck here, and I don't know what to do. So, a little bit of a
fail here from Claude. Okay, so I put in
my prompt here. Let's remove everything
from the slide. You can pause here if
you want to take a look at what's there.
And actually, let's just crank this up to high reasoning effort. We are using Claude Opus in Claude Code, so it needs to be a fair comparison. Okay, so both
agents have finished. Let's take a look
at the outputs. So, if we look at the ChatGPT version, I click try on, we try and click upload here, and then I click try on outfit. Perfect. Or with change photo, I can upload a different one. Okay, perfect. That makes sense. So, let's get rid of that one now. And let's look at Claude's version. So, if I click try on here: okay, we can drag and drop in our image, click open, and then try on outfit. So, yeah, pretty comparable. Maybe I like this one a little bit more. Okay, so the
other really big release this week has
been the release of Nano Banana, aka Gemini 2.5 Flash Image. This model was being tested under the name Nano Banana for a week or so before it was actually released and announced that it was coming from Google. So, it's Gemini 2.5 Flash Image, and it's actually available via API so that we can use it, and it's really great at product placement, image swapping, all that kind of stuff. And you can see it's ranking really high on LMArena compared to other image models. You can get access via Google AI Studio to give it a whirl, or via the Gemini app, or through the
API. So we just need to go and get
ourselves an API key if we want to work
with it in any kind of an app that we
create. So go to get an API key. It
gives you some quick start guides here.
Go create an API key that you're going
to copy. And we need to copy this into our environment variables so that our app can use it. So, over in our project, we're just going to create a new file and we're going to call it .env.local. And in there, we're going to paste in our API key.
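So .env.local ends up as a single line, something like this. The variable name is whatever you tell the agent to read; GOOGLE_GENAI_API_KEY is just the name I'm assuming here:

```bash
# .env.local — keep this out of version control
GOOGLE_GENAI_API_KEY=your-api-key-here
```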
So, the other thing I'm going to do is copy the API documentation directly from Gemini. I could use something like Context7, but I just find this a whole lot more reliable; it's up to date and it's exactly what I want. A quick Google search will give me this, and then I'll add that as part of my prompt; I just paste that in there. So, if you can see my prompt now, I'm saying I want to use the Gemini 2.5 Flash Image preview model to take an outfit and use the prompt to place the outfit on the person. I've added an API key as a Google Gen AI API key; you see I've added that in here to the env file. The reason I'm telling it what the key name is here is because the model can't see into the env file for security. And then I'm just giving it a link to the docs as well. And in this
case, I'm just going to switch it over
to agent full access so it can actually
browse the web in this case. So, okay,
we're in a good spot. And oh, I must actually save where I am. And I'm going to do exactly the same thing with Opus.
Okay, so it looks like Claude finished
first. So, let's go and check that out.
So, with Claude, it doesn't do anything at all, and I don't get any kind of output from the terminal to say what's wrong. And the same when we look at Codex: we get a 'Gemini request failed' error.
So I'm going to have to add a little bit
of logging to understand what's going
wrong in both cases. When you're getting these vague server errors like a 500 or a 502 and there's no further
information, what I would normally do is
just go and ask the model to add some
logging in the terminal so we can see a
little bit more about what's going wrong
and that will help me and help the model
figure out how to correct it. So I'm
adding a little prompt in here and I'm
going to run that. The problem with Claude is that it's actually calling the wrong model: it should be calling the preview version. And if I look at the Codex version, it's calling almost the right model, but it's not calling the preview. So, I'm just going to correct that in both cases.
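For context, the call both agents are trying to make boils down to something like the sketch below. It follows the pattern in the Gemini docs for the @google/genai JavaScript SDK, but the route shape, the env var name, and the request body fields are my own assumptions rather than what either agent actually generated:

```ts
// app/api/try-on/route.ts — hypothetical Next.js route handler
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GOOGLE_GENAI_API_KEY });

export async function POST(req: Request) {
  // personImage and outfitImage are assumed to be base64 strings sent by the client
  const { personImage, outfitImage } = await req.json();

  try {
    const response = await ai.models.generateContent({
      // Note the "-preview" suffix that both agents initially got wrong
      model: "gemini-2.5-flash-image-preview",
      contents: [
        { inlineData: { mimeType: "image/jpeg", data: personImage } },
        { inlineData: { mimeType: "image/jpeg", data: outfitImage } },
        {
          text: "Place the outfit from the second image onto the person in the first image.",
        },
      ],
    });

    // The edited image comes back as an inlineData part alongside any text parts
    const imagePart = response.candidates?.[0]?.content?.parts?.find(
      (part) => part.inlineData
    );
    if (!imagePart?.inlineData?.data) {
      return Response.json({ error: "No image returned" }, { status: 502 });
    }
    return Response.json({ image: imagePart.inlineData.data });
  } catch (err) {
    // The kind of logging that surfaced the wrong-model-name issue
    console.error("Gemini request failed:", err);
    return Response.json({ error: "Gemini request failed" }, { status: 500 });
  }
}
```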
So, two things I had to change in order to get it working. It seemed like it only works well with images where the face is obscured or you're not seeing the face; therefore, it doesn't trigger any kind of warnings with the Gemini Flash model. And then I also did a
little bit of a cleanup to increase the
sizing of this here. So let's try and
upload our picture here and the model
here. You'll note I had to remove the
face in order for it to work. And boom,
here we have the outfit. Now for the
caveats, this didn't work for me all of
the time. The problem is the Gemini
model is heavily censored, and anything to do with swapping clothing is going to trigger some kind of censorship. I
had to apply some tricks like making
sure the face was obscured so that it
would work well. And even at that, it only worked every couple of generations; the rest of the time it gave me a refusal. But the good news
is that this is now possible via a
model. And I think in the coming weeks,
if not months, we're going to see other
providers launch their own versions.
They might have open weights. They might
be uncensored in a way that will allow
us to do better swaps. So, it's maybe something to experiment with in terms of product swapping or thumbnail generation; there are lots of different ideas you can apply using Nano Banana. This could be a really great Shopify plugin you could resell. There's lots of different potential here. So, after a couple of builds, how
am I feeling about all these new tools
from OpenAI? Well, I'm actually really
quite impressed. I think OpenAI have
really been cooking here, and they've
played a lot of catch-up. The next thing is the CLI. Now, the CLI, as a starting point, is still pretty basic, but it is working quite well for me and has some nice features. I can see that within a couple of weeks, or a couple of months, it might hit feature parity with the likes of Claude Code, making it really quite compelling. Then we have the introduction of the extension into VS Code or Cursor or whatever IDE you want to use, which in my usage, as you've seen in the video, is actually really quite good. What I love is that it's one tool across the whole spectrum of needs that I have: an extension, a CLI, and some web agents. What we're getting from Anthropic right now is a really great CLI in the form of Claude Code and a great model series in the likes of Opus and Sonnet 4. However, the pricing is a
big consideration here, and OpenAI is just so much cheaper than Anthropic for developers. And if you're in an enterprise scenario where you're buying multiple seats, or you're a hobbyist and you can't afford these big bills like a $200 Max plan, OpenAI really is a very viable option. But in terms of getting the job done, both seem to be at parity for just getting there. They do things in different ways, but in terms of my speed to reach a conclusion in my tests, Anthropic and OpenAI are getting me there pretty much at the
same time. So, I like to be opinionated
on this channel, and if I was only going
to be able to pick one at this point in
time, I would probably go with OpenAI
purely around the pricing and the spread
of offerings that they have. Now, that
being said, I'm sure Anthropic is
thinking about its own extension and its
own cloud agents that it's going to
bring to the market as well. But will
they be able to match pricing? And the
other thing is there are lots of other
different models coming out like I
showed previously that are really quite powerful, like the Qwen series; we've got Grok; and we've got so many other models that are going to show up, like DeepSeek, that are going to be much cheaper to run than both OpenAI and Anthropic. And that's where a tool like Cursor might
suit you because it offers all of the
things that we just talked about. It has
background agents to run in the cloud.
It's connected to your GitHub so it can
review your pull requests. It has its
full-on IDE that is really
state-of-the-art. And it has its own CLI
which is a little bit buggy but is
improving all the time. But the big
thing here is I can select any different
model that I want. But generally how it
works is you spend your $20 and
depending on what model you're calling,
you get charged at that API's cost. If
you've got $20 of usage and you're
running a low-cost model, it's going to
go a hell of a lot further for you.
That's where it stands at the moment.
It's still really early stage. All these
tools are working in exactly the same
way. Their CLIs work in the same way.
Their agents are working in the same
way. You learn it in one tool and it
seems to apply across all the others in
terms of feature parity. So, I wouldn't
be too worried about picking a wrong
tool at the moment. Just play with them
all or pick one and work with it as much
as you like. And then if something
better comes into the market at a later
stage, the switching cost isn't going to
be that high. We're just having a ton of
fun playing with all of these new tools.
So, I'd love to get your opinions on
what tool you're currently using and
where you're having success in the
comments. It's really helpful to me and
to others who watch the channel. Thanks
and see you next week.
Become an AI Native Developer and Build Apps Without Writing Code - https://switchdimension.com

Discover why Codex CLI is the cheaper, maybe better alternative to Claude Code and other AI coding tools. In this video, I show you how to build a cutting-edge e-commerce outfit try-on widget using OpenAI's Codex CLI, GPT-5, and Google's revolutionary Nano Banana image model. Watch as we prototype a micro-SaaS product that you could sell to e-commerce brands or as a Shopify extension - all without writing code! I'll walk you through OpenAI's latest developer tools and demonstrate how easy it is to become an AI-native developer. Perfect for tech professionals who want to leverage AI but feel stuck by technical barriers. Learn how to stay ahead of the curve with these powerful new tools.

CHAPTERS
00:00 Intro
01:30 Codex IDE
02:53 GPT-5 Model Power
05:14 Codex Cloud
07:19 Codex CLI
09:03 Agents.md
09:59 MCPs & Codex
11:31 Nano Banana API Build
19:08 Final Thoughts on Codex Suite