Hey everyone, welcome back to The Agent Factory, the podcast where we go beyond the hype and dive into the more technical details in the world of AI right now. Today we have Paige Bailey on the show, and I'm super excited because she's from the DeepMind team. Paige, do you want to give a quick intro?
>> Yeah, so greetings, everyone. My name is Paige. I'm the UTL for our developer relations team at DeepMind, which means that I get to work on the Gemini APIs as well as things like Veo, Nano Banana, Gemini, Gemma, all of the things that we've released for AI developers to use in their projects.
>> So, I know Google's been up to a lot recently, and last week Google launched a bunch of new tech. Do you want to give us a quick rundown?
>> Yeah, sure. So I think I actually have a slide to show how much we've been shipping over the course of the last while, and this only goes up to May 2025; obviously, shipping continues until models reach AGI, I suppose, or ASI. But we've been releasing a new model or a new feature about every single day, sometimes even more than one a day, over the course of the last year. The latest releases are really Gemini 3. We released our Gemini 3 Pro model, which everybody was very excited about, last Tuesday, and then also Nano Banana Pro, which is the next iteration of our Nano Banana series of models. Really, really excited about both of these. And then we also have a new IDE for developers called Antigravity, which was released by some of the folks we have in DeepMind who had previously been working on the Windsurf team. So lots of goodness coming out for every single part of the stack, for developers and for people who care about gen media.
>> Yeah, I can't imagine how busy the DeepMind team has been trying to release models. It seems like, as you said, once a day. And today's November 24th, so if you're listening to this later, these models and releases came out in the middle of November, and they've been taking the AI world by storm a little bit. I remember last week when Gemini 3 was announced, I was really excited to try it out. And then they launched Antigravity, and I saw the blog post right after, and then the day after that was Nano Banana Pro, and it was a lot. It almost felt like Christmas Day.
>> Yep.
>> Yeah.
>> Strong agree. And so I guess it might be good to talk a little bit about Gemini 3: what it is, what it isn't, and some of the things that folks can build with it. Does that sound good?
>> Yeah, I would love to hear it.
>> Heck yeah. So Gemini 3 is really intended to be the next iteration in our Gemini series of models. I've certainly been excited about it, especially for things like app development, which we'll see in a second. And it's available in all of the places: you can use it in the Gemini API and the Vertex AI API, and places like Gemini CLI, Android Studio, the Gemini app, AI Mode, and Jules have all been replatformed to use this third series of models. I think it's the natural evolution for Gemini at large. I was on the original modeling team for both PaLM 2 and Gemini, and the first Gemini was really all about being able to understand a lot of different kinds of content: video and images and audio and text and code, all of the above, all at once. Gemini 2 was really about thinking: we introduced this concept of reasoning, of being able to plan, to think step by step through processes, and to have these detailed thought traces and thinking tokens. And the third iteration of Gemini is all about getting much, much better at code and also being able to act on things. We've made huge improvements to tool use and function calling, which are great for agent-style workloads, but also to being able to incorporate tools into the reasoning process. So you can have these more composite architectures of different models working together, different models interacting with systems to accomplish tasks, as opposed to just one model trying to do all of the work by itself. If that makes sense.
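To make the tool-use idea concrete, here is a minimal sketch using the google-genai Python SDK, where a plain Python function is handed to the model as a tool so its result can feed back into the reasoning loop. The model string and the `get_stock_level` helper are illustrative assumptions, not anything shown in the episode:

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

def get_stock_level(item_name: str) -> int:
    """Hypothetical inventory lookup; the docstring and type hints
    become the function declaration the model sees."""
    return {"pretzels": 0, "chips": 12}.get(item_name, 0)

# Passing a Python function in `tools` enables automatic function calling:
# the SDK executes the call the model requests and feeds the result back
# into the model before it produces its final answer.
response = client.models.generate_content(
    model="gemini-3-pro-preview",  # assumed model string; check the docs
    contents="Are we about to run out of pretzels? If so, suggest a reorder.",
    config=types.GenerateContentConfig(tools=[get_stock_level]),
)
print(response.text)
```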
>> Yeah, that's really well explained. Like you said, it sounds like the natural evolution between 1, 2, and 3. And I would imagine Gemini 3 is so good at things like coding because of the data set you're training it on, and its ability to act on whatever it's been reasoning about, which is basically what a lot of coding is, right? Taking actions, writing code, changing functions, and so on.
>> Absolutely. And to your point, there are really two parts of the model-building process: there's pre-training and then there's post-training. Pre-training is all about giving Gemini as many tokens as it can possibly have access to. So getting it the entirety of the internet, and we're also creating a lot of synthetic data sets that the model can use to help accomplish its work: things like video game footage, and a lot of synthetically generated code that then gets run, so you can get an idea that it's actually doing something useful, with descriptions of what it does. All of that goes into pre-training. And then post-training is really all about giving much more specific examples of the kinds of use cases the user would be expected to have: really well hand-curated examples of people using a whole bunch of tools to accomplish a task, lots of multi-turn conversations that include edits of websites, lots and lots of those kinds of things getting added into our post-training mixture, in addition to techniques like reinforcement learning. So there's a lot of really, really hard work that's been done across tens of thousands of people at Google to get the models into a much better state, and it's been really inspiring to see, honestly.
>> Yeah, interesting. So pre-training is almost like sending Gemini to school to learn the basics, and then post-training is almost like on-the-job experience, where it gets to learn what the workflow looks like so it can then replicate that. Very cool.
>> Yeah. And then these are just the benchmarks for the model itself. My favorite one, though, is something called Vending-Bench, which, for folks who are uninitiated, gauges the model's ability to run a passive business. I'll add a link to the Vending-Bench 2 website so everybody can see, but basically you're saying: hey model, your job is to run this vending machine. You need to figure out what we should stock it with; whenever you realize that something's about to go out of stock, you should reorder it; you should deploy a person to restock the machine whenever the order comes in; and you should also do long-range planning to make sure that, as you're handling all of these component parts of the business, it's able to have as much uptime as possible. So I don't come to the vending machine looking for pretzels and find the pretzels are missing; I'm able to have those expectations met when the vending machine is running. And then the way the models are gauged in terms of performance is how much money they make over the course of a year, which is awesome: it's the model's ability to run a passive business, so of course you would gauge it by how much money it actually makes over time. Right now, Gemini 3 Pro is coming in at around $5,462 per vending machine, which, you know, given that we're operating on this exponential trajectory, maybe we can all retire and run passive businesses at some point in the future. But it's been really exciting to see how the models have improved on these things that require a lot more strategy and a lot more decision making, especially long-term decision making, for these kinds of use cases.
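The scoring scheme Paige describes (run the business for a simulated year, score by the cash at the end) is easy to picture in code. This is an illustrative skeleton of that style of eval, not the actual Vending-Bench harness; every price, demand number, and policy here is made up:

```python
import random

def run_vending_eval(decide, days: int = 365) -> float:
    """Run a toy vending-machine simulation and return year-end cash."""
    cash, stock = 500.0, {"pretzels": 20, "soda": 20}
    for day in range(days):
        # The agent sees the machine's state and returns reorder quantities.
        for item, qty in decide(day, cash, dict(stock)).items():
            cost = qty * 1.00                      # made-up wholesale price
            if cost <= cash:
                cash -= cost
                stock[item] = stock.get(item, 0) + qty
        for item in stock:                         # simulated daily demand
            sold = min(stock[item], random.randint(0, 5))
            stock[item] -= sold
            cash += sold * 2.00                    # made-up retail price
    return cash  # the benchmark-style score: money made over the run

# A trivial baseline policy: reorder anything that is about to run out.
naive = lambda day, cash, stock: {i: 10 for i, q in stock.items() if q < 5}
print(f"Year-end cash: ${run_vending_eval(naive):,.2f}")
```

In the real benchmark, the `decide` step is the model itself calling tools over a long horizon, which is exactly why it tests long-term planning.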
>> Yeah, I just sent you an invite for a meeting after this, for a vending machine entrepreneurial sync, you and I.
>> Excellent. Just as a segue, I really, really want somebody to create an eval for a laundromat bench, because I do feel like these kinds of passive businesses have immense potential. A lot of people who have built AI startups and AI companies just recently have a few of them operating more or less autonomously, where they don't have to step in very often, but there are still pretty significant revenue streams every single month. It might not be a billion dollars, but it's certainly enough to meet the needs of users and to take care of some of these tasks.
>> Yeah, it really allows people to bridge the gap, I find. And this is unsolicited, non-Google-certified advice, but I think laundromats and vending machines tend to be some of the most secure and good beginning entrepreneurial ventures, I guess.
>> Absolutely. Maybe we should get a vending machine. That would be kind of fun.
>> Wait, fill it with Google swag.
>> Cool. And then some additional benchmark things: it's the first model to break 1500 on LMArena; right now we come in at around 1501. It's also doing really, really well on WebDev Arena. So the new feature that you might have seen in Replit, their design feature for building beautiful UIs, is powered by Gemini 3, and you can really see a difference in the quality of the apps that get generated with Build in AI Studio. It's also kind of state-of-the-art at reasoning and multimodality, at tool use and planning. If you're a Minecraft enthusiast, it's much better at these kinds of voxel art experiences; you can see the difference between Gemini 2.5 Pro and Gemini 3 Pro in these voxel art demonstrations. It's really, really good at making games, including games with interesting game mechanics and beautiful UI. And it feels like we've been talking about it for a while. Would you be down for me to show how to use it in AI Studio as part of our Build feature?
>> Let's build.
>> Excellent, let's do it. And I think both of us, as we were saying at the very beginning, are less fans of slides and more fans of showing stuff in action. So let's show some stuff in action. One last thing, though, before I pop over: it's currently available in the API, so you just have to change out the model string. And if you've been using OpenAI as part of your projects, we have something called the OpenAI compatibility layer, which allows you to swap out a few lines of code: you would just modify this API key and base URL to be your Gemini API key, and you would swap out the model name, but then you would be able to run it with your OpenAI code, hopefully without any changes. Okay, cool. I've taken up enough of your time, but I want to take up even more now. Super excited for building.
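For reference, the swap Paige describes looks roughly like this in Python: keep your existing OpenAI code, change the API key and base URL, and switch the model name. The base URL below is the documented Gemini OpenAI-compatibility endpoint; the exact Gemini 3 model string is an assumption, so check the current docs:

```python
from openai import OpenAI

# Point the standard OpenAI client at the Gemini compatibility endpoint.
client = OpenAI(
    api_key="YOUR_GEMINI_API_KEY",  # swap in your Gemini API key
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

response = client.chat.completions.create(
    model="gemini-3-pro-preview",  # assumed model string; check the docs
    messages=[{"role": "user", "content": "Say hello from Gemini."}],
)
print(response.choices[0].message.content)
```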
>> Cool, cool, cool. Let's do a thing. So let me zoom over to AI Studio. This is the Build feature in AI Studio: if you expand out this section on the left, there's a left-hand pane where you can select the Build feature, and it immediately puts you into this build environment. I'm going to collapse this so there's a little bit more real estate for us to see what's going on. Then, basically, you can just describe anything you would like to create. You can also see an app gallery with remixable apps, things that the team has created, including a comic-book-style experience, a product mockup visualization, and lots of games and immersive 3D worlds, including simulation games, which are pretty cool. You can fork these, remix these, and share them with your friends, and we'll also show how to deploy them with Google Cloud in just a second. And if you can't think of something to build, clicking "I'm feeling lucky" gives you an out-of-the-box, ready-made app idea that you can incorporate into your projects. But what I've been trying to do is build increasingly complex apps, with the goal of making it so that Gemini can't build it and has to fix itself along the way, because we do have an autofix step within AI Studio; it's just triggering much less frequently now because the models are getting so good. So I'm going to try to see if we can build something that is sufficiently complex that Gemini will have to fix itself. Let's try this out.
Cool. Create an app that uses the webcam and the microphone to have a conversation with a user. The app is intended to be kind of an insurance app: the user will show a bunch of different objects, and the app will catalog all of the objects (describe them, put down their name, do an assessment of what kind of wear the object might have), and then also use Google Search grounding to try to ground how much that item might be worth. The app should put that in a table format so the user can see all of the objects they've been cataloging through the duration of the video conversation. The app should also be really well designed and should show some stats about all of the objects that the user has put down for insurance purposes. And the app should also have a Nordic theme, so it should feel very IKEA-vibes. For the video conversation, the app should be using Gemini Live along the way.
So that was just spitballing a whole bunch of information. I'm not sure how well it'll do, but let's try.
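Under the hood, the "Gemini Live" part of that prompt maps to the Live API, which streams audio and video over a persistent session. A minimal text-only sketch with the google-genai Python SDK might look like the following; the model string is an assumption, and a real app like this one would stream webcam and microphone frames instead of a single text turn:

```python
import asyncio
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

async def main():
    config = types.LiveConnectConfig(response_modalities=["TEXT"])
    # Open a persistent, bidirectional Live session.
    async with client.aio.live.connect(
        model="gemini-2.0-flash-live-001",  # assumed Live model string
        config=config,
    ) as session:
        await session.send_client_content(
            turns=types.Content(role="user", parts=[types.Part(text="Hey there")])
        )
        async for message in session.receive():
            if message.text:
                print(message.text, end="")

asyncio.run(main())
```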
>> Yeah. So what's happening here is you immediately get put into something that looks like an IDE. If I expand out this thinking section, you can see that Gemini is figuring out what it should be doing: how it should be doing function calls (I said the Gemini Live API, and it's figuring out what model name that should be), and what the structure of the data should be, since it's supposed to be cataloging all of these items. You also see some recommendations about what to do with the app here in the center. There's a place up in the right where you can save to a public or private repo on GitHub, and there's also a place where you can add files if you would like, things like images or text files or PDFs. As Gemini is working, you can see all of the code getting created along the way, as well as the prompts that are created to give to each one of the models. I love asking Gemini to write the prompts for Gemini, just because it usually does a much, much better job than I can. All of the files are getting put into this really nice directory structure here in the center. The nice thing, too, is that since this experience is baked into AI Studio, it has insight into the latest models, the latest features, and what's possible with the APIs. Right now it's generating a React app; you could have also specified Angular. It's using React Native, so you can access all of these apps via mobile phones as well. This is really, really cool to see, to just have an idea and figure out how to get it started without necessarily having to wade through documentation or figure out which models to use for which experiences. I also quite... Yeah, go for it.
>> No, I was just going to say: you threw your messy thoughts into this thing, and obviously it's very impressive that it's able to not only format your thoughts into something a bit more coherent, but also write the code. I think the most impressive part is the name that it came up with, because I know you didn't specify that.
>> Yeah, Nordic Shield. So we see an app. I'm going to try to see how well this does just out of the box. Hey there.
Let's see.
>> So, yeah, that's the little autofix button, I guess, probably.
>> Yeah. So, just taking a look at the app itself, it was able to hook in the webcam. It even gave me the options of turning on or off my video and my camera, or my camera and my audio, rather. It created the inventory list, which is pretty cool. It has some condition breakdowns and shows connections. But it didn't pick up on my audio, which often happens whenever you have multiple audio sessions running simultaneously. So I think what it's doing is figuring out which section it would need to attend to, and then that would trigger the Gemini Live call. There's a big reasoning trace on the left which shows: all right, is the mic on? Is it enabled? How do I toggle the mic? How do I understand the media stream? And then it should be able to understand that as part of the website as well. I also feel like it's a much better design than I would have been able to come up with myself, and I really like that it automatically included instructions: apparently, the entity that I'm talking to will be called Sven, which is also very Nordic. We'll see how well it's able to make the changes.
>> I mean, you did ask for IKEA.
>> This is true. And I think one of the nice things, too, is, you know, I started doing machine learning around 2009, so a really long time ago, but I've always been more of a Python human than a JavaScript or TypeScript human. I worked on part of the VS Code team for a while, so I've built VS Code extensions in TypeScript, and I've built small websites, but I would not call myself a TypeScript expert by any stretch of the imagination. So having this help figure out where the problems might be, where to debug, how to debug, and what libraries to include to be part of a modern stack is really, really helpful for me personally.
>> Yeah, I was about to say: it's quite impressive that it not only recognized that there was an error, it parsed that error, made a reasoning trace for you, and allowed you to have some level of interpretability into its thought process, which I think is so important, because obviously, if it's doing something that's not what you intended, you can easily pull the plug, if you will. It doesn't seem like that's the case here, but you never know, right?
>> 100%. I also really, really like that if you zoom in to the bottom, you can see warnings and insights into the build process: rendering start, rendering end, recommendations on what to do to change the ergonomics and the accessibility of the website itself, like, hey, this formatting might not show up as well on a mobile device. And then our design team has also implemented a feature called Annotate. If you click annotate app, you can circle features and add comments, saying things like "make this cataloged items section more attractive" or similar, and it's able to take the comments and make changes and modifications to the website, the same way you would work with a designer to do redlining against the UI itself.
>> This is really nice too, I think, because it looks like it's still working on trying to fix the bug while you're able to annotate. So you can almost do two things at once: have it work, and then also have it, you know, annotate your designs or review your designs.
>> Yeah. And I love that there are a couple of different agents, or I suppose a few different agents, that we have as part of our Google AI stack right now. We have Jules, which is an asynchronous agent, and Gemini CLI, which is kind of a synchronous agent that you can operate within a terminal. It's really, really nice to have, you know, a team member go and attempt to accomplish a task and debug it along the way. So let's see. It looks like it's made some changes. I'm going to go ahead and try again. Hey there.
>> Hello there. I am Sven, your insurance cataloging assistant. Please show me the items one by one and I will help you catalog them.
>> What about this?
>> I see a dark-colored smartphone in a case with a ring grip. It looks to be in good condition. Would you agree with this assessment?
>> Sure, it's a Pixel 7, but you might also see that it has some scuffing and a cracked screen if I turn it around.
>> Ah, thank you for clarifying. I see the scuffing and cracked screen now when you turn it. Based on this, I would say it is in fair condition. Do you agree? Great. I have cataloged the Pixel 7 smartphone. Is there anything else?
>> What about this?
>> I see a Diet Dr Pepper can. It looks like it is in new condition. Shall I add this to your catalog?
>> Well, how much do you think it would be worth?
>> I can't estimate value myself. I can only catalog the item with its name, description, and condition. Would you like to add the Diet Dr Pepper can?
>> Let's do it. And then also, how about this computer screen that you see in the back?
>> I see a large screen on the wall with some images and text. It looks to be in good condition. Would you agree? Understood. I have cataloged the computer screen. Is there anything else you would like to catalog today?
>> Nope, that's it.
>> Thank you for using Nordic Shield's insurance cataloging service. Have a good day.
>> Cool. And so you can see, let me zoom out a little bit so folks on the call can see as well, it looks like there were two phases. As we were talking about these different items, the Live API feature showed them on the inventory list: it gave the item name, the description, the condition, and the estimated value. But as part of the estimated value step, there was a second kind of follow-up action that did a Google search to pull in what the estimated value of each thing would be. So the Live API assistant was just responsible for cataloging, and then after it added an item to the inventory list, that kicked off a second step, almost like an agent going out to search, rounding up a bunch of different items, and then giving us an estimate of how much these things might be worth, which is pretty cool. And it also added links, like citations, for those different items as well.
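That second, follow-up step (grounding the value estimate in search results) is a standard pattern with the Gemini API: attach the Google Search tool and read the citations out of the grounding metadata. A minimal sketch, with an assumed model string and prompt:

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # assumed model string; check the docs
    contents="Estimate the resale value of a Pixel 7 with a cracked screen.",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    ),
)
print(response.text)
# Citation links like the ones in the demo's table ride along here:
print(response.candidates[0].grounding_metadata)
```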
>> Yeah, I think this is probably one of the nice new things about Gemini 3, and some of the APIs now, that allow these types of models to always be grounded in search, right? I think something like this would be helpful for decluttering my space; I sell a lot of stuff on Facebook Marketplace.
>> Yeah, excellent. And I think it's also nice to help with detailed descriptions of all of the different items. Then, if you wanted to deploy this app via Google Cloud, you could select a cloud project, click deploy, and it would give you a unique URL with logging and insights into all of the different features that you might be including as part of your project. As an example, I'm going to show my personal Google Cloud account. There we go. Excellent. So this is the logging feature in Google AI Studio, where, if you turn on logging, you can see all of the different JSON blobs that come in from your app interactions. You can see things like usage and billing, so the number of API errors and requests that you get per day, as well as any rate limits that you might be about to hit. It looks like I've been using Veo 3 past my rate limit options, at least for the tier that I'm on. You can also see the different projects; right now I have around 85 keys created for the generative language client. It's cool to have a one-stop shop where you can see all of that usage, which is quite nice.

And just to give Nano Banana its time in the spotlight and show a couple of examples that people have built with it: you can do everything from using a Pinterest-style collage of items and composing them into a single image, to building orthographic blueprints for real-life places, in this case a castle, seeing Neuschwanstein Castle from different views with high reliability for all of the text. You can create detailed physics explainers, and all of this is grounded using tools, including grounding with Google Search, which is pretty amazing. You can also change the resolution (1K, 2K, or 4K) and the aspect ratios as you use these generative media models as part of your projects. So it's been really exciting to see what people create, everything from reimagined game experiences to detailed posters and slides for their papers and projects at NeurIPS.
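For anyone who wants to try generative media from code rather than from AI Studio, an image request through the Gemini API looks roughly like this. The Nano Banana Pro model string below is an assumption (check the current model list), and the resolution and aspect-ratio options mentioned above are set through the request config described in the docs:

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",  # assumed Nano Banana Pro string
    contents="An orthographic blueprint of Neuschwanstein Castle, labeled views.",
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],  # ask for image parts back
    ),
)
for part in response.candidates[0].content.parts:
    if part.inline_data:  # image bytes come back as inline-data parts
        with open("blueprint.png", "wb") as f:
            f.write(part.inline_data.data)
```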
>> Yeah. And I think a lot of the creativity and power, I guess, gets unlocked when you combine all of these things together. So you can ask Gemini to go and do research for you on, I don't know, skiing areas across North America along a train line or something like that, pull all that information together, and create an image. And then, you know what I've seen people do? Take that image, put it into Veo 3.1, and they get an animated explainer, which they can then use as part of a slide or presentation or something like that, which is very cool.
>> 100%. When you combine reasoning with multimodal outputs, the possibilities are really endless, and I think it's so much easier to understand things like images or video than to have people comb through raw text or raw code. So it's been really exciting to see.
>> Wow, thanks so much for showing all of this to us, Paige. It's been really insightful. I want to switch gears a little bit and show you one of the projects that I've been working on with Gemini 3. Maybe you can critique me, let me know what I'm doing wrong or right, and tell me what you think about it. This is the website that I'm using to share resources and articles with people. As I've been making more short-form video content online through TikTok, Instagram, YouTube Shorts, and such, it's been kind of challenging to find a way to share some of the resources that I talk about with people who are viewing my content. You can see here that I have a lot of URLs that I often reference, including Nano Banana Pro, Gemini 3, and so on, but I haven't had a good way of sharing them. So I created this website: I threw these ideas that I had for a more doodly, vibrant, pastel, neo-brutalist, you know, notebook-esque theme into AI Studio, and it created this for me. It's a nice place I can use to manage all of my content, create new assets, save them, and then import them into my videos. The problem is that my previous website obviously doesn't look anything like this. So I thought maybe what I could do is combine Google's new IDE, Antigravity, and Gemini 3 to inherently take advantage of that multimodality we spoke about earlier: take a screenshot of this new website, feed it into the code for my old website, and see what it comes up with. What do you think?
>> Yep, let's do it.
>> Okay, so Paige, here we are in Antigravity, which is meant to be an AI-native IDE from Google, and it has Gemini 3 Pro baked right into it. What I'll probably do is take those two screenshots that I was talking about before, feed them into here, and then just ask if Antigravity, or Gemini 3 Pro, can come up with a way of redesigning my site to adhere to this a little bit better. So I'm going to grab those two screenshots and ask it: "Recreate this site to adhere to my new design philosophy, which you can see in the included images." I'll go ahead and run that. And while that's running, I think it's worth noting that Gemini 3 Pro's big job here is to not only interpret what's inside of the images, but to break them down into the most notable features that it needs to then go and apply to coding within Antigravity. So while it's doing that, it has obviously explored a little bit already, and it's created this task list here: explore the existing codebase, and a little bit more. It'll follow that task list to make sure it stays grounded and true to that process of development as time goes on. The next thing it'll do is analyze the new design and then set up the new design system. And the next thing it came up with here is this implementation plan, which I also find really nice, because through this process it's creating a bit of documentation, which helps down the road when the codebase becomes larger and the AI has to keep sifting through more and more code; it can continue to reference things it's done in the past. You can see this implementation plan has these actions like "new," "modify," and so on, and it keeps it from, you know, going off the rails as much.
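The same screenshot-driven workflow can be reproduced against the API directly, which makes it clearer what Antigravity is doing with those images. A sketch, where the file names and model string are placeholders for this example:

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

def image_part(path: str) -> types.Part:
    """Wrap a local screenshot as an inline image part."""
    with open(path, "rb") as f:
        return types.Part.from_bytes(data=f.read(), mime_type="image/png")

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # assumed model string; check the docs
    contents=[
        image_part("new_design.png"),  # the target design philosophy
        image_part("old_site.png"),    # the site to be redesigned
        "Recreate the second site to adhere to the design philosophy "
        "shown in the first screenshot. Return updated HTML and CSS.",
    ],
)
print(response.text)
```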
>> Awesome. Paige, have you created anything cool with Nano Banana Pro?
>> Oh, yeah. With Nano Banana Pro, I've been experimenting quite a bit with different outputs. Let me show one from a little while ago; the quality is just absurd. As an example, let me share my screen really quick.
>> Oh, nice.
>> I created a bunch of images with popular physicists and computer science figures like Claude Shannon, Richard Feynman, and Carl Sagan. And then one of our winning hackathon projects this past weekend was using Nano Banana Pro for its game asset generation. They used it for every single one of the game assets, as well as the design patterns for the games themselves, which was quite cool. This was at the AI Engineer World's Fair. And how is your website coming along?
>> So it looks like we're done here. What I'll probably do is pop open my terminal, run the app, and then we can take a look at what it looks like.
>> Yep.
>> Cool. Yeah. So, Paige, this is what it came up with. You can see here that it actually followed the brand guidelines pretty well. It has these pills that jiggle when you hover over them, and it has some more information about my resources: prompt library, agent blueprints, and things like that. So I found it did a pretty good job with just one prompt. What do you think?
>> Oh, it looks awesome. Thank you, Gemini 3 and Antigravity. I can't wait to continue playing around with this tool.
>> Very, very cool. I really love the attention to detail and the creativity that it puts into these UI designs. I feel like it's better than ever at creating these kinds of experiences.
>> Paige, it was a pleasure having you on the podcast today. Thanks so much for coming and explaining a little bit more about Gemini 3 and all of the great things that Google launched last week.
>> Absolutely, thank you so much for having us. And I loved that we were able to talk about not just the models themselves, but also how Google's entire AI stack, from the hardware to the compiler to the machine learning framework to our end-to-end app deployments, even things like Antigravity and AI Studio and all of the other surfaces that we have around Google, helps make sure that we're building the best possible models for all of these different use cases. It was really fun.
>> And not to mention all the cool things you can do and deploy to Google Cloud and have ready in production. So, thanks so much again. I'm Amit Maraj,
>> and I'm Paige Bailey.
>> Powering down.
Gemini 3 is here, and it is evolving the way developers build. In this episode of The Agent Factory, host Amit Maraj sits down with Paige Bailey from Google DeepMind to break down the latest rapid fire releases from Google, including the highly anticipated Gemini 3 Pro, Nano Banana Pro, and the new AI-native IDE, Antigravity. Gemini 3 represents a massive leap forward, moving from multimodal understanding (Gemini 1) and reasoning (Gemini 2) to genuine agency and acting. It excels at coding, tool use, and long range planning—as evidenced by its strong performance on a variety of agentic benchmarks. Listen along as Paige builds "Nordic Shield"—a multimodal insurance cataloging agent—live in Google AI Studio using the new build feature and Gemini Live. As a bonus, Amit demonstrates how to use the Antigravity IDE to redesign a website using nothing but screenshots and Gemini 3’s multimodal capabilities. Chapters: 00:00 - Intro & welcome Paige Bailey 00:42 - Google's new AI releases 02:49 - What is Gemini 3? (The evolution from 1.0) 06:48 - The "Vending Bench": AI running a business 11:21 - Demo: Building "Nordic Shield" in AI studio 18:15 - Debugging & "Auto-Fix" in action 21:40 - Using the annotate feature for UI design 25:17 - Gemini Live & Google Search grounding 28:34 - Nano Banana Pro: State of the art Image Gen 30:27 - Demo: Website redesign with Antigravity IDE 35:41 - The full Google AI stack 🔗 Resources & links mentioned: ➖ Google’s AI Studio → https://goo.gle/4ixTHR9 ➖ Deploy your Apps to Google Cloud Run → https://goo.gle/4pdDyCX ➖ Google’s Antigravity → https://goo.gle/48m2pNM ➖ AI Development Documentation → https://goo.gle/3M0yjYB ➖ Vending Bench 2 → https://goo.gle/3K13MJI ➖ Deploy on Google Cloud → https://goo.gle/4oq7Elv ➖ Learn more about Vibe Coding and Context Engineering → https://goo.gle/4iHlOh3 Connect with Amit: ➖ X → https://goo.gle/3Ki0lhC ➖ TikTok → https://goo.gle/44ogN6S ➖ Instagram → https://goo.gle/4reQFoO LinkedIn → https://goo.gle/3M2KV1t Connect with Paige: ➖ X → https://goo.gle/4odgB1j ➖ LinkedIn → https://goo.gle/3KrHAbE Join the conversation on social media with the hashtag #TheAgentFactory. Connect with the community at the Google Developer Program forums. → https://goo.gle/4oP9bmb Watch more Agent Factory → https://www.youtube.com/playlist?list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs 🔔 Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech #Gemini3 #Antigravity #GeminiCLI Speakers: Amit Maraj, Paige Bailey Products Mentioned: Gemini, Antigravity, Gemini CLI, Gemini Live