Hi, this is Christian from LangChain. Many LLM providers now ship their own native tools: not just generic function calls, but tools the model was specifically trained and tuned to work with, like Anthropic's computer use, web search, bash, or memory tools. With the latest version of LangChain's Anthropic and OpenAI provider packages, we expose these as simple provider tools, so you can call these model-optimized tools seamlessly and type-safely from your agent without hand-rolling any JSON schemas or glue code. In this video, I will show you how to let your model take control of your browser to play tic-tac-toe against you.
I'm using Anthropic's computer-use-capable model and its native tool, wired up through WebdriverIO, to give the AI real control of my browser, so it can literally play tic-tac-toe against me on the screen. The coolest part is that these provider-native tools can be mixed and matched: as we play, I will bring in Anthropic's memory tool, for instance, to help the model keep notes on previous games, tracking mistakes, patterns, and strategies, so that over time it actually gets better at beating me, all while staying fully typed and easy to use with LangChain.js.
Let's check it out. If we take a look at the Anthropic docs, we find a whole section on tool use with Claude. In there are all the natively supported provider tools that Anthropic offers: tools like the bash tool, the text editor tool, or the web search tool. All of these tools have already been usable with LangChain; you were always able to pass the raw tool object into your bindTools method or into createAgent in order to use them.
But with the new Anthropic and OpenAI provider packages, you now have a tools primitive that you can import directly from these packages. This tools primitive exposes a set of tool factories that you use to register each tool. For instance, for the web search tool, in order to give your agent access to the internet and the ability to search it, all you need to do is pass the web search tool's return value into your tools array, and your agent is extended this way. The same tools are enabled in the same way for Anthropic and OpenAI, and you'll find them in the Python provider packages as well.
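As a rough sketch of that wiring (the exact factory name and options here are assumptions; check the provider package docs for the precise API), registering the web search tool with an agent looks something like this:

```ts
import { createAgent } from "langchain";
import { ChatAnthropic, tools } from "@langchain/anthropic";

// The provider package ships typed tool factories, so there is no JSON schema
// to hand-roll. The `webSearch` name and `maxUses` option are assumptions here.
const agent = createAgent({
  model: new ChatAnthropic({ model: "claude-sonnet-4-5" }),
  tools: [tools.webSearch({ maxUses: 5 })],
});
```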
Now, there are two different types of these native provider tools. There are tools like web search that actually run on the provider side: no requests are made from your application, and the web search happens entirely on Anthropic's side. But there are other tools, like the computer use tool, that the model has been trained to use well, while the implementation happens in your agent application.
To use the computer use tool, you call the computer use function and pass in the properties you want to configure the tool with, as well as an execute method. That execute method creates the actual tool that is triggered whenever the model decides to call it. In the case of the computer use tool, you get an action that defines what the model wants to do: actions like taking a screenshot, doing a left click, or scrolling in the application. You are then responsible for implementing that automation yourself, using tools like WebdriverIO or Playwright.
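Here is a minimal sketch of that shape. The factory name, option casing, and action field names are assumptions (loosely following Anthropic's computer use tool definition), and the two browser helpers are hypothetical placeholders that get implemented with WebdriverIO later in the video:

```ts
import { tools } from "@langchain/anthropic";

// Hypothetical browser-automation helpers; implemented with WebdriverIO below.
declare function takeScreenshot(): Promise<string>;
declare function clickAt(x: number, y: number): Promise<void>;

// Client-side computer use: the model picks the action, our execute callback
// performs it and returns the new screen state as a base64 screenshot.
const computerTool = tools.computerUse({
  displayWidthPx: 1280,
  displayHeightPx: 800,
  execute: async ({ action }) => {
    switch (action.action) {
      case "screenshot":
        return await takeScreenshot();
      case "left_click":
        await clickAt(action.coordinate[0], action.coordinate[1]);
        return await takeScreenshot();
      default:
        return `Action ${action.action} is not implemented`;
    }
  },
});
```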
Now let's look at a basic example: an agent that answers arbitrary questions. I'm curious how Bayern Munich played in their last Champions League game, so first I run the example without any tools. The model probably doesn't know which Champions League game I'm referring to, so it will answer with something like "I don't have access to real-time information."
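Invoking the agent is a standard agent call; a quick sketch, assuming an `agent` created with createAgent as above (with or without the web search tool registered):

```ts
const result = await agent.invoke({
  messages: [
    {
      role: "user",
      content: "How did Bayern Munich play in their last Champions League game?",
    },
  ],
});

// Without the web search tool registered, the final answer is along the lines of
// "I don't have access to real-time information."
console.log(result.messages.at(-1)?.content);
```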
Now, if I register the web search tool that I imported from the Anthropic provider package and run the agent again, you can see that the agent is able to do the research on the API side. With that, the agent gives me the actual result: Bayern Munich won their last Champions League game against Sporting Lisbon.
I played around with a bunch of tools, and one I found particularly interesting was the computer use tool. I maintain a bunch of web automation tools, and I found this one interesting because I can just hook up WebdriverIO, a popular Node.js automation framework, to automate my browser, and I wanted it to let me play tic-tac-toe against the model. So I built a little tic-tac-toe game, a two-player game that lets you play tic-tac-toe against an opponent, and I wanted the Anthropic model to use the computer use tool to play tic-tac-toe against me.
I implemented an example of this in the tic-tac-toe repository. You can see that I create an agent, passing in a computer use tool, a tool that allows the agent to declare when the game has ended, and a memory tool. I wanted the agent to remember what happened in a game, take notes on it, and do better in the next game; that memory tool is also an Anthropic provider-native tool.
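Put together, the agent setup looks roughly like the sketch below. The provider tool factory names are assumptions, the two execute handlers are hypothetical placeholders (the browser one is sketched further down), and the game-over tool is just an ordinary custom LangChain tool:

```ts
import { createAgent } from "langchain";
import { tool } from "@langchain/core/tools";
import { ChatAnthropic, tools } from "@langchain/anthropic";
import { z } from "zod";

// Hypothetical handlers: browser automation and memory-file persistence.
declare function runBrowserAction(input: unknown): Promise<string>;
declare function handleMemoryCommand(input: unknown): Promise<string>;

// Ordinary custom tool the agent calls to declare the game finished.
const endGame = tool(async ({ result }) => `Game recorded as: ${result}`, {
  name: "end_game",
  description: "Call this once the tic-tac-toe game has ended.",
  schema: z.object({ result: z.enum(["win", "loss", "draw"]) }),
});

const agent = createAgent({
  model: new ChatAnthropic({ model: "claude-sonnet-4-5" }),
  tools: [
    tools.computerUse({ displayWidthPx: 1280, displayHeightPx: 800, execute: runBrowserAction }),
    tools.memory({ execute: handleMemoryCommand }), // notes end up in the memories/ directory
    endGame,
  ],
});
```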
I then registered a context-editing middleware. Every time the model takes an action like clicking in the browser, it takes a screenshot to see how the application has updated. These screenshots are base64-encoded and take up a lot of space in the context window; after just a couple of turns the context window would blow up and throw an error. So I use the clear-tool-uses edit strategy to get rid of the previous screenshots the agent has taken.
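I'm not reproducing the middleware configuration here, but conceptually the clear-tool-uses strategy does something like this sketch: older tool results (the big base64 screenshots) get replaced with a small text placeholder while the most recent ones are kept.

```ts
import { ToolMessage, type BaseMessage } from "@langchain/core/messages";

// Conceptual sketch only: replace all but the last `keepLast` tool results
// with a placeholder so base64 screenshots stop accumulating in the context.
function clearOldToolResults(messages: BaseMessage[], keepLast = 2): BaseMessage[] {
  const toolIndexes = messages
    .map((m, i) => (m instanceof ToolMessage ? i : -1))
    .filter((i) => i >= 0);
  const toClear = new Set(toolIndexes.slice(0, Math.max(0, toolIndexes.length - keepLast)));

  return messages.map((m, i) =>
    toClear.has(i)
      ? new ToolMessage({
          tool_call_id: (m as ToolMessage).tool_call_id,
          content: "[previous screenshot cleared to save context]",
        })
      : m
  );
}
```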
Then I invoke the agent, telling it to first review its memory if it has any (it stores the memories in a memories directory) and then start the gameplay. I implemented the tools in the following way: I call the computer use tool factory from the tools primitive, which registers the tool with my agent, and then I implement how the actual automation happens. I'm using WebdriverIO to automate my browser: I initiate a remote session and open the tic-tac-toe game in the browser.
Once the browser is open, different actions map to different automations. For a screenshot, I just take a screenshot. For a left click, I use the performActions interface of the WebdriverIO package to click on a specific pixel coordinate in the browser. You can do the same for all the other commands; this is very easy to vibe-code away, so you don't have to spend much time getting it up and running.
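A sketch of that execute handler with WebdriverIO (the capabilities and the game URL are placeholders):

```ts
import { remote } from "webdriverio";

// Start a local browser session and open the tic-tac-toe app.
const browser = await remote({ capabilities: { browserName: "chrome" } });
await browser.url("http://localhost:3000"); // placeholder URL for the game

async function runBrowserAction(action: { action: string; coordinate?: [number, number] }) {
  switch (action.action) {
    case "screenshot":
      // WebdriverIO returns the screenshot as a base64-encoded PNG string.
      return await browser.takeScreenshot();
    case "left_click": {
      const [x, y] = action.coordinate ?? [0, 0];
      // W3C pointer actions: move to the pixel coordinate, then press and release.
      await browser.performActions([
        {
          type: "pointer",
          id: "mouse",
          parameters: { pointerType: "mouse" },
          actions: [
            { type: "pointerMove", duration: 0, x, y },
            { type: "pointerDown", button: 0 },
            { type: "pointerUp", button: 0 },
          ],
        },
      ]);
      return await browser.takeScreenshot();
    }
    default:
      return `Action ${action.action} is not implemented`;
  }
}
```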
Now, let's play a game. If I run the example, the first thing that happens is that it reviews past memories. I cleared out the memory directory, so there are no memories available yet, but you will see that once we play the first game, it updates that memory. Next, it opens the browser. I do nothing, and the model takes the first action: it puts an X right in the middle. So I'm going to play against the model now. On the right side you can see the coordinates the model is clicking on. It takes its next turn, and I'll just play along until we're done with the game.
I think I dodged a bullet there; I could have easily won this game. Now we will see if the model detects that I'm about to win. It didn't see the chance, and I won. Let's see if the model now understands that it made a big mistake and puts that into its notes. Surprisingly enough, all these computer-use models are able to understand the tic-tac-toe game, but they're really bad at playing it. If I go back into my IDE, I now tell it that the game has ended, and the model automatically creates memories about the game history. It writes out the critical mistakes; we can see here: "after the opponent took top right (move four) and already had bottom right (move two), I didn't recognize the column threat." So it actually understands the problems and mistakes it made, puts down some notes on opponent patterns, and makes some notes on strategy.
Now it automatically restarts the game and reviews these memories before it starts, so it plays a little better in the second game. This time it starts in the middle. I just continue with the X in the same approach; let's see if the model makes the same mistake again. Oh, it actually cleared the game and restarted, because the model knows it starts with an X in the middle (that was part of the system prompt, in fact). So the game is restarted, and we play along. This time I do the same interaction again to see if the model recognizes the threat and makes a better call on it. And there we go: it has now learned the threat, and I can just continue to play.
This time the game ends in a draw, and we will see how the model updates its memory patterns; it does that again before it ends the game. So the game ended in a draw, and the model has now updated its game memory: from the agent's perspective we have one loss and one draw. It recorded the second game, how it went, and how it finished; there were apparently no critical mistakes. It updated its opponent patterns for game two, updated its strategy, and recognized that taking the center first is a strong move. Awesome, that was fun. It was really interesting to see how the model started out playing tic-tac-toe really badly, then took notes and became better and better over time. I played a bunch more games, and the remaining games all ended in a draw. It's really cool that OpenAI, Anthropic, and other providers give agents these native capabilities, which are now easily accessible through the new tools primitives in LangChain.js.
We've updated the docs to help you understand how to use all these native provider tools in your agent application; they are available in our Python packages as well as in our JavaScript packages. I've also uploaded the full tic-tac-toe example to GitHub, so you can check it out below and play against OpenAI or Anthropic yourself.
What if an AI could **actually use the browser**, not through brittle scripts, but by *seeing* the UI and deciding where to click? In this video, I explain how modern **agent tools** work and demonstrate it live by letting a model **play Tic-Tac-Toe in the browser**. The agent takes screenshots, reasons about the UI, and interacts with the page step by step, which is very different from traditional WebDriver-style automation.

I use this demo to unpack:

* How AI-driven UI interaction works under the hood
* Why **provider-native tools** enable more reliable agent behavior
* How this differs from classic, deterministic browser automation
* Where this approach makes sense, and where it definitely doesn't

If you've built browser automation, testing frameworks, or agentic systems, this should give you a concrete mental model for where things are heading.

🧑‍💻 Tic-Tac-Toe demo repository: [https://github.com/christian-bromann/tictactoe](https://github.com/christian-bromann/tictactoe)

📚 LangChain docs on native tools:

* OpenAI: https://docs.langchain.com/oss/javascript/integrations/tools/openai
* Anthropic: https://docs.langchain.com/oss/javascript/integrations/tools/anthropic