Okay, thank you everyone. Hello, my name is Luke Sandberg. I'm a software engineer at Vercel working on Turbopack. I've been at Vercel for about six months, which has given me just enough time to come up here on stage and tell you about all the great work I did not do. Prior to my time at Vercel, I was at Google, where I got to work on our internal web toolchains and do weird things like build a TSX-to-Java-bytecode compiler and work on the Closure Compiler.
So when I arrived at Vercel, it was actually kind of like stepping onto another planet: everything was different, and I was pretty surprised by all the things we did on the team and the goals we had. So today I'm going to share a few of the design choices we made in Turbopack and how I think they will let us continue to build on the fantastic performance we already have.
So, to help motivate that, this is our overall design goal. From this you can immediately infer that we probably made some hard choices. Like, what about cold builds? Those are important, but one of our ideas is that you shouldn't be experiencing them at all, and that's what this talk is going to focus on. In the keynote, you heard a little bit about how we leverage incrementality to improve bundling performance. The key idea we have for incrementality is about caching. We want to make every single thing the bundler does cacheable, so that whenever you make a change, we only have to redo work related to that change. Or, to put it another way, the cost of your build should scale with the size or complexity of your change rather than the size or complexity of your application. This is how we can make sure that Turbopack will continue to give developers good performance no matter how many icon libraries you import.
So, to help understand and motivate that idea, let's imagine the world's simplest bundler, which maybe looks like this. Here's our baby bundler. This is maybe a little bit too much code to put on a slide, but it's going to get worse. Here we parse every entry point, follow their imports, and resolve their references recursively throughout the application to find everything you depend on. Then at the end, we simply collect everything each entry point depends on and plop it into an output file. So, hooray, we have a baby bundler.
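To make that concrete, here is a minimal TypeScript sketch of what a bundler like that might look like; the names, the toy parser, and the toy resolver are all illustrative, not Turbopack's actual code.

```ts
// A minimal sketch of the "baby bundler" idea (TypeScript, illustrative names;
// the toy parse and resolve below stand in for a real parser and resolver).
import { readFileSync, writeFileSync } from "node:fs";
import * as path from "node:path";

type Module = { path: string; source: string; imports: string[] };

function parse(filePath: string): Module {
  const source = readFileSync(filePath, "utf8");
  // Stand-in for a real parser: just scrape import specifiers out of the text.
  const imports = [...source.matchAll(/from\s+["'](.+?)["']/g)].map((m) => m[1]);
  return { path: filePath, source, imports };
}

function resolve(specifier: string, importer: string): string {
  // Stand-in for real resolution (node_modules, extensions, aliases, symlinks, ...).
  return path.resolve(path.dirname(importer), specifier);
}

function bundle(entryPoints: string[]): void {
  for (const entry of entryPoints) {
    const modules = new Map<string, Module>();
    const walk = (filePath: string): void => {
      if (modules.has(filePath)) return;
      const mod = parse(filePath); // shared files get re-parsed for every entry point
      modules.set(filePath, mod);
      for (const spec of mod.imports) walk(resolve(spec, filePath));
    };
    walk(entry);
    // Plop everything this entry point depends on into one output file.
    const output = [...modules.values()].map((m) => m.source).join("\n");
    writeFileSync(entry + ".bundle.js", output);
  }
}
```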
Obviously this is naive, but if we think about it from an incremental perspective, no part of it is incremental. We will definitely parse certain files multiple times, depending on how many times you import them. That's terrible. We'll definitely resolve the React import hundreds or thousands of times. So, you know, ouch. If we want this to be at least a little bit more incremental, we need to find a way to avoid redundant work.
So let's add a cache. You might imagine this is our parse function. It's pretty simple, and it's probably kind of the workhorse of our bundler: we read the file contents and hand them off to SWC to give us an AST. Now let's add a cache.
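Roughly, the naive cached version from the slide might look like this TypeScript sketch; the SWC call is stubbed out and the names are illustrative.

```ts
// A rough sketch of the naive cached parse function (TypeScript stand-in;
// in reality the contents are handed to SWC and an AST comes back).
import { readFileSync } from "node:fs";

type Ast = unknown; // whatever the parser returns

const parseCache = new Map<string, Ast>();

function parseFile(fileName: string): Ast {
  const cached = parseCache.get(fileName);
  if (cached !== undefined) return cached;

  const contents = readFileSync(fileName, "utf8");
  const ast = swcParse(contents);
  parseCache.set(fileName, ast); // the file name is the entire cache key -- suspicious!
  return ast;
}

declare function swcParse(source: string): Ast; // placeholder for the real parser
```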
Okay, so this is clearly a nice, simple win. But I'm sure some of you have written caching code before, and maybe there are some problems here. Like, what if the file changes? That is clearly something we care about. And what if the file isn't really a file, but is three symlinks in a trench coat? A lot of package managers will organize dependencies like that. We're also using the file name as a cache key. Is that enough? We're bundling for the client and the server, and the same files end up in both. Does that work? We're also storing the AST and returning it, so now we have to worry about mutations. And then finally, isn't this a really naive way to parse? I know that everyone has massive configurations for the compiler, and some of that has to get in here. So yeah, this is all great feedback, and this is a very naive approach, and to that of course I would say: yeah, this will not work. So what do we do about fixing these problems?
Please fix and make no mistakes.
So, okay, maybe this is a little bit better. You can see here that we have some transforms; we need to do customized things to each file, like maybe down-leveling or implementing use cache. We also have some configuration, and so of course we need to include that in our cache key.
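Sketched in the same TypeScript style as before (the Transform and Config shapes here are assumptions, not the real code), the "fixed" version might look something like this:

```ts
// The "fixed" cache: transforms and config folded into a hand-rolled key.
// Transform and Config are illustrative shapes, not Turbopack's actual types.
import { readFileSync } from "node:fs";

type Ast = unknown;
interface Transform { name: string; apply(ast: Ast): Ast }
interface Config { toJSON(): unknown }

const parseCache = new Map<string, Ast>();

function parseFile(fileName: string, transforms: Transform[], config: Config): Ast {
  // Is this key actually enough? Transform *names* plus a JSON dump of the
  // config -- exactly the kind of hand-written key that's hard to trust.
  const key = [
    fileName,
    transforms.map((t) => t.name).join(","),
    JSON.stringify(config.toJSON()),
  ].join("|");

  const cached = parseCache.get(key);
  if (cached !== undefined) return cached;

  let ast: Ast = swcParse(readFileSync(fileName, "utf8"));
  for (const transform of transforms) ast = transform.apply(ast);
  parseCache.set(key, ast);
  return ast;
}

declare function swcParse(source: string): Ast;
```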
But maybe right away you're suspicious: is this correct? Is it actually enough to identify a transform by its name? I don't know, maybe it has some complicated configuration all of its own. And is this toJSON value actually going to capture everything we care about? Will the developers maintain it? How big will these cache keys be? How many copies of the config will we have? I've actually personally seen code exactly like this, and I find it next to impossible to reason about.
Okay, we also tried to fix this other problem around invalidations. We added a callback API to readFile. This is great: if the file changes, we can just nuke it from the cache, so we won't keep serving stale contents. But this is actually pretty naive, because sure, we need to nuke our cache, but our caller also needs to know that they need to get a new copy. So, okay, let's start threading callbacks. Okay, we did it. We threaded callbacks up through the stack. You can see here that we allow our caller to subscribe to changes, we can just rerun the entire bundle if anything changes, and if a file changes, we call it. Great: we have a reactive bundler.
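Here is a TypeScript sketch of where that callback-threading approach ends up; all of the names are hypothetical and the file watching is heavily simplified.

```ts
// Callbacks threaded through every layer: readFile takes an onChange callback,
// the parse cache nukes its entry and notifies its caller, and the caller's
// only recourse is to rerun the whole bundle. Everything here is hypothetical
// and simplified (real watching is much more involved than fs.watch).
import { readFileSync, watch } from "node:fs";

type Ast = unknown;
const parseCache = new Map<string, Ast>();

function readFile(fileName: string, onChange: () => void): string {
  const watcher = watch(fileName, () => {
    watcher.close();
    onChange();
  });
  return readFileSync(fileName, "utf8");
}

function parseFile(fileName: string, onChange: () => void): Ast {
  const cached = parseCache.get(fileName);
  if (cached !== undefined) return cached;
  const contents = readFile(fileName, () => {
    parseCache.delete(fileName); // nuke the stale entry...
    onChange();                  // ...and tell our caller to ask again
  });
  const ast = swcParse(contents);
  parseCache.set(fileName, ast);
  return ast;
}

function bundle(entryPoints: string[], onAnythingChanged: () => void): void {
  for (const entry of entryPoints) {
    parseFile(entry, onAnythingChanged); // the same callback threaded everywhere
    // ...walk imports, resolve, produce outputs, all from the top every time
  }
}

// The caller "subscribes" by rebundling from scratch whenever anything changes.
const rebuild = (): void => bundle(["src/index.ts"], rebuild);
rebuild();

declare function swcParse(source: string): Ast;
```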
But this is still hardly incremental. If a file changes, we need to walk all the modules again and produce all the output files. We saved a bunch of work by having our parse cache, but this isn't really enough. And then there's all this other redundant work. We definitely want to cache the imports: we might find a file a bunch of times, and we keep needing its imports, so we want to put a cache there. And resolve results are actually pretty complicated, so we should definitely cache those so we can reuse the work we did resolving React.
But okay, now we have another problem: your resolve results change when you update dependencies or add new files. So we need another callback there.
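In the same sketchy TypeScript style, that's yet another hand-rolled cache with yet another callback; the resolution and watching helpers here are placeholders, not real APIs.

```ts
// A resolve cache keyed by (importer directory, specifier), plus a callback
// because results change when dependencies are updated or files are added.
// nodeResolve and watchResolutionInputs are hypothetical placeholders.
import * as path from "node:path";

const resolveCache = new Map<string, string>();

function resolve(specifier: string, importer: string, onChange: () => void): string {
  const key = `${path.dirname(importer)}|${specifier}`;
  const cached = resolveCache.get(key);
  if (cached !== undefined) return cached;

  const resolved = nodeResolve(specifier, importer); // walk node_modules, extensions, ...
  resolveCache.set(key, resolved);
  watchResolutionInputs(specifier, importer, () => {
    resolveCache.delete(key); // a lockfile changed, a file was added, ...
    onChange();
  });
  return resolved;
}

declare function nodeResolve(specifier: string, importer: string): string;
declare function watchResolutionInputs(specifier: string, importer: string, cb: () => void): void;
```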
And we definitely also want to cache the logic to produce outputs, because if you think about an HMR session, you're editing one part of the application, so why are we rewriting all the outputs every time? And oh, you might also delete an output file, so we should probably listen to changes there too. Okay, so maybe we solve all those things, but we still have this problem, which is that every time anything changes, we start from scratch. Kind of the whole control flow of this function doesn't work, because if a single file changes, we'd really want to jump into the middle of that for loop. And then finally, our API to our caller is also hopelessly naive. They probably actually want to know which files changed so they can push updates to the client. So yeah, this approach doesn't really work. And even if we somehow did thread all the callbacks in all these places, do you think you could actually maintain this code? Do you think you could add a new feature to it? I think this would just crash and burn. And to that I would say: yeah.
So once again, what should we do? Just like when you're chatting with an LLM, you actually first need to know what you want, and then you have to be extremely clear about it. So what do we even want? We considered a lot of different approaches, and many people on the team actually had a lot of experience working on bundlers, so we came up with these rough requirements. We definitely want to be able to cache every expensive operation in the bundler. And it should be really easy to do this; you shouldn't get 15 comments on your code review every time you add a new cache. And then, I don't actually really trust developers to write correct cache keys or track dependencies by hand, so we should definitely make this foolproof.
Next, we need to handle changing inputs. This is a big idea in HMR, but it matters even across sessions. Mostly this is going to be files, but it could also be things like config settings, and with the file system cache it actually ends up being things like environment variables too. So we want to be reactive: we want to be able to recompute things as soon as anything changes, and we don't want to thread callbacks everywhere. Finally, we just need to take advantage of modern architectures, be multi-threaded, and just generally be fast.
So maybe you're looking at this set of requirements and some of you are thinking: what does this have to do with a bundler? And to that I would say, of course, my management team is in the room, so we don't really need to talk about that. But really, I'm guessing a lot of you jumped to the much more obvious conclusion: this sounds a lot like signals. And yeah, I am describing a system that works like signals. It's a way to compose computations and track dependencies, with some amount of automatic memoization. I should note that we drew inspiration from all sorts of systems, especially the Rust compiler and a system called Salsa. There's even academic literature on this concept, called Adapton, if you're interested. Okay, so let's see what this looks like in practice, and then we're going to take a very jarring jump from code samples in JavaScript to Rust.
So here's an example of the infrastructure we built. A turbo function is a cached unit of work in our compiler. Once you annotate a function like this, we can track it: we can construct a cache key out of its parameters, and that allows us to both cache it and re-execute it when we need to. These Vc types here you can think of like signals; this is a reactive value. Vc stands for "value cell", but "signal" might be a slightly better name. When you declare a parameter like this, you're saying: this might change, and I want to re-execute when it changes. And how do we know that? We read these values via await. Once you await a reactive value like this, we automatically track the dependency. And then finally, of course, we do the actual computation we wanted to do, and we store it in a cell. Because we've automatically tracked dependencies, we know that this function depends on both the contents of the file and the value of the config.
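The code on the slide is Rust, but as a hypothetical TypeScript analogue of the same idea, it might look roughly like this; Vc, read, turboFunction, and swcParse here are illustrative stand-ins, not the real turbo tasks API.

```ts
// A TypeScript analogue of a turbo function (illustrative only; the real thing
// is a Rust function annotated so the framework can cache and re-execute it).
interface Vc<T> { readonly __value: T } // opaque reactive handle: a "value cell"

type ParseOptions = { jsx: boolean };
type Ast = unknown;

// Awaiting a reactive value reads it *and* records a dependency edge back to
// the currently executing task.
declare function read<T>(vc: Vc<T>): Promise<T>;

// Wrapping a function makes it a cached unit of work: its arguments form the
// cache key, and its result is stored in a cell and diffed against the old value.
declare function turboFunction<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>
): (...args: A) => Vc<R>;

const parse = turboFunction(async (file: Vc<string>, config: Vc<ParseOptions>) => {
  const contents = await read(file);  // dependency on the file contents
  const options = await read(config); // dependency on the config value
  return swcParse(contents, options); // the actual work; the result lands in a cell
});

declare function swcParse(source: string, options: ParseOptions): Ast;
```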
And every time we store a new result into the cell, we can compare it with the previous one, and if it's changed, we can propagate notifications to everyone who's read that value. This concept of change is key to our approach to incrementality. Again, the simplest case is right here: if the file changes, Turbopack will observe that, invalidate this function execution, and re-execute it immediately. And if we happen to produce the same AST, we'll just stop right there, because we computed the same cell. Now, for parsing a file, there's hardly any edit you can make that doesn't actually change the AST, but we can leverage the fundamental composability of turbo functions to take this further.
further. So here we see another turbopac
cache function
uh extracting imports from a module. Uh
you know you can imagine this is like a
very common task we have in the bundler.
We need to extract imports just to
actually find all the modules in your
application. Uh we leverage them to pick
the best way to group modules uh
together into chunks. And of course the
import graph uh is important to basic
tasks like tree shaking.
Um and so because there's so many
different consumers of the imports data,
a cache makes a lot of sense. So this
implementation isn't really special.
This is like what you would find in any
kind of bundler. We walk the a collect
imports into some special data structure
that we like um and then we return them.
But the key idea here is that we stored
them into another cell.
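Continuing the hypothetical TypeScript analogue from above, that second cached function might be sketched like this; the AST helpers are placeholders.

```ts
// A second cached unit of work: read the parsed module, pull out just the
// imports, and store them in their own cell. Downstream consumers (chunking,
// tree shaking) only invalidate when this list actually changes.
// Vc, read, turboFunction, and the AST helpers are illustrative stand-ins.
interface Vc<T> { readonly __value: T }
type Ast = unknown;
declare function read<T>(vc: Vc<T>): Promise<T>;
declare function turboFunction<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>
): (...args: A) => Vc<R>;

const extractImports = turboFunction(async (module: Vc<Ast>) => {
  const ast = await read(module); // dependency on the parsed module
  const imports: string[] = [];
  walkAst(ast, (node) => {
    if (isImportDeclaration(node)) imports.push(importedSpecifier(node));
  });
  // Edit a function body and the AST changes, so this re-runs -- but if the
  // import list comes out identical, nothing that read this cell is invalidated.
  return imports;
});

declare function walkAst(ast: Ast, visit: (node: unknown) => void): void;
declare function isImportDeclaration(node: unknown): boolean;
declare function importedSpecifier(node: unknown): string;
```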
So if the module changes, we do need to rerun this function, because we read it. But if you think about the kinds of changes you make to modules, very few of them actually affect the imports. You change the module, you update a function body, a string literal, any kind of implementation detail: it'll invalidate this function, we'll compute the same set of imports, and then we don't invalidate anything that has read this. If you think about this in an HMR session, it means we do need to reparse your file, but we really don't need to think about chunking decisions anymore, and we don't need to think about any tree shaking results, because we know those didn't change. So we can immediately jump from parsing the file and doing this simple analysis right to producing outputs. And this is one of the ways we get really fast refresh times.
So, this is pretty imperative. Another way to think about this basic idea is as a graph of nodes. Here on the left, you might imagine a cold build. Initially, we actually do have to read every file, parse them all, and analyze all the imports, and as a side effect of that, we've collected all the dependency information from your application. Then, when something changes, we can leverage that dependency graph we built up to propagate invalidations back up the stack and re-execute turbo functions. If they produce the same value, we stop there; otherwise, we keep propagating the invalidation.
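As a rough sketch of that propagation rule (hypothetical and heavily simplified; the real scheduler is multi-threaded and far more sophisticated):

```ts
// Invalidation propagation, sketched: re-execute a dirty task, diff the new
// cell value against the old one, and only wake the tasks that read it if the
// value actually changed. Task, deepEqual, etc. are illustrative stand-ins.
type TaskId = number;

interface Task {
  execute(): unknown;       // re-run the cached function body
  dependents: Set<TaskId>;  // tasks that read this task's cell
  value: unknown;           // current cell contents
}

function propagateInvalidation(tasks: Map<TaskId, Task>, dirty: TaskId[]): void {
  const queue = [...dirty];
  while (queue.length > 0) {
    const id = queue.shift()!;
    const task = tasks.get(id);
    if (task === undefined) continue;
    const next = task.execute();
    if (deepEqual(next, task.value)) continue; // same value: propagation stops here
    task.value = next;
    queue.push(...task.dependents); // changed: everyone who read this cell must re-run
  }
}

declare function deepEqual(a: unknown, b: unknown): boolean;
```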
So, great. This is actually kind of a massive oversimplification of what we're doing in practice, as you might imagine. In Turbopack today, there are around 2,500 different turbo functions, and in a typical build we might have literally millions of different tasks. So it really looks maybe a little bit more like this. Now, I don't really expect you to be able to read this; I couldn't really fit it on the slide. So maybe we should zoom out.
Okay, so that is not obviously helpful. In reality, we do have better ways to track and visualize what's happening inside of Turbopack, but fundamentally those work by throwing out the vast majority of the dependency information. Now, I'm guessing that some of you actually have experience working with signals, maybe bad experiences. I, for one, actually like stack traces and being able to step into and out of functions in a debugger. So maybe you're suspicious that this is a complete panacea; it obviously comes with trade-offs. And to that I would of course say... well, what I'd actually say is that all of software engineering is about managing trade-offs. We're not always solving problems exactly; we're really picking new sets of trade-offs to deliver value.
So, to achieve our design goals around incremental builds in Turbopack, we put kind of all our chips on this incremental, reactive programming model. And this, of course, had some very natural consequences. Maybe we actually really did solve the problem of hand-rolled caching systems and cumbersome invalidation logic; in exchange, we have to manage some complicated caching infrastructure. Of course, that sounds like a really good trade-off to me. I like complicated caching infrastructure. But we all have to live with the consequences.
The first, of course, is just the core overheads of this system. If you think about it, in a given build or HMR session, you're not really changing very much. We track all the dependency information between every import and every resolve result in your application, but you're only going to actually change a few of them. So most of the dependency information we collect is never actually needed. To manage this, we've had to focus a lot on improving the performance of this caching layer, to drive the overheads down and let our system scale to larger and larger applications.
The next and most obvious is simply memory. Caches are always fundamentally a time-versus-memory trade-off, and ours doesn't really do anything different there. Our simple goal is that the cache size should scale linearly with the size of your application, but again, we have to be careful about overheads.
This next one is a little subtle. We have lots of algorithms in the bundler, as you might expect, and some of them require understanding something global about your application. Well, that's a problem, because any time you depend on global information, it means any change might invalidate that operation. So we have to be careful about how we design these algorithms and compose things carefully so that we can preserve incrementality.
And finally, this one's maybe a bit of a personal gripe: everything is async in Turbopack. This is great for horizontal scalability, but once again it harms our fundamental debugging and performance-profiling goals. I'm sure a lot of you have experience debugging async code in the Chrome DevTools, and that is generally a pretty nice experience. Not always ideal, but I assure you, Rust with LLDB is light years behind. So to manage that, we've had to invest in custom visualization, instrumentation, and tracing tools. And look at that: another infrastructure project that isn't a bundler.
Okay, so let's take a look and see if we made the right bet. At Vercel, we have a very large production application. We think it's maybe one of the largest in the world, but we don't really know. It does have around 80,000 modules in it. So let's take a look at how Turbopack does on it for fast refresh. We really dominate what Webpack is able to deliver, but this is kind of old news; Turbopack for dev has been out for a while, and I really hope everyone is at least using it in development. The new thing here today, of course, is that builds are stable. So let's look at a build. Here you can see a substantial win over Webpack for this application. This particular build is actually running with our new experimental file system caching layer, so about 16 of those 94 seconds is just flushing the cache out at the end. This is something we're going to be working on improving as file system caching becomes stable. But of course, the thing about cold builds is that they're cold: nothing's incremental. So let's take a look at an actual warm build. Using the cache from the cold build, we can see this. This is just a peek at where we are today. Because we have this fine-grained caching system, we can actually just write out the cache to disk, and then on the next build, read it back in, figure out what changed, and finish the build. Okay, so this looks pretty good, but a lot of you are thinking, well, maybe I personally don't have the largest Next.js application in the world.
So let's take a look at a smaller example. The React.dev website is quite a bit smaller. It's also kind of interesting because, unsurprisingly, it's an early adopter of the React Compiler, and the React Compiler is implemented in Babel. This is kind of a problem for our approach, because it means that for every file in the application, we need to ask Babel to process it. And fundamentally, I would say we, or at least I, can't make the React Compiler faster. It's not my job; my job is Turbopack. But we can figure out exactly when to call it. So looking at fast refresh times, I was actually a little disappointed with this result. It turns out that about 130 of those 140 milliseconds is the React Compiler, and both Turbopack and Webpack are doing that work. But with Turbopack, after the React Compiler has processed the change, we can see, oh, the imports didn't change, chuck it into the output, and keep going.
Once again, on cold builds, we see this kind of consistent 3x win. And just to be clear, this is on my machine. But again, there's no incrementality on a cold build. In a warm build, we see this much better time. With the warm build, we already have the cache on disk; all we need to do, basically, once we start, is figure out what files in the application changed, re-execute those jobs, and then reuse everything else from the previous build. So the basic question is: are we turbo yet? Yes. This was discussed in the keynote, of course: Turbopack is stable as of Next.js 16, and we're even the default bundler for Next.js. So, you know, mission accomplished. You're welcome.
[Applause]
And if you noticed that revert, revert, revert thing in the keynote, that was me trying to make Turbopack the default. It only took three tries. But what I really want to leave you with, again, is this: we're not done. We still have a lot to do on performance and on finishing the swing on the file system caching layer. I suggest you all try it out in dev. And that is it. Thank you so much. Please find me and ask me questions.