All right, so check this out. It all
started on one of those rare, quiet
weekends. My wife and son were out of
town and I had the whole house to
myself. So, what do I do? I decide to
solve a crime, a performance crime.
Let's dive right into this case. You
know, every good detective story needs a
crime scene. And mine, well, it wasn't
some dark, rainy alley. No, it was my
MacBook Pro just groaning under the
weight of an absolutely insane amount of
data. And here's the evidence. Get this.
A single text file, almost 14 GB.
Inside, 1 billion rows of weather data.
The mission sounded simple enough. Read
the file and for each weather station,
figure out the min, mean, and max
temperature. Simple, right? Yeah, not so
much. So, how long did my first, you
know, my simple Node.js script take? 5
minutes and 49 seconds. In the world of
performance, that's an eternity. I mean,
you could go make a coffee, maybe a
sandwich. I knew right then and there
this thing needed a proper
investigation.
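For reference, the video doesn't show the baseline script itself, but a naive first pass along these lines is the kind of thing being described: read the file line by line as strings and track min/max/sum/count per station in a Map. The `station;temperature` row format and the `measurements.txt` filename are assumptions here, not details confirmed in the video.

```js
// Minimal sketch of a naive baseline (not the exact script from the video).
// Assumes each row looks like "Hamburg;12.3".
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

async function naiveBaseline(path) {
  const stats = new Map(); // station -> { min, max, sum, count }
  const rl = createInterface({ input: createReadStream(path), crlfDelay: Infinity });

  for await (const line of rl) {
    const sep = line.indexOf(";");
    if (sep === -1) continue;
    const station = line.slice(0, sep);
    const temp = parseFloat(line.slice(sep + 1));

    const s = stats.get(station);
    if (s === undefined) {
      stats.set(station, { min: temp, max: temp, sum: temp, count: 1 });
    } else {
      if (temp < s.min) s.min = temp;
      if (temp > s.max) s.max = temp;
      s.sum += temp;
      s.count++;
    }
  }

  for (const [station, s] of stats) {
    console.log(`${station}: min=${s.min} mean=${(s.sum / s.count).toFixed(1)} max=${s.max}`);
  }
}

naiveBaseline("measurements.txt");
```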
Okay, so my first big clue, my first
hunch, it wasn't some crazy complex
algorithm or anything like that. No, it
was way more fundamental. It was about
strings. I had this nagging feeling that
the real culprit was hiding in plain
sight, something everyone overlooks. So,
here's the deal. The key thing to
understand, the file itself is encoded
in UTF-8, right? But JavaScript, it loves
its UTF-16 strings. Now, what does that
mean? It means that for every single one
of those billion lines, my code was
doing this expensive, time-consuming
conversion. My whole theory was, what if
I could just skip that? What if I could
work directly with the raw bytes? That
had to be a huge win. And this right
here, this was the trick. Instead of parsing a string like 12.3 and dealing with messy, slow floating-point math, I just treated the whole thing as an integer. I'd read each number byte, turn it into an integer, and just multiply by 10 as I went along. So 12.3 just became 123. Boom. No strings, no floats, just pure lightning-fast integer math.
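To make that concrete, here's a minimal sketch of the byte-level parsing idea, my reconstruction rather than the video's exact code: read the temperature's bytes straight off the buffer, skip the decimal point, and build up the value as an integer scaled by 10. It assumes at most one digit after the decimal point.

```js
// Sketch of the byte-level parsing trick (a reconstruction, not the video's
// exact code). Reads a temperature like "-12.3" straight from the buffer and
// returns it as an integer scaled by 10, so -12.3 becomes -123.
const MINUS = 0x2d; // "-"
const DOT = 0x2e;   // "."
const NEWLINE = 0x0a;

function parseTempAsInt(buf, start) {
  let i = start;
  let sign = 1;
  if (buf[i] === MINUS) {
    sign = -1;
    i++;
  }

  let value = 0;
  while (i < buf.length && buf[i] !== NEWLINE) {
    const byte = buf[i++];
    if (byte === DOT) continue;          // skip the decimal point
    value = value * 10 + (byte - 0x30);  // 0x30 is the byte for "0"
  }

  return { value: sign * value, next: i + 1 }; // next = index just past the line
}

// Example: the bytes for "12.3\n" become the integer 123
const { value } = parseTempAsInt(Buffer.from("12.3\n"), 0);
console.log(value); // 123
```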
And the results, oh man, they were good. I ran it on a smaller test file, just 10 million rows. The old way: three and a half seconds. My new byte-parsing trick: 1.8 seconds. We're talking almost 50% faster. But a good detective knows you
don't stop at the first clue. The case
was far from closed. So that first win
felt amazing. But I just knew there was
more speed to be found. The big problem
was I was still converting all the
station names into strings just so I
could use them as keys in a map. And let
me tell you, trying to solve that
problem led me down some seriously dark
alleys. Man, I tried everything. I
looked into string interning. Nope. Dead
end. I tried using raw buffer slices as
keys. Total disaster. Just corrupted all
my data because of how JavaScript maps
work. I even went so far as to build my
own custom hashmap from scratch. Hours
and hours of work. And you want to know
what happened with my brilliant
custom-built hashmap? It was slower.
Slower than the original code. It really
just goes to show you sometimes the
murder weapon is your own cleverness.
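A quick illustration of why those buffer-slice keys corrupted the data: a JavaScript Map compares object keys by reference, not by content, so two slices containing exactly the same bytes still count as two different keys.

```js
// Why raw buffer slices don't work as Map keys: Map uses reference identity
// for object keys, so identical bytes in two different Buffers never match.
const a = Buffer.from("Hamburg");
const b = Buffer.from("Hamburg");

const m = new Map();
m.set(a, 1);

console.log(m.get(a));    // 1
console.log(m.get(b));    // undefined -- same bytes, different object
console.log(a.equals(b)); // true -- the contents really are identical
```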
And then it was like 2:00 a.m. I was
probably half asleep and bam, it hit me.
The real breakthrough. You know those
moments where everything just clicks and
the whole case snaps into perfect focus?
The solution was just so simple, so
elegant. Instead of turning the station
name into a string, I just computed a
32-bit hash directly from its raw buffer.
I used that integer hash as the key in
my map. Think about that. I didn't
create a single new string inside the
main processing loop. The decoding, I pushed it all the way to the very end, after every single one of those billion rows was done.
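Here's a minimal sketch of that idea. The video doesn't name the hash function, so FNV-1a is a stand-in, and the helper names are mine: hash the station name straight from the buffer, key the Map on the 32-bit integer, and keep the raw name bytes around so they only get decoded to strings once, at the very end.

```js
// Sketch of the hash-as-key approach (FNV-1a is an assumed stand-in; the
// video doesn't name the hash). For simplicity this sketch ignores the
// possibility of hash collisions.
function fnv1a32(buf, start, end) {
  let hash = 0x811c9dc5;
  for (let i = start; i < end; i++) {
    hash ^= buf[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit int
  }
  return hash;
}

const stats = new Map(); // hash -> { nameBytes, min, max, sum, count }

// temp is the scaled-by-10 integer produced by the byte parser above.
function record(buf, nameStart, nameEnd, temp) {
  const key = fnv1a32(buf, nameStart, nameEnd);
  let s = stats.get(key);
  if (s === undefined) {
    s = {
      nameBytes: Buffer.from(buf.subarray(nameStart, nameEnd)), // copy once
      min: temp, max: temp, sum: temp, count: 1,
    };
    stats.set(key, s);
  } else {
    if (temp < s.min) s.min = temp;
    if (temp > s.max) s.max = temp;
    s.sum += temp;
    s.count++;
  }
}

// Only after the whole file is processed do the names become strings.
function report() {
  for (const s of stats.values()) {
    const name = s.nameBytes.toString("utf8");
    console.log(`${name}: min=${s.min / 10} mean=${(s.sum / s.count / 10).toFixed(1)} max=${s.max / 10}`);
  }
}
```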
So, the final time for the full 14 GB file: 1 minute and 14 seconds. We went from almost 6 minutes down to just over one. The case was pretty much solved, but I had to tie up one last loose end. I had to be sure.
Before I could officially close the book
on this thing, I had to rule out any
other suspects. I mean, what if it
wasn't my code at all? What if Node.js
itself was the real bottleneck? Was I
just blaming myself when the runtime was
the real culprit all along? So, I took
the exact same optimized code and ran it
on Bun and Deno. And what's super
interesting here is just how close they
all are. Yeah, Bun was a little faster,
just over a minute, but look at that.
They're all in the same ballpark. We're
talking seconds of difference. And that was all the proof I needed. The bottleneck was never the tool. It wasn't Node.js. It was my code. It was the
algorithm. And this is why you guys, you
have to be skeptical of those generic
benchmarks you see online. Your specific
problem is what really matters. And for
my final piece of evidence, I brought
out the big guns, the profiler. And this
right here, this was the smoking gun.
One single function, processChunk, was eating up 87% of all the time. It wasn't a memory problem. It wasn't a file I/O problem. It was a pure, raw CPU bottleneck.
My hot path was finally as clean as it
was ever going to get. So, what did this
whole investigation teach me? Man, it
was a total masterclass in hunting down
bottlenecks. Let's just quickly recap
the evidence. 78%. That's the final number. A 78% reduction in processing
time. We went from a slow, painful crawl
to an absolute sprint. And it was all
because we thought like a detective and
focused on the real evidence. So, let's
break it down. We won by getting rid of
all that string overhead, by using way
faster integer math, and by being clever
about when we actually did the work
using hashing. But the biggest takeaway
for you is this. Optimization is
detective work. You have to come up with
theories. You have to test them. And you
can't be afraid to chase down a few dead
ends to find the real culprit. Okay, so
we made the code 4.7 times faster on a
single core. But remember that profiler?
It told us we were completely CPU-bound.
One detective, no matter how good, can
only move so fast. So what's the next
logical step to go even faster? The next
case, it's all about calling for backup.
The next big win isn't going to come from more clever byte-level tricks. It's going to come from parallelization, splitting up the work across all the cores on my machine using Node.js worker threads. We're putting a whole team of detectives on this case.
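As a teaser of what that could look like, here's a rough sketch of the general fan-out pattern with node:worker_threads (not the implementation from the upcoming video): the main thread splits the file into byte ranges, one per core, and each worker would process its range with the same buffer and hash tricks, then send its partial stats back to be merged. The `worker.js` script is hypothetical.

```js
// Rough sketch of fanning work out across cores with worker_threads.
// A real version would also align each chunk boundary to a newline so no
// row is split between two workers.
import { Worker } from "node:worker_threads";
import { statSync } from "node:fs";
import { availableParallelism } from "node:os";

const FILE = "measurements.txt"; // placeholder path
const fileSize = statSync(FILE).size;
const workers = availableParallelism();
const chunkSize = Math.ceil(fileSize / workers);

const partials = [];
let done = 0;

for (let i = 0; i < workers; i++) {
  const start = i * chunkSize;
  const end = Math.min(start + chunkSize, fileSize);
  // "worker.js" is a hypothetical script that parses bytes [start, end) of
  // the file and posts back its partial per-station stats.
  const w = new Worker(new URL("./worker.js", import.meta.url), {
    workerData: { file: FILE, start, end },
  });
  w.on("message", (partialStats) => {
    partials.push(partialStats);
    if (++done === workers) {
      // merge the partial min/max/sum/count maps here
      console.log(`merged results from ${partials.length} workers`);
    }
  });
}
```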
And the best part, I'm already deep into that sequel.
And let me tell you, the results are
looking even more dramatic. We're
talking about maybe, just maybe, cutting
this time in half. Again, this is where
we get into the really advanced stuff.
You know, so if you want to see how we
crack the case of multi-threaded
performance in Node.js, you've got to
subscribe to the channel.
Learn how to optimize Node.js for processing large files: 14GB of data processed 78% faster using buffer streaming, byte-level parsing, and hash-based lookups. Complete guide with benchmarks, profiling insights, and code examples for handling 1 billion rows efficiently.

This is the video remake of my popular blog post about how I got Node.js to process 14GB Files 78% Faster with Buffer Optimization. The blog post has a lot more details and you can read about it at https://pmbanugo.me/blog/nodejs-1brc