A lot of people love Notebook LM, but if
you've watched this channel for a while,
then you'll notice that I've never
covered it. Although it's a cool tool,
as an AI power user, I always have this
need to tinker and customize everything
behind the curtain. And I always like to
be able to tap into the knowledge or
resources of whatever platform I'm
building on from APIs, MCP servers, so I
can get more leverage in other
applications. So instead of waiting for
the notebook LM team to build their
product exactly the way that I wanted
it, I took matters into my own hands.
[music] With the help of Claude Code, I
made my own NotebookLM, but on steroids.
Meaning I was able to replicate every existing piece of functionality that's
already out there and, on top of that, add
things like an MCP server, an API, and a
lot of other hidden features that people
have been asking for for months. But let
me tell you, this was not
straightforward. So, in this video, I'm
going to walk you through my version of
the tool, the exact design thinking
principles I applied, the architecture
decisions, and everything that you need
to see to be able to replicate this on
your own. And even if you're not
technical, trust me, you're going to
want to watch this whole video. It's on
the longer side, but I can guarantee
that you're going to learn a ton. So,
with that, let's get into it. So, this
is my version of Notebook LM Reimagined.
And if you can see the home screen here,
you have the ability to go back to
pre-existing notebooks. So, just like
you would with the normal notebook LM,
you can go to create a new notebook
right here. Add some form of emoji. I'll
call this demo. And we'll click on
create notebook. And then this will
spawn, as you will be familiar with, all
the features that you like on top of
some new ones. So, you have the ability
to create a multi-person podcast, one of
the more famous features of Notebook LM,
the ability to create videos, to do deep
research, to upload sources, and have
those sources auto-RAG'd, and then from
those sources create different
flashcards, quizzes, all the things that
you love and know about NotebookLM if
you've used the product. But instead of
me yapping about it, let me show you. So
let's go to one of our existing ones
called AI research notes. So you'll see
here if I drag this over, this has a
text about machine learning. So if I go
to something like the audio podcast and
I go to brief summary and I click on
something like generate here, this will
go and generate audio. I'll walk you
through exactly what it's generating
from where shortly, but this will take
around 10 to 15 seconds for the brief
version and come back with the famous
multi-person podcast. And when we get
back a response, we get something like
this where you can see a script back and
forth between Alex and Sam. And if we
click on this, you should be able to
hear the back and forth conversation.
>> Hey Sam, welcome back to the podcast.
Today, I wanted to dive into a term we
hear all the time, but maybe don't fully
get. machine learning.
>> Oh, perfect. I feel like it's
everywhere. My streaming service uses
it. My phone uses it.
>> You get the idea. So, the audio podcast
works all fine. On the video side, you
can generate an explainer video,
documentary, presentation. Here's the
history where you can pull up prior
videos. I'll just mute this right here.
Or yeah, let me give that a shot. I'll
just put this on no volume. You can see
generate small micro videos to longer
videos. And then just like the rest, you
have the ability to do deep research,
fast research using Gemini's actual
APIs. If we go to view right here,
you'll be able to see prior versions of
the research such as this very
sophisticated query I sent. Why is
machine learning so cool? It comes
back with sources. There's fast
mode. There's deep mode. On the study
materials, this section took quite a
while, but you can click on flashcards,
quiz, study guide. If I click on flash
cards, it will ask me: do you want basic,
intermediate, or advanced? Any specific
focus area? Number of items? So I can say,
like, give me 15. Let's do the intro to
machine learning course. We'll click on
generate. It will go and actually put
together the flash cards. And when it's
ready, it will pop it up at the left
hand side or the right hand side of the
screen. And you'll see I can just go
through all of my cards. I can expand
the screen. I can download them. I can
use keyboard shortcuts to go through all
the functionality that you'd want. And
then if you want a mind map, then we can
upload even more resources here if we
want. And then when we generate the mind
map again, we can customize it to our
needs. When it's ready to go, it will
use, behind the scenes, basically Mermaid
to allow us to pop this up, zoom in,
look at all the relationships, and very
similar to what you'd see elsewhere. And
then you'd go to, let's say, quiz to put this
together. If you say basic and click on
generate, you get an interactive quiz
that you can go through. Go to the next
question. I'm going to yolo this
response. Okay. Clearly I'm incompetent.
So I will go to FAQ now. Click on
generate. This will also seemingly
[snorts] work the same way. This will
pop up on the right hand side of the
screen. And you get the idea. So you'll
see all these FAQs generated. And the
beauty is everything's really fast. And
the reason why it's fast is the way we
built it. Just to round things off: data
tables, reports, slide decks. If I
click on slide deck, this will not just
create a normal slide by slide
breakdown, but it will also allow you to
download it as a PowerPoint file. And
you can see right here we have slide by
slide, agendas, and bullets. And if we
want, we can click on download as
PowerPoint. Right now, the way I've
configured it is not to be overly
beautiful, but it works, it's well
structured, and it gives you the
foundation for actually building all
this. It even renders the markdown.
And beyond that we do have things like
infographics which will use the nano
banana API. We can pick an infographic
or digital illustration style. Let's say
yeah, let's do digital illustration.
We'll do colorful and bold.
We'll click on generate and then this
will ping basically the API responsible
for generating images and it will come
back with the payload. And there we go.
We get not one but four different images
that we can download at the same time,
expand, go through one by one and it's
very colorful. I definitely didn't know
it was going to be this colorful, but it
works. But beyond that, there's more.
Obviously, I don't want you to think I'm
full of it. So, let's actually send over
a request to see how it works. If I do
something like summarize the key points,
this will take whatever's in the sources
and come back with a response grounded
in what those sources say. So you'll see
right here it says machine learning is a
subset of AI that allows systems to
learn and improve from experience. It
has the direct sources. You can click on
the sources, go to the original source,
and if there is a specific snippet, it
should show it here. You can always go
back to different chat histories to go
through your conversations or do
something that's harder to do on the
normal Notebook LM, which is downloading
it in whatever way you want. So, if you
want to download the entire set of
assets, chats, etc. from your personal
notebook LM, then this will export it as
a zip, and you'll be able to navigate
through all the resources. And you can
see right here, it's a 10 megabyte file.
It walks you through all the history, all
the images, everything that you've done.
And the best part is you control
everything. But wait again, there's
more. There's much more. You have the
ability to go to settings and then add a
system behavior where the notebook can take
the persona of a simple explainer, a
critical reviewer. You can have
different preferences on response
length, uh, tone, inline style, etc. And
then the part that I really built this
for, which is the ability to go to your
home screen, go to your settings, and
then go to the API request builder. Then
you can pick whatever operation you
want. Whether it's chatting directly
with the notebook via API, listing the
sources, generating the multi-person
podcast audio externally, generating the
video, deep research, flashcards,
everything you can imagine. And the cool
part is, if we go into any of these, this is an
example of me pinging this exact same
resource asking the following question.
So the question I asked here is: what are
the key insights from this document? I
can just change this to say: what is this
document about?
And then we will execute this
step. This will ping our API that's
actually hosted on a real server and
come back with a response. Is it machine
learning? Yes, it is. It says this
document provides an intro to machine
learning. So there it works and you can
see the citations right here. You can
see the suggested questions that come
after the input tokens as well as the
model used, which in this case is the
Gemini 2.5 flash model. You can use
whatever you want. I just happen to make
certain decisions based on familiarity
with the response patterns of certain
models versus others. And if you want to
use this anywhere else, I made it so
that it creates an HTTP request for
n8n, for Make, for Zapier, and all you
have to do is crank out your own API key
for your own service and then hit it
from wherever you want. And for whatever
reason, if you want to be able to share
the same infrastructure with your team,
a colleague, a partner, this is an
account-based system, meaning you can log
in, log out, create different accounts,
and each account will persist all the
notebooks and assets of those notebooks
within that account. And lastly, the
most important feature, the theme. Right
now, it's on the boring purple, but you
can go to my favorite midnight blue or
the cool crimson. Change your
experience, change your environment, and
go from there. So, now that I've proven
to you that I've recreated and added on
top of an existing platform, all from
scratch using Claude Code, let's get down
to brass tacks as to how I actually put it
together. So, to bring this to reality
with Claude Code, all we had to use was
Supabase, Vercel, the Gemini API, and I
added one more API on top of it for the
video gen because the Veo 3 models from
Gemini are eye-wateringly expensive. So
expensive that I accidentally spent a
hundred bucks just trying to iterate and
build the app. But theoretically, I
designed it this way because all of
these tools have a fairly generous free
tier if you just take the bare-bones
Claude Code plan, the bare-bones Supabase
plan, the bare-bones Vercel plan, and
Gemini's pay-as-you-go models.
Theoretically, you could make this GDPR-
and SOC 2-compliant by swapping out all the
services with, let's say, Amazon Bedrock
to use the Claude models, or Gemini
on Vertex AI in their cloud. Or you could
switch it up to run on local models if
you want. And like you saw, there are
three core ways to interact with my
version of the app, which is not just
the web app that you know and love, but
the API as well as anywhere else like
n8n or Zapier. And just in case you're
not imagining big enough: up until now,
doing RAG in n8n has been possible, and a
lot of people use Supabase or the Gemini
file search. This is probably one of the
most convenient and cool ways to have
RAG externally, since it's all done at
the browser level. Now, the TL;DR of the
app is that it's built on what's called
Next.js, and it sends requests to Vercel,
which is not only hosting the actual
platform itself, but on top of that,
it's hosting all of the different API
endpoints. And if you don't know what an
API endpoint is: API stands for
application programming interface, and an
endpoint just means a URL you can hit to
reach the back end of a service. So in
case that doesn't really register, whenever you use
something like Gemini or ChatGPT on the
front end and you say make me an image
or make me a video of a monkey running,
that behind the scenes is calling
different tools and services. Those
services are the video generators, the
image generators, the audio generators,
etc. And then when you make that
request, it goes and does the dirty work
behind the curtains and brings you back
the result. So what makes builds like
this, and all kinds of things you can do
in Claude Code, powerful is that we now
live in a world where everything is
modular. You can have the foundation
built the way you want. Then you can
port in whatever services that you want
that you feel are best for whatever
you're trying to do. And to store and
support the app, we have Supabase
storing everything in the database. We
have Gemini responsible for the majority
of the RAG, since it's quick, it's
efficient, and most importantly, it's
cheap. And when it comes to storing
everything so you can download it all at
once and export your entire notebook,
this is all stored at the Supabase level,
in Supabase file storage. So diving deeper
into the nitty-gritty: we have Claude Code,
where we're building, like we said, our
version of NotebookLM. We have the Gemini
text-to-speech API for the back-and-forth
podcast. We're using the Alibaba video
model; I use this one because it's 10 times
cheaper, if not more, than the latest
Veo 3.1 model. We have, like I said before,
Supabase, which is amazing as a vector
database and for authentication. And then
we're using Vercel, like I said, for
deployment. And then we're using FastAPI.
This is a framework that lets you create
the services I referred to before, the
ones you can hit from any other application.
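To make that concrete, here's a minimal sketch of what a couple of those endpoints could look like in FastAPI. This isn't the exact code from my build; the route names, models, and the in-memory "database" are all just illustrative.

```python
# Minimal FastAPI sketch of two NotebookLM-style endpoints.
# Illustrative only: route names, models, and storage are hypothetical.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="NotebookLM Reimagined API")

class ChatRequest(BaseModel):
    question: str

# In the real build this lives in Supabase; a dict keeps the sketch self-contained.
NOTEBOOKS = {"demo": {"name": "AI research notes", "sources": ["intro_to_ml.pdf"]}}

@app.get("/api/notebooks")
def list_notebooks():
    # GET: retrieve what's available (the notebooks and their names).
    return [{"id": nb_id, **nb} for nb_id, nb in NOTEBOOKS.items()]

@app.post("/api/notebooks/{notebook_id}/chat")
def chat_with_notebook(notebook_id: str, req: ChatRequest):
    # POST: send a question in, get a grounded answer back.
    # Here you'd call Gemini with the notebook's sources as context.
    return {"notebook_id": notebook_id, "answer": f"(stubbed answer to: {req.question})"}
```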
And if you're a bit more curious on
this, give me a second. Now, the app as
a whole has 50 API endpoints. And these
endpoints include the ability to talk
to, create, list all the notebooks, the
sources, go directly to chat with an
existing notebook, and create all the
assets that I showed you before. And
obviously, I'm not the one who built
this myself. I'm the one who told
Claude, "Okay, let's build the
functionality. Let's use this service,
and then let's create a way that I could
access said service or create said asset
from said service externally." And if we
go back to the app, you'll see that at
the right hand side of the settings tab,
I created this API documentation tab
where you can click on everything from
chat to global chat to audio overviews,
video overviews, the study materials. It
will show you each and every way that
you can interact with this service. And
in case you're not technical, here's the
biggest TL;DR of TL;DRs: when you see
the word POST here, this just means that
you're sending something to the service
and you want to get back some form of
response. A GET basically means that you
are trying to retrieve what's available.
So if you're asking, what notebooks do I
have, what are their names? This is
where a GET request would make
sense. PATCH is really for updating,
which you would rarely use, and DELETE
is deleting.
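And to make the verbs concrete, here's roughly what hitting the service looks like from the outside, whether that's a Python script, n8n, or Make. The base URL, paths, and header are placeholders for your own deployment and API key, and the routes mirror the hypothetical ones from the earlier sketch.

```python
# Calling the deployed API from anywhere (Python here; n8n/Make would do the same over HTTP).
# The URL, paths, and header name are placeholders for your own deployment.
import requests

BASE_URL = "https://your-app.vercel.app/api"        # hypothetical deployment
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # your own key

# GET: what notebooks do I have, and what are their names?
notebooks = requests.get(f"{BASE_URL}/notebooks", headers=HEADERS).json()

# POST: send a question to one notebook and get a grounded answer back.
payload = {"question": "What is this document about?"}
answer = requests.post(
    f"{BASE_URL}/notebooks/{notebooks[0]['id']}/chat",
    headers=HEADERS,
    json=payload,
).json()
print(answer["answer"], answer.get("citations"))
```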
What happens when you send a query or
request? Well, behind the scenes, you ask
a question. Your question goes to the back
end that I've now shown you. The back end
goes through the research papers, any
YouTube videos, anything you've added to
the sources. It sends the context and the
question to Gemini's File Search API, and
it comes back with a response with
citations. Now, to get to the point
where you can actually ask the question,
you obviously have to upload a file. So
when you upload a file, it goes to
Supabase storage, and then it gets
synced and sent to the Gemini Files API.
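Roughly, that upload-and-sync step could look like this, assuming the supabase-py and google-genai SDKs; the bucket name, paths, and environment variables are placeholders.

```python
# Rough sketch of the upload flow: file -> Supabase storage -> Gemini Files API.
# Assumes the supabase-py and google-genai SDKs; bucket, paths, and keys are placeholders.
import os
from supabase import create_client
from google import genai

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_SERVICE_KEY"])
gemini = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def add_source(notebook_id: str, local_path: str) -> str:
    # 1) Keep the original file in Supabase storage so exports/downloads work later.
    with open(local_path, "rb") as f:
        supabase.storage.from_("sources").upload(
            f"{notebook_id}/{os.path.basename(local_path)}", f.read()
        )

    # 2) Sync the same file to Gemini so it can be used as grounding context.
    uploaded = gemini.files.upload(file=local_path)
    return uploaded.name  # store this handle in your sources table
```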
This basically does RAG on the fly. Now,
you can use whatever RAG you want, but
for my purposes, since I'm already using
Gemini in quite a few areas, I just use
the File Search API as well, and
primarily because I'm lazy: that
API already supports things like PDFs,
DOCX files, and text files, so I didn't
want to have to go the next natural step
and teach Supabase exactly how to handle
those files and how to RAG them. I wanted
to take the path of least resistance. So
what happens is, when the user sends a
query, that query goes and pings the
Gemini File Search API, which has its own
versions of these notebooks. It then
goes through the chunks associated with
those notebooks to find the most
semantically similar, i.e., the closest
match to the vector coming in. So the
query "how does X work" turns into a
vector. That vector is then sent to the
File Search API to look for the closest
matching vectors, and then you get the
response along with the citations
associated with that response.
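The actual build leans on Gemini's File Search tool to do that chunking and vector matching server-side. As a simpler stand-in, here's a sketch that just passes the notebook's uploaded file handles plus the question to Gemini; it approximates the same "query every source, get a grounded answer" behavior.

```python
# Simplified sketch of the chat step. The real build uses Gemini's File Search tool,
# which does the chunking and vector matching server-side; this approximation just
# passes the notebook's uploaded files plus the question to the model.
import os
from google import genai

gemini = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def chat_with_notebook(file_handles: list, question: str, persona: str = "") -> str:
    # file_handles are the objects returned by gemini.files.upload() for every source
    # in the notebook, so the answer is grounded in all of them at once.
    prompt = (
        f"{persona}\n"
        "Answer using only the attached sources and cite them.\n\n"
        f"Question: {question}"
    )
    response = gemini.models.generate_content(
        model="gemini-2.5-flash",
        contents=file_handles + [prompt],
    )
    return response.text
```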
key trick here is that because NotebookLM
allows you to ask a question across all
the sources in the notebook, we had to
make sure that when we send a question
from the user interface, it goes and
queries and checks all the knowledge
sources in that UI. And this is really
where the devil in the details comes in,
and where having something like Claude
Code as your companion can help push you
through all of these conceptual barriers.
Now, how did I make the audio overview
podcasting work? Am I a prodigious
genius? Absolutely not. What I did do is
take advantage of the fact that Gemini
has a text-to-speech API that allows for
multi-speaker output. So step one is,
when we send the request, it looks at
the sources that we have. It then injects
the context of those sources and creates
a script with Gemini 2.5 Pro. You can use
whatever you want. It then creates the
speech in multi-speaker mode using Gemini
text-to-speech, which I think also uses
2.5 as the base, and then it creates an
MP3 audio file that we render on the user
interface itself.
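For reference, here's roughly what that multi-speaker TTS call looks like with the google-genai SDK. The model name, voices, and config shape follow the public TTS docs, but treat it as a sketch and double-check the current documentation.

```python
# Sketch of the multi-speaker TTS step, assuming the google-genai SDK.
# Model name, voices, and config shape follow the public TTS docs; verify against current docs.
import os
from google import genai
from google.genai import types

gemini = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def podcast_audio(script: str) -> bytes:
    # `script` is the Alex/Sam dialogue that Gemini 2.5 Pro wrote from the sources.
    response = gemini.models.generate_content(
        model="gemini-2.5-flash-preview-tts",
        contents=script,
        config=types.GenerateContentConfig(
            response_modalities=["AUDIO"],
            speech_config=types.SpeechConfig(
                multi_speaker_voice_config=types.MultiSpeakerVoiceConfig(
                    speaker_voice_configs=[
                        types.SpeakerVoiceConfig(
                            speaker="Alex",
                            voice_config=types.VoiceConfig(
                                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
                            ),
                        ),
                        types.SpeakerVoiceConfig(
                            speaker="Sam",
                            voice_config=types.VoiceConfig(
                                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Puck")
                            ),
                        ),
                    ]
                )
            ),
        ),
    )
    # The API returns raw PCM audio; the app wraps/converts it (e.g., to WAV or MP3) before rendering.
    return response.candidates[0].content.parts[0].inline_data.data
```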
And like we said before, we have a
deep-dive version, a brief-summary
version, and then a debate mode where the
two hosts just go at it. For the video,
even using the Alibaba model at scale,
especially if you want to do a 10-second
video or five of them, it'll still cost
you three, four, five bucks, but it's way
cheaper than Veo, which will cost you an
arm and a leg. So, the way this one
works is, again, we use 2.5 Pro to
create the description of the scenes.
Now, it's 5 seconds or 10 seconds, so we
don't have too many scenes here. And
then it goes and sends it to Atlas
Cloud. That's where the Wan 2.5 model
from Alibaba is hosted. Again, you can
swap this out for whatever API you want.
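As a sketch of that two-phase flow, this is the general shape: Gemini writes the scene description, then a hosted video model turns it into an MP4. The video endpoint and payload below are made-up placeholders, not Atlas Cloud's real API, so swap in your provider's actual documentation.

```python
# Sketch of the video flow: scene description with Gemini, then a hosted video model.
# The video endpoint and payload here are hypothetical placeholders; use your provider's real docs.
import os
import requests
from google import genai

gemini = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def generate_explainer_video(source_summary: str) -> bytes:
    # Phase 1: turn the notebook's sources into a short scene description.
    scenes = gemini.models.generate_content(
        model="gemini-2.5-pro",
        contents=f"Write a 10-second explainer video scene description for:\n{source_summary}",
    ).text

    # Phase 2: send that prompt to whatever hosted video model you've picked
    # (Wan 2.5 on Atlas Cloud in my case); this URL and payload are made up.
    resp = requests.post(
        "https://api.example-video-host.com/v1/generate",
        headers={"Authorization": f"Bearer {os.environ['VIDEO_API_KEY']}"},
        json={"prompt": scenes, "duration_seconds": 10},
        timeout=600,
    )
    return resp.content  # MP4 bytes, which the app then stores in Supabase storage
```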
All you'd have to do is give it the
documentation of that API, throw it into
Claude, say go and swap Wan for this
other one (pun intended), and then go
from there. And then, unlike before where
we had an MP3, the output this time is an
MP4. And to render it on screen, this is
why it's stored in Supabase storage,
which also allows you to share it as
well. On the deep research side, like we
said before, you could send whatever
query you want. So you could say, who is
Mark Kashef? And I could say research.
This will go and do its thing. And
behind the scenes, this is what it's
doing. It has the Gemini 2.0 flash fast
mode. You could update this to the
latest model, Gemini 3. Whatever you
want. This is just super cheap, and I
wanted to prove out the concept more so
than worry about the quality, per se, or
you could use the Gemini 2.5 Pro deep
research mode, which is markedly more
expensive. So, be careful with that one,
too. Behind the scenes, you as the user
send a query. That query is sent to the
backend. The backend then sends it to the
Gemini API to generate a report and the
citations associated with it, pulls that
from the API, pushes it over to Supabase,
and then displays it on your user
interface at the end.
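A minimal sketch of the fast-research call might look like this, assuming the google-genai SDK's Google Search grounding tool; the prompt and model choice are just examples.

```python
# Sketch of the fast research mode: one Gemini call with Google Search grounding.
# Assumes the google-genai SDK's search tool config; model choice is up to you.
import os
from google import genai
from google.genai import types

gemini = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def fast_research(query: str) -> dict:
    response = gemini.models.generate_content(
        model="gemini-2.0-flash",
        contents=f"Research this and write a short cited report: {query}",
        config=types.GenerateContentConfig(
            tools=[types.Tool(google_search=types.GoogleSearch())],
        ),
    )
    # The report text goes to Supabase; the grounding metadata holds the source links.
    return {
        "report": response.text,
        "sources": response.candidates[0].grounding_metadata,
    }
```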
And this is where you see the response
back in chat when you come back with a
response. Now the study materials, how
did we take care of these? So the
flashcards, the quiz, all of these I had
to target one functionality at a time.
That's the key thing with vibe coding
that a lot of people miss, especially
when they talk about all these
loosey-goosey frameworks to one-shot a
whole app. You don't one-shot this. You build
this incrementally, one feature at a
time. As you build each feature,
technically each feature is a separate
chat because each one needed a little
back and forth to perfect the way it
behaves, the way it works, which API
it's using, how fast it was, how it
rendered on screen. These are all
details that matter. Which is why when
people go on X or YouTube and say, "I
can build this whole thing in like 10
minutes with one mega prompt and run
Claude autonomously on its own." Odds
are you won't get the level of detail or
output or quality that you're looking
for. So for the study materials
generation, we're using 2.0 Flash across
the board. And starting off with flash
cards, by default, we're generating 10
flash cards. And once we have all the
content and the raw content, assuming
that there's a custom AI personality, it
will take that into account, which is
why this whole interoperability of the
app is important. If we set a notebook
setting for it to be more authoritative
or simple, that should be injected in
most of the features outputs. So then if
there is a custom personality, that's
injected into the prompt. And then we
get the flash cards and then usage
stats: basically, if you pass or fail
guessing it, it will document that and
then store it in Supabase. So all
these functionalities here are basically
doing the same thing, where they inject
all the context in memory, within the
context of creating the flashcards or
the quiz, and send it to different
system prompts. The one for the quiz will
say: go and make X number of questions
and answers based on all this material.
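As an example of that pattern, here's a hedged sketch of the flashcard generator: one prompt with the difficulty, count, and persona injected, and structured JSON coming back. The field names and prompt wording are illustrative.

```python
# Sketch of the study-material generation: one prompt, structured JSON back.
# Assumes the google-genai SDK; the prompt wording and fields are illustrative.
import json
import os
from google import genai
from google.genai import types

gemini = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def generate_flashcards(source_text: str, count: int = 10, level: str = "basic", persona: str = "") -> list:
    prompt = (
        f"{persona}\n"
        f"Create {count} {level} flashcards from the material below. "
        'Return JSON: [{"front": "...", "back": "..."}].\n\n'
        f"{source_text}"
    )
    response = gemini.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt,
        config=types.GenerateContentConfig(response_mime_type="application/json"),
    )
    return json.loads(response.text)  # stored in Supabase, then rendered as cards in the UI
```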
It will then inject all that material
inside. Meaning there is an upper bound,
the way I built it, on how many sources
you can add, because Gemini as of today
has a 1-million-token context window. If
you have, I don't know, 500 or 700 pages,
you will start to hit the limit of that
context. Now, are there ways around that,
especially engineering ways? Absolutely.
I wanted to build a foundation and give
you the foundation so you can do
whatever you want with it. Now the
second-to-last set of features is the
creative outputs, which are, again, the
data table export, reports, slide decks,
and infographics. And the way this works
behind the scenes is: you have the core
sources, and then you have the Gemini
API, and we always output the result as
JSON; from that JSON, we transform it
into whatever the end state needs to be.
This applies
to the PowerPoint and the report
primarily because those need to be
transformed from JSON into a PowerPoint
file and a DOCX file. The infographic
from Gemini comes back as a JPEG which
we render on screen as a PNG. So you
don't really have to worry about that.
And the data table renders as an Excel
CSV file. So phase 2 is taking that data,
and I chose to use free JavaScript
libraries to create the DOCX and
PowerPoint files.
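My build does this step with free JavaScript libraries, but to show the idea in one place, here's the same JSON-to-PowerPoint transformation sketched as a Python analogue with python-pptx; the deck structure is the hypothetical one described below.

```python
# The build uses free JavaScript libraries for this step; here's the same idea as a
# Python analogue with python-pptx, just to show the JSON -> PPTX transformation.
from pptx import Presentation

def deck_json_to_pptx(deck: dict, out_path: str) -> None:
    # deck looks roughly like: {"title": "...", "slides": [{"title": "...", "bullets": ["..."]}]}
    prs = Presentation()

    title_slide = prs.slides.add_slide(prs.slide_layouts[0])
    title_slide.shapes.title.text = deck["title"]

    for slide_data in deck["slides"]:
        slide = prs.slides.add_slide(prs.slide_layouts[1])  # title + content layout
        slide.shapes.title.text = slide_data["title"]
        body = slide.placeholders[1].text_frame
        for i, bullet in enumerate(slide_data["bullets"]):
            para = body.paragraphs[0] if i == 0 else body.add_paragraph()
            para.text = bullet

    prs.save(out_path)  # this file is what gets stored in Supabase and downloaded
```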
But here's where you could add your own
flavor: use a Claude skill, like the
PowerPoint skill or the Excel skill, and
the API associated with it to really make
these files beautiful and powerful. And
last but not least, once it's stored in
Supabase, we want to make sure that we
can actually auto-download it from the
browser itself. And if you're non-technical,
this is what's happening behind the
scenes. The user clicks generate. This
creates the slide deck, which comes back
as title, sections, and slides in the
JSON itself, just data. And then we
store that in Supabase, and then we go
and create the PowerPoint, DOCX, and CSV
files from it, in a format that you can
download onto your computer. As you can
see here,
there are multiple passes to go from
click to output. And last but not least,
when you want to configure the settings
of the notebook to be simple explainer,
critical reviewer, you want to set the
preferences, a lot of this is prompt
manipulation. So if you go to this part
of the screen, no persona is just
business as usual. Critical reviewer
questions assumptions, finds weaknesses,
basically pushes back on you. Simple
explainer, self-explanatory. Technical
expert is going to be more on the
technical side of things. Creative
thinker probably what I would choose.
And then custom is where you can write
your own custom instructions that apply
to the response that you get back from
the Gemini File Search API. So the output
preferences, once again, are pass-through
parameters in the system prompt. So when
we go here, anytime you trigger an API
call, this will build the persona
instructions and then inject them into
any API call we make to Gemini, so that
when we ask a question, we're always
injecting this into the payload of the
call to that service. So as long as these
custom instructions persist, this keeps
getting injected.
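In practice, that persona injection is just string assembly before every model call. Here's a small sketch of what that helper might look like; the setting names and persona texts are made up for illustration.

```python
# Sketch of the persona injection: the saved settings become a prompt prefix that is
# prepended to every Gemini call. Field names here are illustrative.
def build_persona_instructions(settings: dict) -> str:
    personas = {
        "critical_reviewer": "Question assumptions, find weaknesses, and push back.",
        "simple_explainer": "Explain everything as simply as possible.",
        "technical_expert": "Be precise and technical.",
    }
    parts = [personas.get(settings.get("persona", ""), "")]
    if settings.get("custom_instructions"):
        parts.append(settings["custom_instructions"])
    if settings.get("response_length"):
        parts.append(f"Keep responses {settings['response_length']}.")
    return "\n".join(p for p in parts if p)

# Every feature (chat, flashcards, research, ...) then does something like:
# prompt = build_persona_instructions(notebook_settings) + "\n\n" + feature_prompt
```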
So, that's an overview of the app from an architecture standpoint.
Now, I'll go through an example of what
a PRD or a product requirements document
might look like to be able to accomplish
and build this for yourself. And on top
of that, I'll walk through some starter
prompts for each type of functionality
that you might want to be able to think
about. And I'll make available to you
along with all of this, a whole care
package in the second link in the
description below. Now, to make this as
straightforward as possible for you to
take some of these files and recreate
the whole process and really make it
your own, what I did is initially to
create the V1, V2, V5000 of this
project, I went to Perplexity Pro,
specifically the labs feature. I like
labs because you can ask it to go look
for the latest documentation on a series
of different software or APIs and then
tell it to optimize and create a markdown
file that is also optimized for the
latest version of Claude Code. So while
it goes and researches the requirements
of the APIs and services that you need,
it's all also grounded in the latest
version of Claude Code. So it takes that
into account when putting together that
prompt. But naturally, I had to make
many iterations to get to the point of
the demo that I've been showing you. So
what I asked it to do is a bit of a
retroactive exercise, which I honestly
recommend for all of you, whatever it is
you end up building. So, it's one thing
to write an initial plan, but it's
another thing to finish exactly what you
set out to finish, and then you go back
and tell Claude Code something along
these lines. Go through all the code
that you've put together and all the
features that we've implemented ever
since this initial plan. And then you
can tag your initial plan. And the best
part of this is that over time as you
build different projects, every single
time you do this mega reverse prompting
on the project, you not only learn what
you could have done better in the
planning, but even if you're not
technical, you'll start to learn these
concepts by osmosis. So, if we take a
look at this vision document, obviously
there's a lot here, but you'll notice
the way it's designed is it's using
something called ASCII art. The ASCII art
is right here, where it creates these
tables and these diagrams. So instead of
actually having a Mermaid diagram, this
is written in markdown. So Claude can
actually read this by the way. So if you
have a way to communicate how a system
should flow, this in my opinion is one
of the mega hacks that you can do to
convey that to the AI. So in here we
start off saying for Claude, this is how
to use this document. This document is a
three-part specification for building
notebook LM reimagined. Before starting,
ask the user to set up these
prerequisites. So I made this for you so
that if you are less technical this will
ideally hold your hand and tell you okay
go and make a Supabase project, go and
get the authentication token so you can
use the Supabase MCP to make life
easier. Go and set up Vercel, because
Vercel is where you're going to host
the platform as well as all the API
endpoints that we'll hit externally if
you want to be able to use it from
n8n or Zapier. I also tell it to read
the documents in this order. So, first
the vision document, two the project
spec, and three the implementation
guide. Now, one thing that's heavily
underrated is talking to Claude like an
actual human. So, it's one thing to give
it specs. I see people just give it a
grocery list of things to build over
time, but it's a completely other thing
to tell it this is why I want you to
build this, like this is the inspiration
behind it. If you can explain the
context very deeply into a project, you'll
be surprised: it might do something
thoughtful that you didn't expect,
because it understands the core
foundation of the direction and where
you want to go. So it's one thing to say,
I want to build a clone of this app
where I can auto-RAG a series of sources
and then do all these following features,
versus saying, I wish I could give this
product the ability to interact with it
externally, beyond just the browser-based
platform. So here we go through the
design principles. So, API-first always:
meaning for every single piece of
functionality you make, make sure you're
thinking in the back of your head about
how you can make it accessible
externally. And then, in this case, you
can use whatever database you want; for a
lot of vibe coders I usually recommend
Supabase, because its MCP is good enough
when you're building the MVP that it can
take control of building all the tables,
adjusting the tables, and building the
edge functions that you need. And then,
as we go through, this is the
architecture overview, so I'm giving an
example of how we want to be able to
interact with the back end. And this is
the series of gaps that it filled. So
these are all the API endpoints that
it's going through, and it tells you:
API, notebooks, sources, chat, audio, video.
So this makes it abundantly clear
exactly what the end state of this API
should look like. And then you have the
services we'll use, what we're building,
and all the features and sub features.
And then I tell it what we don't want.
So in this case, I didn't want to build
this to run locally. I wanted to build
it so I could run it in the cloud. But
you could totally make a version of this
that's like an open notebook LM where
you could make it run on everything
that's on your desktop. And then it goes
through what the user experience should
look like from a developer standpoint,
for no-code users, and for Gemini, with
the different model references. So here's where we're
mapping all the models to the features.
So you'll see right here, if we want a
fast chat, then we'll use Gemini 2.0
Flash. You could use three, you could
use 2.5, whatever you want. For text to
speech, we're telling it to use this
specific model. For typical operation
costs, this was helped quite a bit
initially by Perplexity Labs, because
it went and searched all the costs of
these services. The philosophy: this
is what NotebookLM is, and then NotebookLM
Reimagined is, here's our platform, build
whatever you want. So basically build a
solid ground that's modular enough that
any one of you can hook up whatever
services or swap whatever services that
you're looking for. And then we go
through and we tell it next steps and
this is also underrated at the end of
the document to tell it what is the next
step since there's some recency bias
there. So this will exit the first
document, and then, let's remove this and
go to the project spec. This will go
through a table of contents: what
the MCP setup should look like, how we
should use AI Studio in terms of the
APIs and all the models that we need to
use, and the Vercel setup. Again, I use
the Vercel MCP to make my life a little
bit easier. You go, you create the
account, you hook up the MCP with one
line. If you don't know how to do the
one-line setup, all you have to do is go
to something like Perplexity and say: can
you give me the one-line installation
command that I can use to install the
Vercel MCP, and then give me any
extra links I would need to go
grab whatever tokens or whatever
parameters I need to fill into that one
line. So, we can make it a one-shot
operation. So this tells whatever
research platform you want to go and
look through the documentation to come
back with that command and ideally it
should write in caps what you need to
swap out. So these are, nine times out of
ten, actual API keys or tokens. So you can
see here it tells you to grab your
Vercel token, and it tells you this is
the command that you paste. It would be
this whole thing right here, and then you
would sub in the token that we grab from
here. And it tells you what you need to
fill in, where you can fill it in from,
and the format you need to fill it in,
and then you should be good to go from
there. So, you copy that once you have
those credentials. You put it in your
Claude Code. It won't work right away.
You'll have to restart Claude Code or
start a brand-new session, then do
/mcp, and then make sure that it's
there. And beyond that, we get a glimpse
of the system architecture, the
directory structure. This is really
advanced. This basically is telling it
how to organize everything. Again, much
easier to be retroactive than proactive.
But once you see this, again, especially
if you're not a technical person, you
can start to see the logic of how the AI
likes to categorize different, in this
case, Python scripts, which are a proxy,
a direct proxy for functionality. So,
when it comes to routers, you'll see
that all of these endpoints like
creating the notebook, the sources, the
chat audio, all of them are under this
routers folder. So, you can start to see
how even the front end is organized. So
even if you have never been able to
appreciate this as a non-technical
person, you can start to really learn
from how this is applied. This next
section tells it how to interact with the
Supabase MCP. You can override these
features. This is the way that I built
my table. You can make it your own. And
then the rest of this are a series of
specs that you can read because again
I'll be giving this to you. And last but
not least, we have the implementation
guide which basically tells it what the
prerequisites are if you want to get
started. I made it so that it encourages
you to use the Supabase MCP and the
Vercel MCP; if you want to use different
services, that's completely up to your
heart's desire. And then it asks you for a Gemini
API key. And lastly, it asks you if
you're interested in video, if you want
to go to Atlas Cloud since it has a much
cheaper video model. It's not as good as
Gemini Veo, but infinitely cheaper. You
can swap in whatever you want. And then
this goes through, again, a series of
specs for all the Supabase stuff that
needs to be built, then the implementation
order. Again, if you are a developer and
you understand what you're looking for,
this is where you'd want to either
ignore this, remove this, swap it in for
what you think would be useful, then the
rest of this gives the rest of the
foundation it needs to do what it needs
to do. And most importantly, it gives it
this checklist. Because the context
window of Claude Code maxes out so often,
especially when you're using an MCP
where it's taking tons of tokens to
write tables, get feedback, and search,
it's good to have this checklist where
it's not just a checklist for
checklist's sake. You make it literally
check off everything it finishes. And
this is helpful because even as you
compact conversations, if it's checked
off phase one and a part of phase three,
like up until here, you'll be able to
tell it go and refer back to the
implementation guide and pick up where
you left off. And then it should see
that these are the remaining ones for
phase 2 and then go from there. So now
that we've seen the demo, we have a
decent understanding high level of how
everything works. We've seen the project
requirements documents that you'd need
to go on this journey. What are some
good best practices for prompting? Once
again, I got you covered. So all you'd
have to do is theoretically take those
three files, the implementation guide,
the vision document, and the other one.
Take those and put them into a blank-slate,
brand-new Claude Code folder. And you
could use this in whatever IDE you choose:
you could use Cursor, you could use
Antigravity. And then you would run
/init. /init will just push Claude to
read those specs and those guides and
create a summary for itself called
CLAUDE.md. Once we have CLAUDE.md, then
you can run a prompt similar to this,
where you say: execute the implementation
plan in CLAUDE.md from start to finish;
you have full autonomy to set up Supabase
with the Supabase MCP. Now, I tell it
this because Claude will usually warn
you. If you're already paying for
Supabase, then it will say this will
cost you 10 bucks. If you're a brand-new
user and it's one of your first couple of
databases, I believe it is free. They
have a decently generous free tier. And
then it tells it to create the front-end
scaffolding. Basically, build the
framing of the house before you put in
the furniture and build everything else.
Build the foundation, implement the core
features, test by interacting with
localhost. One thing you can also say is:
now that we have an updated version of
Claude Code where it can use Claude in
Chrome, meaning it uses this extension
you'll see right here to interact with
localhost, you can make it go and check
its own work. So essentially, you let it run
for a more autonomous period of time to
go through look at your specs, compare
what it sees on screen, and this will
save you a lot of back and forth. And
one very important thing that most
people don't care about, especially if
you're not a dev, is tell it to commit
progress incrementally. Ideally, it
shouldn't take you too much time: install
Git, make a GitHub account, install the
GitHub app on Claude so that you can
create a project, and incrementally keep
committing things as you progress
because things will break. And sometimes
you'll get to a point where 80% of the
stuff you want is there, but it's been
built wrong. So, the last 20% isn't even
possible. So the more that you have
checkpoints, the higher the likelihood
that you'll be able to rewind, go back
in time, and then build the right way.
For the database and backend, you can
say design and implement the database
schema for insert X feature. And these
are the requirements. Create tables with
proper foreign keys and indexes. If you
don't know what these words mean in
plain English, a foreign key allows you
to create relationships between
different tables. Because if you have
one table for the notebook and one table
for the audio created for that notebook,
ideally there should be some relationship
between them, so that if there's an API
call you're trying to make that needs to
marry them in some way, it's actually
possible.
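To make the foreign key idea concrete, here's a small example schema, shown as a SQL migration string you might run in the Supabase SQL editor or hand to the Supabase MCP. The table and column names are hypothetical.

```python
# A concrete example of the foreign key idea: the audio table points back at the
# notebook it belongs to. Table/column names are hypothetical; you'd run this SQL
# via the Supabase SQL editor or let the Supabase MCP create it for you.
SCHEMA_SQL = """
create table notebooks (
    id uuid primary key default gen_random_uuid(),
    name text not null,
    created_at timestamptz default now()
);

create table audio_overviews (
    id uuid primary key default gen_random_uuid(),
    notebook_id uuid not null references notebooks (id) on delete cascade,  -- the foreign key
    style text,           -- e.g. 'brief', 'deep dive', 'debate'
    storage_path text,    -- where the MP3 lives in Supabase storage
    created_at timestamptz default now()
);

-- The index makes "give me all audio for this notebook" fast.
create index on audio_overviews (notebook_id);
"""
```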
The rest of these are slightly more
technical, but again, this will all be
available to you, so you can read
through it after. And then next are
feature requests. So let's say you want
to add newer features and the ones that
I've added myself. You could say
implement feature name with the
following requirements and you could
write a user story. So this is where you
can take on the hat as a product person
and say as a user I want to do X action
so that you get Y benefit. Again the
more human context you can give the
higher the likelihood they will do a
good job. Another thing: if you can, open
brand-new sessions for each feature. Like
I said before, ideally each session
should be one core feature of the app.
You shouldn't set up authentication and
then go and make the UI pretty and then
go and add audio mode all in the same
chat. The context will bleed, things
will mix, things won't work, or things
will take infinitely longer than they
have to. Now, UI-wise, you can bring in
Gemini as a second set of eyes because
it's better at UI. For me, I just brute
forced Claude Code to become better. I
would still say it's not beautiful. Uh,
you could continue and make it better,
but you can say this component feels
cluttered. Simplify the layout. The
loading states are jarring. And if you
don't know what a loading state is,
pretty much when you go through the
different parts of the app, notice here
how it's loading fairly quickly. If I go
to settings and I go to API
documentation, it is almost
instantaneous. When I load a brand new
source, it is pretty quick. It wasn't
quick for most of this build. It was
actually very slow. And sometimes it
would show different parts like it's
called a skeleton. It shows the skeleton
of the page while it loads because it's
taking so much time. So if you choose to
embark on building something like this,
then once you have the 80/20, you want to
start really focusing on the 20, which
is why is this taking 10 seconds to
load? Why is this showing weird
components on the page while it loads?
It's probably indicative of something
bigger or a bigger problem in the app
itself. Now, when it comes to quality
control, you can spin up a separate
session that goes and looks through the
codebase and make sure that the
important things are working that all
the API endpoints return real data.
Sometimes Claude will do this really
naughty thing and create something with
fake data. So, it will say, "Oh, I'm
done the feature." But it's actually
lying to your face. It just put fake
data in that place. And that happened
with video where it said, "Okay, I know
how to use the video API. I just
generated a sample. Go and take a look."
And then I would go write a query,
generate the video, wait 3 minutes for
absolutely no reason. It basically ran a
simulation of how long it could take to
create said video and then created a
template that I physically couldn't
play. Which is why, again, the less you
take on at once, the better. Now, when
it comes to bug fixes, some people are
amazingly lazy about giving the right
instructions. Because if you just say the
app is slow, Claude has free rein to delete
things or remove things that could speed
it up at the cost of functionality,
security, etc. Ideally, you want to say
the feature is not working as expected.
Current behavior: this is what happens.
Expected behavior: this is what should
happen. Steps to reproduce: step one,
step two. Investigate the root cause
before implementing a fix. Then you can
say ideally report back to me what you
plan on doing and which files might be
affected by you doing that thing. One
last thing I'll give you is the power of
exploration. So let's say you're
building the app and for whatever reason
maybe it's NotebookLM, maybe there's
another product where you love a
particular feature. One cool thing I like
to do is log into that service, give
Claude permission to use Claude in
Chrome with that extension to go and take
over my browser, go and
interact with that app and learn about
the user journey or the service that I
really like or the component I really
like and try to recreate a version that
we can marry and merge into our existing
app. So the structure would be: analyze
competitor reference (ideally give it the
domain and say you're logged in), and
then focus areas: focus on the user flow
from start to end, how they handle a
specific feature, UI patterns worth
adopting, and features that are missing.
Sometimes one of the best ways to get
Claude Code to be better at UI is just to
tell it: go and look at this website, go
and navigate it, and look at how seamless
it is, how buttery it is. Look at all the
different colors that are easy on the
eyes, that aren't as boring and purpley
as what you do by default. So
with that, you now have an understanding
of the prompts, of the process, and more
importantly of the ultimate goal. So we
live in a world right now, which is why
I'm harping so much on Claude Code this
year, where you can build 80 to 90% of
whatever you want. It's just a matter of
you putting in the work and understanding
all the pieces involved in creating this
beautiful monstrosity of an app that you
can run locally, and you want to build it
incrementally. So you want to focus on
building locally first and then once you
build locally, you graduate to thinking
about what if someone else wanted to use
this, if you want someone else to use
this. A good example of this is the API
endpoints. So when you go to the
settings here and you want to create the
API request builder, by default Claude
will make it localhost, so you can only
run it locally on your computer. If you
have microservices that you want to
interact with, you have to realize what
you want as the end state and tell
Claude: you know what, I want to deploy
all these endpoints on a server so other
people can use it, and I can use things
like n8n or Make to access it remotely.
So, like I said,
I'll make all the resources I just
showed you in this video available in
the second link in the description
below. Now, if you're feeling
particularly lazy and you just want the
outcome without all the hard work, then
naturally, I do make all of these repos
available to my exclusive community
members in my early AI adopters
community. So, if that interests you,
then check out the first link in the
description below. And lastly, and most
importantly to me, if you found this
video helpful, if you like this level of
depth, this probably took me 40 to 50
hours to build, plan, design it, and
create the story so I could try to
educate as best as I can all the
different processes that you'd need to
build something like this. So, if this
did help you, I would be infinitely
grateful if you left a comment on the
video or shared it with someone because
it helps the video, helps the channel,
and gives me the courage to take on
bigger builds like this and show them to
you. And with that, I'll see you in the next one.
Join My Community to Level Up: https://www.skool.com/earlyaidopters/about
Grab the Complete Build Kit: https://bit.ly/45UylIq
Book a Meeting with Our Team: https://bit.ly/3Ml5AKW
Visit Our Website: https://bit.ly/4cD9jhG

Core Video Description
What if you could build your own NotebookLM - but actually own it? In this build walkthrough, I rebuild Google's NotebookLM from scratch using Claude Code and make it genuinely useful for power users. You'll see exactly how to create a RAG-powered research platform with document chat, multi-person podcast generation, video explainers, deep research mode, and study materials like flashcards and quizzes. The key difference from Google's version: everything is API-first. The web UI is just another client - you can hit your own NotebookLM from n8n, Zapier, or any custom app through 50 REST endpoints. I walk through the complete tech stack (Supabase for database/auth/storage, Vercel for hosting, Gemini API for RAG and TTS, FastAPI backend, Next.js frontend) and show you the exact PRD documents and architecture diagrams used to build it. If you want to learn how to ship production apps with Claude Code, this is the blueprint.

TIMESTAMPS:
00:00 - Intro: Why I rebuilt NotebookLM from scratch
01:15 - The problem with Google's NotebookLM (no API, no ownership)
03:00 - Architecture overview: API-first design philosophy
05:00 - Tech stack breakdown: Supabase, Vercel, Gemini, FastAPI, Next.js
08:00 - Setting up the Supabase database schema
11:00 - RAG implementation: Document uploads and chunking
14:30 - Chat interface: Querying your documents with context
18:00 - Podcast generation: Multi-person audio with debate mode
22:00 - Video explainers with Alibaba's Wan 2.5 (10x cheaper than Veo)
26:00 - Deep research mode using Gemini's research API
29:30 - Study materials: Flashcards, quizzes, and study guides
33:00 - The 50 API endpoints explained
36:00 - Connecting to n8n and Zapier
38:30 - What's in the build kit: PRDs, diagrams, prompt templates
41:00 - Wrap up: Build it, own it, forever

#NotebookLM #ClaudeCode #AI #Supabase #Vercel #RAG #AIAutomation #NoCode #BuildInPublic #GeminiAPI #FastAPI #NextJS #AITools #Anthropic #SelfHosted