Thank you for the introduction.
Um, ideally it was not going to be me talking today. I'm here as a replacement for Ana; I don't want to steal her thunder. These are her slides. She's still here with us.
>> She's alive.
>> Yes, yes, she is.
So the idea for today is to think about what kind of challenges we face when doing what we traditionally call knowledge work, and how we can use AI to solve those challenges.
And to do that, I'm going to take this example probably to its extreme and think about an invoice.
We all use invoices in our daily lives, most often when we buy a coffee from our local coffee shop, and we might then proceed to throw them away. But for a lot of business applications, keeping track of invoices is quite important.
And one use case that we see again and again is the problem of invoice reconciliation. That is: I have a contract that says this company is going to provide these services over, let's say, a year, and every month they're going to send me an invoice. Now, most of the time there are humans involved in creating invoices, so an invoice might not be exactly what I expect from my contract, and I want to be able to check those invoices to make sure that they are what we agreed upon. This is one example that we see a lot in the industry, where we have lots of providers, lots of contracts, lots of invoices, and we want to make sure that all these invoices make sense. So how do we approach a problem like this with AI?
And so it's not trivial that I can just throw in the contract, throw in the image and say, is this correct? We would love that to be the case, but there are some nuances that make this process a little bit harder.
So let me run through an example of how this may work. Here I have uploaded a couple of contracts and a couple of invoices. One of those contracts might specify the price or the delivery date for particular items, but they don't match the invoice, and I want to be able to very quickly highlight what's going on. So that's sort of what it looks like to do invoice reconciliation. And if you're interested in looking at the demo more closely, we are having a demo later today, so do come to the demo.
So what makes this problem slightly harder than it seems? Why can't I just throw my contract and my invoice at ChatGPT and ask, is this fine? It turns out that most LLMs are text-based. We are now getting multimodal LLMs, but PDFs are very complex structures made for human understanding, where layouts, images, and plots are very easy for a human to understand but not as easy for an LLM.
And so there's a lot of work that needs to be done to extract all this information from these documents before they can be fed to an LLM. And that's a crucial part of getting these applications right: making sure that whatever you're feeding your large language model is exactly what it needs to solve the problem. And that's where LlamaIndex comes in.
So how might we solve this problem? This is a very rough approximation; I'm going to show a more complete example in a second. The idea is that we might first upload a contract, and then that contract is parsed. I do this every day, so I'm very familiar with it, but the idea is that we have a PDF and we're going to extract text from it, and we often end up with a markdown representation, which is basically plain text, of that contract. That's what I mean by parsing. For this particular application I might need to find specific clauses in that contract, and contracts might be very large, so I need to look up different parts of the text. This can be achieved, for example, using RAG. And so for these particular applications, once we parse the contract, once we have this text representation of it, we are going to throw it into an index, into a vector database. We have indexes in LlamaCloud for this. With that, we have the first step of this process ready, which is processing the contract.
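As a rough illustration of this first step, a minimal sketch of parsing a contract and putting it into a vector index might look like the following. It assumes API keys for LlamaCloud and an embedding provider are configured, and "contract.pdf" is just a placeholder file name; it's an illustrative sketch, not the exact code behind the demo.

```python
from llama_parse import LlamaParse
from llama_index.core import VectorStoreIndex

# Parse the PDF into a markdown (plain-text) representation of the contract.
parser = LlamaParse(result_type="markdown")
documents = parser.load_data("contract.pdf")  # placeholder file name

# Throw the parsed contract into a vector index so specific clauses
# can be looked up later with RAG.
index = VectorStoreIndex.from_documents(documents)

# Example lookup of a clause that matters for reconciliation.
query_engine = index.as_query_engine()
print(query_engine.query("What unit price and delivery date were agreed for item X?"))
```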
And again, contracts that look like this might have an address somewhere at the top, they might have a table. Again, these are complex structures that are easy for humans, but not necessarily so much for LLMs. And this is what a parsed contract might look like. You can see the table there at the end, and you can see how nicely it comes out.
Then, once we have the contract, the next step is the invoice. What might I do with the invoice? The first step is always parsing again: we need to get all this information as text so that we can do more interesting things with it. But for an invoice we actually need a lot more detail. We need to know, for example, the name of the company that issued the invoice. We might need to get all the line items and exactly what the amount was for each line item.
So for this we have a product, Extract, that I'm going to mention in a second, which allows us to basically solve this whole step at once.
So what is LlamaIndex and what do we do? We provide, at the very core level, the building blocks that are necessary for doing knowledge work with AI: parsing, extraction, and classification of documents. All of these are essential tools that you will need eventually if you're building applications that involve documents.
You might know us better for our open-source framework that builds upon these tools to build complex AI applications and AI agents. And we are starting to release agent templates that allow you to very quickly prototype complex agentic applications that use documents as their main input and transform them into whatever you need to build, for example invoice reconciliation. So the idea is that, for the example that I showed, you should be able to get started with one of the templates and in a very few minutes get up and running with a proper agentic application that ingests real documents.
So, about the core building blocks: I think that LlamaParse is really something special. It's the building block that allows you to go from PDF to text, and there is a lot of optimization behind the scenes to make sure that we understand layouts, that we understand images, that we understand tables, that we understand charts. And so, regardless of how complex your document is, there's a good chance that you're going to get much better results by parsing it first and then using an LLM.
Now, if what you need is structured output (for example, for an invoice I need line items: what was charged, how much it was, when it was), then I usually want JSON output for my task. And here is where Extract shines. Extract allows you to specify a schema, the things that I want to get; it ingests your document, parses it behind the scenes, and then gives you the structured output that is essentially what you want.
And so here you can see we have a filing from Nvidia. I might only care about certain specific fields, and so I can just specify that as a schema and I get those values only.
And it's very easy to get started. You can build your own schemas, but we have a few that are prefilled for you to use. We also have a mode that automatically suggests a schema for you. So again, it's very easy to get started.
If you're a developer, you can also go the programming route, and you can do this by specifying a Pydantic schema here. The key is that these description fields here are what the LLM will use behind the scenes to understand what's going on. So they're really, really important, and if you skimp on them, results might not be as good. But again, it's just a couple of lines to get started. Also, if you go the code route, we are going to share the slides after the talk, but there are a couple of QR codes here. This one takes you to a tutorial on how to use Extract for these filings.
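For reference, a sketch of what the code route could look like with the LlamaExtract client and a Pydantic schema, where the field descriptions are what guide the LLM. The schema fields and file name here are invented for illustration, and the exact client calls may differ slightly from the current SDK.

```python
from pydantic import BaseModel, Field
from llama_cloud_services import LlamaExtract

# Hypothetical invoice schema; the descriptions tell the LLM what to look for.
class LineItem(BaseModel):
    description: str = Field(description="What was charged on this line")
    amount: float = Field(description="Amount charged for this line item")

class Invoice(BaseModel):
    company_name: str = Field(description="Name of the company issuing the invoice")
    invoice_date: str = Field(description="Date the invoice was issued")
    line_items: list[LineItem] = Field(description="All line items on the invoice")

extractor = LlamaExtract()
agent = extractor.create_agent(name="invoice-extractor", data_schema=Invoice)
result = agent.extract("invoice.pdf")  # placeholder file name
print(result.data)  # structured output matching the Invoice schema
```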
Now, if you want to build AI applications, agentic applications, at some point you might need to start putting all these building blocks together. In the ideal world, we just write agents, or function agents, where we specify a list of tools, and the agent automatically knows what tool to call and in what order to get the right output, and that works great.
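As a sketch of that ideal case, here is roughly what a function agent with a list of tools can look like. The tool names and bodies are hypothetical stand-ins; a real reconciler would wire them to Parse, Extract, and the contract index.

```python
import asyncio
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI

# Hypothetical tools; real ones would call Parse/Extract and query the contract index.
def lookup_contract_terms(provider: str) -> str:
    """Return the agreed terms for a provider from the indexed contract."""
    return "Unit price 10 USD, delivery within 30 days."

def get_invoice_fields(path: str) -> str:
    """Return the extracted fields of an invoice as JSON."""
    return '{"company_name": "ACME", "total": 120.0}'

agent = FunctionAgent(
    tools=[lookup_contract_terms, get_invoice_fields],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You reconcile invoices against contracts.",
)

async def main():
    # The agent decides which tool to call and in what order.
    response = await agent.run("Does the ACME invoice match the ACME contract?")
    print(response)

asyncio.run(main())
```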
In practice, we sometimes have specific logic that we want to make sure is executed in the right order, and some of those decisions will be agentic. We might have agents that solve particular tasks, but we want a specific flow for our application. And here is where agent workflows shine. One of our core open-source contributions is the idea of agent workflows, which allow you to build any application by specifying different steps and how those steps connect. The idea is that this is event-driven and built for agents; most of our agent abstractions are built on top of this, and it's a great way to build agentic applications. This is what the simplest workflow might look like. We can define steps; these steps just print and do nothing else. Steps then communicate by emitting events: the first step emits an event, step two expects that event, and when it sees it fire from the first step, it will execute. This way you can build a very distributed application by specifying when each thing should run.
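A minimal sketch of such a two-step workflow, using the open-source workflow primitives; the step and event names are made up for illustration.

```python
from llama_index.core.workflow import Workflow, step, StartEvent, StopEvent, Event

# The event that step one emits and step two waits for.
class FirstStepDone(Event):
    pass

class SimplestWorkflow(Workflow):
    @step
    async def step_one(self, ev: StartEvent) -> FirstStepDone:
        print("step one")       # this step just prints and does nothing else
        return FirstStepDone()  # emitting this event is what triggers step_two

    @step
    async def step_two(self, ev: FirstStepDone) -> StopEvent:
        print("step two")       # runs only once FirstStepDone has fired
        return StopEvent(result="done")

# result = await SimplestWorkflow().run()
```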
This is really useful when you're processing documents, where one task might take a minute, for example, and another task needs to fire only when it is done. The workflow manages all that for you without blocking.
We also have tools to visualize these workflows, so if you run this workflow, you might get a visualization such as this one.
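For the visualization, there is a small utility in the llama-index-utils-workflow package that renders the steps and events of a workflow class to an HTML diagram; a sketch, continuing the example above:

```python
from llama_index.utils.workflow import draw_all_possible_flows

# SimplestWorkflow is the class from the sketch above; the filename is arbitrary.
draw_all_possible_flows(SimplestWorkflow, filename="simplest_workflow.html")
```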
Back to our invoice reconciler. If I
were to implement this as a workflow,
this is what this workflow might look
like. I have a step that uploads a file.
I might have a classification step, where a classification model decides if this is an invoice or a contract. If it's an invoice, I might emit an invoice event and start the part of the workflow that deals with invoices: I'm going to parse the invoice, I'm going to extract it, and then I'm going to go back to the contract and reconcile it. If I have a contract, I'm going to get the name from the contract, index it, and emit an event saying that contract was indexed.
And when I'm reconciling an invoice here, basically what I have is the extracted invoice and the extracted contract, and I'm going to use a structured-output LLM. This is not necessarily an agent. I say: this is what my output should look like; from the parsed invoice and the parsed contract, try to find if there are any discrepancies. And this step could be as complex as you want it to be.
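A sketch of how that reconcile step could use a structured-output LLM call rather than an agent. The output schema, the prompt, and the contract_text / invoice_fields variables are placeholders for whatever the earlier steps produced, not the exact ones in the template.

```python
from pydantic import BaseModel, Field
from llama_index.core.prompts import PromptTemplate
from llama_index.llms.openai import OpenAI

# Hypothetical output schema for the reconciliation step.
class ReconciliationReport(BaseModel):
    matches_contract: bool = Field(description="Whether the invoice matches the contract terms")
    discrepancies: list[str] = Field(description="One entry per mismatch found")

RECONCILE_PROMPT = PromptTemplate(
    "Contract terms:\n{contract}\n\n"
    "Extracted invoice:\n{invoice}\n\n"
    "Compare the invoice against the contract and report any discrepancies."
)

contract_text = "..."   # placeholder: text retrieved from the contract index
invoice_fields = "..."  # placeholder: structured fields extracted from the invoice

llm = OpenAI(model="gpt-4o")
report = llm.structured_predict(
    ReconciliationReport, RECONCILE_PROMPT,
    contract=contract_text, invoice=invoice_fields,
)
print(report.discrepancies)
```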
We are also building tools to make it as easy as possible to deploy agents. So if you build an agentic application with a workflow, we want to make sure that you can deploy it very fast. In this case we developed a tool, llamactl, which is the one that has all these agentic templates. So if you type llamactl init, it will very quickly allow you to generate one of these templates, and then you can deploy it or run it locally, but you also get all the code. So if you actually want to make any changes to the application, change the schema, change the UI, everything is there for you to customize.
We are going to be doing a demo of exactly this, how it works and how you can get started deploying an agent, at the demo later. I think it's at 5:30, and the demo booth is over there.
Finally, we wrote about how to use LlamaIndex with Weights & Biases. You can find the link to that blog post here if you're interested.
I say "we", it was her, but again, I'm giving her talk, so I get to claim this too. If you're interested in when to use Parse, when to use Extract, what OCR is, and why we might need parsing and not just OCR on a document, these are two interesting blog posts that will be linked with the talk. If you're interested in getting started with agents, this is your main QR code; it will take you to a demo of how to properly build these applications.
And this is it. As I mentioned, we have our demo and we have swag. So if you're interested in one of these very cute hats, or have any questions about LlamaAgents, feel free to approach me or Mutasa over there; just look for the hats, and we're happy to exchange one of these hats for your time.
All right.
Thank you.
In this session from Fully Connected London, Diego Kiedanski, Founding AI Engineer at LlamaIndex, covers how common knowledge work is being automated with the latest AI technology. Most human knowledge remains locked in complex documents and file types like PDFs, tables, and content with irregular layouts that still hold valuable context. Diego explores parsing and extracting technologies that work alongside AI agents to make truly automated knowledge work a reality. He also introduces LlamaAgents, a new framework that allows you to serve and deploy these assistants at scale.