Hello everyone, thank you for joining my
talk. My name is Maximilian Schattauer. I
work at Perilin in Munich as a technical
consultant, and today I would like to
present a short summary of investigations
we did at our clients with regard to
semantic retrievers, or rather what lies
beyond them: namely, how we can leverage
not only embedding models and keyword-
based methods, but also LLMs that we
prompt, in order to get more satisfying
retrieval results.
Let's first look at the challenges of
conventional retrieval setups. What
usually happens is that a user of a
retrieval system, in order to get a set
of documents out of a corpus, forwards a
query to a retriever. The retriever then
goes through the corpus with the help of
some kind of search or embedding model
and retrieves the documents that fit the
query best. In the case of an embedding
model, "best" means semantic relatedness;
with keyword-based models like BM25, it
means the congruence of keywords between
the query and the documents. The user
then ends up with a set of retrieved
documents of whatever size they like, and
they can either feed it to an LLM or,
and this is the important case for us
here, not use it with an LLM at all, but
for example present it to the user or
feed it into some downstream data
application.
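As a minimal sketch of such a conventional retriever, assuming a placeholder embed function standing in for any embedding model (all names here are illustrative, not from the talk):

```python
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    """Placeholder for any embedding model (e.g. a sentence-transformers
    model): returns one vector per input text. Illustrative only."""
    raise NotImplementedError

def retrieve(query: str, corpus: list[str], k: int = 10) -> list[str]:
    # Embed the query and every document in the corpus.
    doc_vecs = embed(corpus)
    query_vec = embed([query])[0]
    # Cosine similarity between the query and each document,
    # i.e. the "semantic relatedness" criterion described above.
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    # Return the k documents that fit the query best.
    top = np.argsort(sims)[::-1][:k]
    return [corpus[i] for i in top]
```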
The challenges that come with this setup
are that you always have to take the
entire query into account during the
retrieval process: you cannot focus on
certain aspects or put extra emphasis on
some part of it. It is always a black box
that you cannot really control in this
conventional retriever setup.
Also, there is no reasoning process
behind it, which means that the order of
the documents, or which one the whole
setup favors most, cannot be influenced
by reasoning. This might be of interest
if you have a very specialized domain
where either semantic models or keyword-
based models break down because the
material is too similar for them. You
would then need a reasoning process to
find the minute differences between the
documents and use those as the basis for
reordering.
We at Perilin were recently confronted
with exactly such a case at one of our
clients, and we saw in the quantitative
quality of the retrieval results that we
could not make do with this kind of
setup: we would need an LLM in the
retrieval process to make it more fine-
grained and to gain more fine-grained
control.
So what we did was use the setup I just
presented, feeding a query to a retriever
equipped with a corpus and an embedding
model, but only to obtain preliminary
documents. With the classic retrieval
process you get a preliminary set of
documents, but you then leverage an LLM
to refine the set and its order, and only
then end up with the retrieved documents
you are actually looking for.
The nice thing about this is that it
offers an additional text-input dimension
in which you can refine aspects: at this
stage you can add metadata to your
documents that you want to be considered
during the re-ranking process.
This then enables us to use reasoning:
you can use either a normal LLM or even a
reasoning LLM to reorder your documents,
and you can inject other aspects, such as
metadata rules, that could not be
captured by a classic non-LLM retriever
model.
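A hedged sketch of this two-stage setup, reusing the illustrative retrieve function from above and deferring the LLM step to a rerank function sketched further below:

```python
def retrieve_with_llm_reranking(query: str, corpus: list[str], llm,
                                prelim_k: int = 20, final_k: int = 5) -> list[str]:
    # Stage 1: a classic retriever produces a preliminary candidate set.
    preliminary = retrieve(query, corpus, k=prelim_k)
    # Stage 2: an LLM refines the set and its order (see rerank below).
    reranked = rerank(query, preliminary, llm)
    return reranked[:final_k]
```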
Now, looking into the details of how we
actually did this: we did not use the LLM
as a mere one-off re-ranker, putting all
the documents into one prompt and having
it select from them. Instead, we used the
LLM as a binary document comparator.
What we did is put two documents and the
query into a prompt and ask the LLM: this
is the domain; look at the following two
documents and at the query; which one do
you find more suitable for answering this
query? Please give an answer, either A or
B (or left or right, or up or down,
whichever labels you choose).
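A sketch of such a comparison prompt and a single pairwise call; the llm callable and the exact prompt wording are assumptions for illustration, not the exact prompt from the project:

```python
COMPARE_PROMPT = """You are an expert in {domain}.

Query: {query}

Document A:
{doc_a}

Document B:
{doc_b}

Which document is more suitable for answering the query?
Answer with exactly one letter: A or B."""

def compare(llm, domain: str, query: str, doc_a: str, doc_b: str) -> str:
    # One pairwise verdict. A non-zero temperature means repeated calls
    # can disagree, which is what later lets us collect statistics.
    prompt = COMPARE_PROMPT.format(domain=domain, query=query,
                                   doc_a=doc_a, doc_b=doc_b)
    answer = llm(prompt, temperature=0.7).strip().upper()
    return "A" if answer.startswith("A") else "B"
```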
What we end up with, by doing this for
all combinations of documents, is a
matrix that compares all the documents by
verdicts of the LLM. You can then see,
for example, that document two is more
suitable than document one, while
document three is less suitable than
document one. With this you can establish
an order of the documents in the end.
Especially if you invoke the LLM multiple
times at a non-zero temperature, you get
some statistics to ground this ordering
on: you can simply count the verdicts and
order the documents by them.
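Putting this together, a sketch of the rerank function referenced earlier: it compares every pair of documents several times, counts the verdicts, and orders by win count (parameter values are illustrative):

```python
from itertools import combinations

def rerank(query: str, docs: list[str], llm,
           domain: str = "your domain", samples: int = 3) -> list[str]:
    # wins[i] counts how often document i was preferred across all
    # pairwise comparisons and repetitions.
    wins = [0] * len(docs)
    for i, j in combinations(range(len(docs)), 2):
        for _ in range(samples):
            verdict = compare(llm, domain, query, docs[i], docs[j])
            if verdict == "A":
                wins[i] += 1
            else:
                wins[j] += 1
    # Order documents by the number of verdicts in their favor.
    order = sorted(range(len(docs)), key=lambda i: wins[i], reverse=True)
    return [docs[i] for i in order]
```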
We found in our project research that
this is more reliable than a one-off
invocation, especially when dealing with
many preliminary documents, because there
you potentially run into the needle-in-a-
haystack problem and documents get lost
in the LLM's processing. This way you
have more control over the re-ranking
process.
But, and this has to be considered as
well, runtime and cost are an issue here.
This takes far longer than a one-off LLM
comparison, let alone a classic retriever
setup alone. And the incurred costs are
of course also higher than if you just
use an embedding model or a keyword-based
model, because you have to make many LLM
calls, on the order of the square of the
number of documents.
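To make that concrete with purely illustrative numbers: n preliminary documents give n(n - 1)/2 pairs, so 20 documents with 3 repetitions per pair already mean 20 · 19 / 2 · 3 = 570 LLM calls, versus a single call for a one-off re-ranking prompt.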
So, what is the take-home message from
this anecdotal talk about one of our
projects? We have seen that conventional
retriever setups do have difficulties,
especially with specialized corpora and
retrieval tasks with special needs, such
as certain aspects or focuses to pay
attention to. In those cases we recommend
LLMs as re-rankers in the binary
comparison setup I just explained,
because we have seen that by comparing
just two documents at a time, the results
are quantitatively better than with
one-off LLM re-ranking setups. Thank you
for listening, and I am looking forward
to the other talks.
Read the abstract ➤ https://www.conf42.com/Prompt_Engineering_2025_Maximilian_Schattauer_retrievers_documents_ranking
Other sessions at this event ➤ https://www.conf42.com/prompt2025
Join Discord ➤ https://discord.gg/yQneDJdJGV

Chapters
00:00 Introduction and Speaker Background
00:30 Challenges of Conventional Retrieval Setups
02:54 Introducing LLMs for Enhanced Retrieval
04:09 Detailed Process of Using LLMs
05:26 Benefits and Considerations of LLM-based Retrieval
06:14 Conclusion and Takeaways