Hello everyone, thank you for joining my
talk. My name is Maximilian Schattauer. I
work at Perilin in Munich as a technical
consultant, and today I would like to
present a short summary of investigations
we did at our clients with regard to
semantic retrievers, or rather what lies
beyond them: namely, how we can leverage
not only embedding models and keyword-
based methods, but also LLMs that we
prompt, in order to get more satisfying
retrieval results.
Let's first look at the challenges of
conventional retrieval setups. What
usually happens is that a user of a
retrieval system, in order to get a set
of documents out of a corpus, forwards a
query to a retriever. The retriever then
goes through the corpus with the help of
some kind of search or embedding model
and retrieves the documents that fit the
query best. In the case of an embedding
model, "best" means semantic relatedness;
with keyword-based models like BM25, it
means the congruence of keywords between
the query and the documents. The user
then ends up with a set of retrieved
documents of whatever size they like, and
they can either feed it to an LLM or,
and this is the important case for us
here, not use it with an LLM at all, but
for example present it to the user or
feed it into some downstream data
application.
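As a minimal sketch of such a conventional retriever, assuming a placeholder embed function standing in for any embedding model (all names here are illustrative, not from the talk):

```python
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    """Placeholder for any embedding model (e.g. a sentence-transformers
    model): returns one vector per input text. Illustrative only."""
    raise NotImplementedError

def retrieve(query: str, corpus: list[str], k: int = 10) -> list[str]:
    # Embed the query and every document in the corpus.
    doc_vecs = embed(corpus)
    query_vec = embed([query])[0]
    # Cosine similarity between the query and each document,
    # i.e. the "semantic relatedness" criterion described above.
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    # Return the k documents that fit the query best.
    top = np.argsort(sims)[::-1][:k]
    return [corpus[i] for i in top]
```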
The challenges that come with this setup
are that you always have to take the
entire query into account during the
retrieval process: you cannot focus on
certain aspects or put extra emphasis on
some part of it. It is always a black box
that you cannot really control in this
conventional retriever setup.
Also, there is no reasoning process
behind it, which means that the order of
the documents, or which one the whole
setup favors most, cannot be influenced
by reasoning. This might be of interest
if you have a very specialized domain
where either semantic models or keyword-
based models break down because the
material is too similar for them. You
would then need a reasoning process to
find the minute differences between the
documents and use those as the basis for
reordering.
We at Perilin were recently confronted
with exactly such a case at one of our
clients, and we saw in the quantitative
quality of the retrieval results that we
could not make do with this kind of
setup: we would need an LLM in the
retrieval process to make it more fine-
grained and to gain more fine-grained
control.
So what we did was use the setup I just
presented, feeding a query to a retriever
equipped with a corpus and an embedding
model, but only to obtain preliminary
documents. With the classic retrieval
process you get a preliminary set of
documents, but you then leverage an LLM
to refine the set and its order, and only
then end up with the retrieved documents
you are actually looking for.
The nice thing about this is that it
offers an additional text-input dimension
in which you can refine aspects: at this
stage you can add metadata to your
documents that you want to be considered
during the re-ranking process.
This then enables us to use reasoning:
you can use either a normal LLM or even a
reasoning LLM to reorder your documents,
and you can inject other aspects, such as
metadata rules, that could not be
captured by a classic non-LLM retriever
model.
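A hedged sketch of this two-stage setup, reusing the illustrative retrieve function from above and deferring the LLM step to a rerank function sketched further below:

```python
def retrieve_with_llm_reranking(query: str, corpus: list[str], llm,
                                prelim_k: int = 20, final_k: int = 5) -> list[str]:
    # Stage 1: a classic retriever produces a preliminary candidate set.
    preliminary = retrieve(query, corpus, k=prelim_k)
    # Stage 2: an LLM refines the set and its order (see rerank below).
    reranked = rerank(query, preliminary, llm)
    return reranked[:final_k]
```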
Now, looking into the details of how we
actually did this: we did not use the LLM
as a mere one-off re-ranker, putting all
the documents into one prompt and having
it select from them. Instead, we used the
LLM as a binary document comparator.
What we did is put two documents and the
query into a prompt and ask the LLM: this
is the domain; look at the following two
documents and at the query; which one do
you find more suitable for answering this
query? Please give an answer, either A or
B (or left or right, or up or down,
whichever labels you choose).
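A sketch of such a comparison prompt and a single pairwise call; the llm callable and the exact prompt wording are assumptions for illustration, not the exact prompt from the project:

```python
COMPARE_PROMPT = """You are an expert in {domain}.

Query: {query}

Document A:
{doc_a}

Document B:
{doc_b}

Which document is more suitable for answering the query?
Answer with exactly one letter: A or B."""

def compare(llm, domain: str, query: str, doc_a: str, doc_b: str) -> str:
    # One pairwise verdict. A non-zero temperature means repeated calls
    # can disagree, which is what later lets us collect statistics.
    prompt = COMPARE_PROMPT.format(domain=domain, query=query,
                                   doc_a=doc_a, doc_b=doc_b)
    answer = llm(prompt, temperature=0.7).strip().upper()
    return "A" if answer.startswith("A") else "B"
```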
What we end up with, by doing this for
all combinations of documents, is a
matrix that compares all the documents by
verdicts of the LLM. You can then see,
for example, that document two is more
suitable than document one, while
document three is less suitable than
document one. With this you can establish
an order of the documents in the end.
Especially if you invoke the LLM multiple
times at a non-zero temperature, you get
some statistics to ground this ordering
on: you can simply count the verdicts and
order the documents by them.
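Putting this together, a sketch of the rerank function referenced earlier: it compares every pair of documents several times, counts the verdicts, and orders by win count (parameter values are illustrative):

```python
from itertools import combinations

def rerank(query: str, docs: list[str], llm,
           domain: str = "your domain", samples: int = 3) -> list[str]:
    # wins[i] counts how often document i was preferred across all
    # pairwise comparisons and repetitions.
    wins = [0] * len(docs)
    for i, j in combinations(range(len(docs)), 2):
        for _ in range(samples):
            verdict = compare(llm, domain, query, docs[i], docs[j])
            if verdict == "A":
                wins[i] += 1
            else:
                wins[j] += 1
    # Order documents by the number of verdicts in their favor.
    order = sorted(range(len(docs)), key=lambda i: wins[i], reverse=True)
    return [docs[i] for i in order]
```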
We found in our project research that
this is more reliable than a one-off
invocation, especially when dealing with
many preliminary documents, because there
you potentially run into the needle-in-a-
haystack problem and documents get lost
in the LLM's processing. This way you
have more control over the re-ranking
process.
But, and this has to be considered as
well, runtime and cost are an issue here.
This takes far longer than a one-off LLM
comparison, let alone a classic retriever
setup alone. And the incurred costs are
of course also higher than if you just
use an embedding model or a keyword-based
model, because you have to make many LLM
calls, on the order of the square of the
number of documents.
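To make that concrete with purely illustrative numbers: n preliminary documents give n(n - 1)/2 pairs, so 20 documents with 3 repetitions per pair already mean 20 · 19 / 2 · 3 = 570 LLM calls, versus a single call for a one-off re-ranking prompt.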
So, what is the take-home message from
this anecdotal talk about one of our
projects? We have seen that conventional
retriever setups do have difficulties,
especially with specialized corpora and
retrieval tasks with special needs, such
as certain aspects or focuses to pay
attention to. In those cases we recommend
LLMs as re-rankers in the binary
comparison setup I just explained,
because we have seen that by comparing
just two documents at a time, the results
are quantitatively better than with
one-off LLM re-ranking setups. Thank you
for listening, and I am looking forward
to the other talks.
Read the abstract ➤ https://www.conf42.com/Prompt_Engineering_2025_Maximilian_Schattauer_retrievers_documents_ranking
Other sessions at this event ➤ https://www.conf42.com/prompt2025
Join Discord ➤ https://discord.gg/yQneDJdJGV

Chapters
00:00 Introduction and Speaker Background
00:30 Challenges of Conventional Retrieval Setups
02:54 Introducing LLMs for Enhanced Retrieval
04:09 Detailed Process of Using LLMs
05:26 Benefits and Considerations of LLM-based Retrieval
06:14 Conclusion and Takeaways