[Music]
Hi, I'm Alice.
I'm Lucas. We're product managers at
Google DeepMind.
And today we are incredibly excited to introduce EmbeddingGemma, our state-of-the-art embedding model designed for mobile-first AI. EmbeddingGemma is a 308-million-parameter text embedding model designed to power generative AI experiences directly on your hardware.
Embeddings are numerical representations of data. This model transforms text like messages, emails, or notes into a vector of numbers that represents meaning in a high-dimensional space, which a generative model can then use for downstream tasks.
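To make "a vector of numbers that represents meaning" concrete, here is a toy sketch with made-up 4-dimensional vectors (real EmbeddingGemma embeddings have 768 dimensions, and would come from the model rather than a hand-written table). Cosine similarity is the standard way to compare such vectors: semantically related texts score close to 1.0.

```python
import math

# Made-up 4-dimensional vectors for illustration only; a real embedding
# model like EmbeddingGemma produces 768-dimensional vectors.
embeddings = {
    "fix the damaged floorboards": [0.9, 0.1, 0.3, 0.0],
    "call a carpenter for repairs": [0.8, 0.2, 0.4, 0.1],
    "recipe for chocolate cake":    [0.0, 0.9, 0.1, 0.8],
}

def cosine_similarity(a, b):
    """Similarity of two vectors: near 1.0 = same direction, near 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = embeddings["fix the damaged floorboards"]
for text, vec in embeddings.items():
    print(f"{cosine_similarity(query, vec):.2f}  {text}")
```

In this sketch the floorboard query scores much higher against the carpenter note than against the cake recipe, which is exactly the property downstream search and retrieval rely on.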
EmbeddingGemma is small, fast, and efficient. Thanks to quantization-aware training, you can run the model with less than 200 megabytes of RAM while preserving state-of-the-art quality.
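Quantization-aware training happens inside the model's training process, so there is nothing to quantize yourself; but the memory arithmetic behind the RAM savings can be illustrated with a simple post-hoc int8 round trip (a hedged sketch of the general idea, not the model's actual scheme): each value stored in 1 byte instead of 4, at the cost of a small rounding error.

```python
def quantize_int8(vec):
    """Map floats into the int8 range [-127, 127] with a single scale factor.
    Illustrative only: real quantization-aware training bakes this into
    the model's weights during training."""
    scale = max(abs(x) for x in vec) / 127 or 1.0
    return [round(x / scale) for x in vec], scale

def dequantize(q, scale):
    """Recover approximate floats from the quantized values."""
    return [x * scale for x in q]

vec = [0.5, -1.0, 0.25, 0.125]
q, scale = quantize_int8(vec)
approx = dequantize(q, scale)
# 768 float32 values take 768 * 4 = 3072 bytes; 768 int8 values take 768 bytes.
print(max(abs(a - b) for a, b in zip(vec, approx)))  # small rounding error
```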
It generates embeddings of 768 dimensions, but thanks to Matryoshka Representation Learning (MRL), you can customize the model's output dimensions and go down to 128. Based on the same technology and research that powers our Gemini embedding models, EmbeddingGemma brings that state-of-the-art capability in a smaller, more lightweight model.
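The practical upshot of MRL is that you can simply keep a prefix of the full vector. A minimal sketch of that truncation step (the 768 values here are synthetic stand-ins for a real embedding):

```python
import math

def truncate_embedding(vec, dim):
    """Matryoshka-style shortening: keep the first `dim` values, then
    re-normalize to unit length so cosine similarity still behaves."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Synthetic stand-in for a 768-dimensional embedding from the model.
full = [math.sin(i) for i in range(768)]
small = truncate_embedding(full, 128)
print(len(small))  # 128
```

Smaller vectors mean a smaller index and faster similarity search, which is the usual reason to trade 768 dimensions down to 128 on constrained hardware.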
Think high-quality semantic search, fast and relevant information retrieval, or customized classification and clustering, just to name a few opportunities. EmbeddingGemma achieves the best score on the comprehensive Massive Text Embedding Benchmark (MTEB), the gold standard for text embedding evaluation, among models under 500 million parameters. Trained across 100-plus languages, EmbeddingGemma brings proven performance to instantly connect with diverse and global audiences. We've engineered EmbeddingGemma specifically for on-device performance, ensuring efficient computation and a minimal memory footprint even on resource-constrained hardware. EmbeddingGemma facilitates on-device embedding of local documents, so sensitive user data never leaves the device. And because it works offline, frontier search and retrieval features work regardless of connectivity. Together with our generative models like Gemma 3n, you can build powerful mobile-first generative AI experiences and efficient retrieval-augmented generation (RAG) pipelines. This means your applications can now leverage user context from data to provide more personalized and helpful responses, such as understanding that you need your carpenter's number for help with damaged floorboards.
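The retrieval half of such a RAG pipeline can be sketched as follows. The `embed` function here is a crude bag-of-letters placeholder, not EmbeddingGemma, but the retrieval step has the same shape either way: embed the query, score every stored note by similarity, and hand the best matches to a generative model as context.

```python
import math

def embed(text):
    """Hypothetical stand-in for an embedding model: a normalized
    bag-of-letters vector. In a real pipeline this would be the model's
    768-dimensional embedding of `text`."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def retrieve(query, documents, top_k=1):
    """Core retrieval step of a RAG pipeline: rank documents by cosine
    similarity to the query (dot product of unit vectors)."""
    q = embed(query)
    scored = sorted(
        documents,
        key=lambda d: sum(a * b for a, b in zip(q, embed(d))),
        reverse=True,
    )
    return scored[:top_k]

notes = [
    "Carpenter: Sam, 555-0172, fixed the deck last spring",
    "Dentist appointment on Tuesday at 3pm",
    "Grocery list: milk, eggs, flour",
]
context = retrieve("who can repair my damaged floorboards", notes)
print(context[0])
```

Even this crude similarity function surfaces the carpenter note first; with real embeddings the match is semantic rather than letter-based, which is what lets a query about floorboards find a note that never uses the word "floorboards" at all.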
Here's an example of what EmbeddingGemma can power. What you're seeing is how a user can use EmbeddingGemma to query previously opened articles or other web pages. The model embeds each page as it's opened, in real time. Then, with a browser extension that uses EmbeddingGemma, the user can ask a question to retrieve the contextually relevant articles. And because the embeddings are created on-device, all of this happens without leaving the user's hardware.
And it's designed with customization in mind. Fine-tune EmbeddingGemma for your domain or a particular language. It works across popular tools and platforms such as Hugging Face and Kaggle. Check out our notebook examples, part of the Gemma Cookbook, to get started. Our next generation of on-device embedding models is here, and it's open for everyone. It's small, fast, and efficient. Download EmbeddingGemma and get started building right now.
You can find links in the description below. We can't wait to see what EmbeddingGemma unlocks for you.
[Music]
Discover EmbeddingGemma, a state-of-the-art 308-million-parameter text embedding model designed to power generative AI experiences directly on your hardware. Ideal for mobile-first AI, EmbeddingGemma brings powerful capabilities to your applications, enabling features like semantic search, information retrieval, and custom classification, all while running efficiently on-device. In this video, Alice Lisak and Lucas Gonzalez from the Gemma team introduce EmbeddingGemma and explain how it works. Learn how you can run this model on less than 200MB of RAM with quantization, customize its output dimensions with Matryoshka Representation Learning (MRL), and build powerful offline AI features.

Resources:
Learn about EmbeddingGemma → https://developers.googleblog.com/en/introducing-embeddinggemma
EmbeddingGemma documentation → https://ai.google.dev/gemma/docs/embeddinggemma
Gemma Cookbook → https://github.com/google-gemini/gemma-cookbook
Quickstart RAG notebook → https://github.com/google-gemini/gemma-cookbook/blob/main/Gemma/%5BGemma_3%5DRAG_with_EmbeddingGemma.ipynb
Discover Gemma models → https://deepmind.google/models/gemma

Chapters:
0:00 - Intro
0:26 - Model overview
1:18 - Model features
2:29 - RAG
2:54 - Website embedding demo
3:23 - Tools and platforms
3:41 - Conclusion

Subscribe to Google for Developers → https://goo.gle/developers

Speakers: Alice Lisak, Lucas Gonzalez
Products Mentioned: Google AI, Gemma, Generative AI