Title: Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning (Feb 2024)
Link: http://arxiv.org/abs/2402.12177v4
Date: February 2024
Summary: Introduces Mafin, an approach for fine-tuning a black-box embedding model by augmenting it with a trainable embedding model. The method improves performance while requiring only a small augmented model to be trained, and is validated on both labeled and unlabeled datasets.
Key Topics:
- Retrieval Augmented Generation (RAG)
- Black-box embedding
- Fine-tuning
- Model augmentation
- Information retrieval
Chapters:
00:00 - Introduction to Domain Specificity
00:15 - Introducing MAFIN
00:25 - The Challenge of Domain Knowledge
00:36 - MAFIN Explained
00:47 - Augmenting Black Box Models
01:05 - Efficiency Play
01:19 - Performance Boosts
01:38 - Domain Adaptation
01:49 - Retrieval Augmented Generation
02:01 - Weaknesses of LLMs
02:26 - Retrieval Step Critical
02:36 - Dense Retrieval Methods
02:55 - Domain-Specific Tuning
03:12 - The Gap
03:19 - MAFIN Architecture
03:34 - Practical Considerations
04:11 - Key Trade-Off
04:36 - Strategic Trainable Model
04:58 - Normalization Step
05:31 - Unit Norm Importance
06:14 - Lambda-MAFIN
07:13 - Standard MAFIN Focus
07:31 - Learning to Rank
07:47 - Plackett-Luce Model
08:07 - The Problem
08:26 - Efficiency Trick
08:41 - Simplifying the Objective
09:11 - Cool Connection
09:27 - InfoNCE
10:08 - Unsupervised Case
10:30 - Synthetic Query Generator
11:12 - Simplifying Assumption
11:39 - Classification Task
12:17 - Does It Work?
12:27 - Domain-Specific Datasets
13:05 - NDCG@K
13:25 - Implementation Details
13:59 - Compared to Other Approaches
14:29 - The Results
14:52 - Numbers
15:20 - What About the Cost?
15:57 - Unsupervised Version
16:20 - Interesting Finding
16:54 - Nathan
17:39 - Synthesize
17:46 - Core RAG Dilemma
18:15 - Best of Both Worlds
18:35 - Flexible
18:42 - Used Elsewhere?
19:14 - Final Thought