In this video, I delve into a new paper by DeepSeek called Engram that addresses an inefficiency of transformer-based large language models (LLMs). Traditional LLMs use the same deep computation for both complex reasoning and simple recall, which wastes compute on tasks that only need a memory lookup. Engram introduces a conditional memory mechanism built on scalable lookup tables, distinguishing between tasks that need deep computation and those that only require simple memory recall. This approach shows significant improvements on both knowledge and reasoning tasks, improving the efficiency and performance of LLMs. I also discuss the hardware implications and potential limitations of this new method.

LINKS:
https://github.com/deepseek-ai/Engram/tree/main
https://github.com/deepseek-ai/Engram/blob/main/Engram_paper.pdf

My voice-to-text app: whryte.com
Website: https://engineerprompt.ai/
RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag
Signup for Newsletter, localgpt: https://tally.so/r/3y9bb0

Let's Connect:
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: https://www.patreon.com/PromptEngineering
💼 Consulting: https://calendly.com/engineerprompt/consulting-call
📧 Business Contact: engineerprompt@gmail.com
Become a Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use code: PromptEngineering for 50% off)

00:00 The Memory Problem with LLMs
00:25 Complex Reasoning vs. Simple Recall
00:56 The Inefficiency of Transformer-Based Architectures
01:22 DeepSeek's Engram: A New Approach
04:36 How Engram Works
07:30 Performance and Limitations of Engram
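
A minimal conceptual sketch of the lookup-vs-compute idea discussed in the video. This is my own toy illustration in PyTorch, not DeepSeek's actual Engram implementation: the module name, dimensions, and sigmoid gate are assumptions chosen only to show how a cheap lookup-table "recall" path can be mixed with a heavier "deep computation" path per token.

```python
# Toy sketch (NOT the official Engram code): route each token between a cheap
# embedding-table lookup ("memory recall") and a heavier MLP branch ("deep
# computation"), mixed by a learned gate. All names and sizes are illustrative.
import torch
import torch.nn as nn


class ToyConditionalMemory(nn.Module):
    def __init__(self, vocab_size: int = 1000, d_model: int = 64):
        super().__init__()
        # Scalable lookup table: one stored vector per token id (cheap to query).
        self.memory_table = nn.Embedding(vocab_size, d_model)
        # Heavier branch standing in for deep transformer computation.
        self.deep_branch = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        # Learned gate deciding, per token, how much to rely on recall vs. compute.
        self.gate = nn.Linear(d_model, 1)

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        recalled = self.memory_table(token_ids)   # simple recall path
        computed = self.deep_branch(hidden)       # deep computation path
        g = torch.sigmoid(self.gate(hidden))      # per-token mixing weight in [0, 1]
        return g * recalled + (1 - g) * computed


if __name__ == "__main__":
    model = ToyConditionalMemory()
    ids = torch.randint(0, 1000, (2, 8))          # batch of 2 sequences, 8 tokens each
    h = torch.randn(2, 8, 64)
    print(model(ids, h).shape)                    # torch.Size([2, 8, 64])
```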