Join us for a hands-on livestream where we'll build an OCR AI agent that can read text from images and dynamically work with the results. We'll demonstrate how to use this agent to read handwritten recipes and make updates to the text with minimal instruction, such as converting measurements between different units.

We'll show you how LangGraph enables us to easily create a lightweight ReAct agent, how OpenCV can preprocess images for better OCR results, and how you can choose between proprietary and open-source models to extract text and power your agentic workflow. Throughout the tutorial, we'll leverage PyCharm's new AI toolkit functionality to help us select the right vision model and LLM, while debugging our agent in real-time. You'll see how traditional computer vision techniques can be seamlessly coupled with cutting-edge AI agents to solve real-world problems!

Code repo for this episode: https://github.com/t-redactyl/ocr-llm-agent

HuggingFace Agents course: https://huggingface.co/learn/agents-course/en/unit0/introduction

Become a paid member of the channel to help us make more episodes: https://www.youtube.com/channel/UCkrcW82Y2kbgU-U9RaYfgxw/join

Watch along for your chance to win during our live trivia segment, and participate in the live Q&A session with questions from you in the audience.

Got a cool project of your own? Send it to us and you may be featured: https://www.jotform.com/form/233105358823151

OpenCV is a 501(c)(3) registered non-profit in the United States. See how you can support open source CV & AI: http://opencv.org/support/