ViDoRe V3 is a comprehensive benchmark designed to evaluate how well artificial intelligence systems can retrieve and use information from complex, real-world documents. Unlike earlier benchmarks that focused mostly on plain text, it includes 26,000 pages full of visual elements like charts, tables, and images across ten different professional industries. To create this resource, humans verified over 3,000 questions in six languages, ensuring that systems are tested on difficult, realistic tasks rather than simple facts. The results showed that AI models that process document images outperform those that only read text, yet these systems still find it difficult to handle open-ended questions or pinpoint exactly where on a page they found their answer.
https://arxiv.org/pdf/2601.08620 https://hf.co/vidore