Vision language models struggle to solve simple visual puzzles that humans find intuitive

lecrab 28 October 2024

GPT-4o, currently considered the most advanced multimodal model, solved only 21 of 100 visual puzzles. Other well-known AI models, including …

