Facebook Researchers Test AI's Intelligence and Find It Is Unfortunately Quite Stupid

Failing Grade

A team of researchers at Facebook’s parent company Meta has come up with a new benchmark to gauge the abilities of AI assistants like OpenAI’s large language model GPT-4. And judging by the results, the current crop of AI models is… still pretty stupid.

The team, which includes “AI godfather” and Meta chief scientist Yann LeCun, devised an exam called GAIA that’s made up of 466 questions that “are conceptually simple for humans yet challenging for most advanced AIs,” per a yet-to-be-peer-reviewed paper.

The results speak for themselves: human respondents correctly answered 92 percent of the questions, while GPT-4, even equipped with some manually selected plugins, scored a measly 15 percent. OpenAI’s recently released GPT-4 Turbo scored less than ten percent, according to the team’s published GAIA leaderboard.

It’s unclear, however, how competing LLMs like Meta’s own Llama 2 or Google’s Bard fared. Nonetheless, the research suggests that we’re likely still a long way from reaching artificial general intelligence (AGI), the point at which AI algorithms can outperform humans at most intellectual tasks.

Stupid Lawyer

That conclusion also flies in the face of some lofty claims made by notable figures in the AI industry.

“This notable performance disparity contrasts with the recent trend of LLMs outperforming humans on tasks requiring professional skills in e.g. law or chemistry,” the researchers write in their paper.

Case in point, in January OpenAI competitor Anthropic claimed its AI dubbed Claude got a “marginal pass” on a blindly graded…