OpenAI Admits That Its New Model Still Hallucinates More Than a Third of the Time

If a partner or friend made stuff up a significant percentage of the time you asked them a question, it would be a huge problem for the relationship. But apparently it's different for OpenAI's hot new model.

Using SimpleQA, the company's in-house factuality benchmark, OpenAI admitted in its release announcement that its new large language model (LLM), GPT-4.5, hallucinates (AI parlance for confidently spewing fabrications and presenting them as fact) 37 percent of the time.

Yes, you read that right: in tests, the latest AI model from a company worth hundreds of billions of dollars is telling lies in more than one out of every three answers it gives.

As if that weren't bad enough, OpenAI is actually trying to spin GPT-4.5's bullshitting problem as a good thing because, get this, it doesn't hallucinate as much as the company's other LLMs.

The same graph that showed how often the new model spews nonsense also reports that GPT-4o, the company's earlier flagship model, hallucinates 61.8 percent of the time on the SimpleQA benchmark. OpenAI's o3-mini, a cheaper and smaller version of its reasoning models, was found to hallucinate a whopping 80.3 percent of the time.

Of course, the problem isn't unique to OpenAI. "At present, even the best models can generate hallucination-free text only about 35 percent of the time," explained Wenting Zhao, a Cornell doctoral student who co-wrote a paper last year about AI hallucination rates, in an interview.
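
For a sense of what a figure like 37 percent actually measures: SimpleQA grades a model's short factual answers, and a hallucination rate is, roughly, the share of attempted answers that turn out to be wrong. The minimal Python sketch below illustrates that tallying idea with made-up grades; it is an assumption about how such a rate could be computed, not OpenAI's actual evaluation code.

    # Minimal sketch of a SimpleQA-style hallucination rate: the share of
    # attempted answers graded incorrect. Hypothetical data and grading
    # labels; not OpenAI's actual benchmark pipeline.
    from collections import Counter

    def hallucination_rate(grades):
        """grades: list of 'correct', 'incorrect', or 'not_attempted'."""
        counts = Counter(grades)
        attempted = counts["correct"] + counts["incorrect"]
        if attempted == 0:
            return 0.0
        return counts["incorrect"] / attempted

    # Toy example: 10 questions, 3 answered incorrectly, 1 skipped.
    sample = ["correct"] * 6 + ["incorrect"] * 3 + ["not_attempted"]
    print(f"{hallucination_rate(sample):.1%}")  # -> 33.3%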