“Oops! We Automated Bullshit.”

“AI systems like ChatGPT are trained with text from Twitter, Facebook, Reddit, and other huge archives of bullshit, alongside plenty of actual facts (including Wikipedia and text ripped off from professional writers). But there is no algorithm in ChatGPT to check which parts are true. The output is literally bullshit, exactly as defined by philosopher Harry Frankfurt…”

– Alan Blackwell (2023, Nov 9), Oops! We Automated Bullshit.

LLMs cannot do reasoning or planning

“To summarize, nothing that I have read, verified or done gives me any compelling reason to believe that LLMs do reasoning/planning as it is normally understood. What they do, armed with their web-scale training, is a form of universal approximate retrieval which, as we have argued, can sometimes be mistaken for reasoning capabilities. LLMs do excel in idea generation for any task – including those involving reasoning, and as I pointed out, this can be effectively leveraged to support reasoning/planning. In other words, LLMs already have enough amazing approximate retrieval abilities that we can gainfully leverage, that we don’t need to ascribe fake reasoning/planning capabilities to them.”

– Subbarao Kambhampati (2023, Sept 12). Can LLMs Really Reason and Plan? blog@CACM, Communications of the ACM.

Harvey at PwC

LLM-driven text analysis is becoming the norm, allowing people to process volumes of text they would not otherwise have the capacity to handle. Although individual outputs can be checked, the sheer volume of inputs being processed places fundamental limits on how comprehensively any analysis can be verified.
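To make that review bottleneck concrete, here is a minimal, hypothetical Python sketch; the function name, the figures, and the sampling approach are illustrative assumptions, not anything drawn from PwC or Harvey. The point is simply arithmetic: when an LLM pipeline produces far more outputs than humans can read, oversight can only ever cover a sample.

```python
import random

def sample_for_review(outputs, review_capacity):
    """Pick a random subset of LLM outputs that human reviewers can actually check.

    Hypothetical illustration: if reviewers can inspect only `review_capacity`
    items, everything else ships unverified.
    """
    if review_capacity >= len(outputs):
        return list(outputs), []
    reviewed = random.sample(outputs, review_capacity)
    reviewed_set = set(reviewed)
    unreviewed = [o for o in outputs if o not in reviewed_set]
    return reviewed, unreviewed

# Example: 10,000 LLM-generated summaries, capacity to review 200 of them.
outputs = [f"summary_{i}" for i in range(10_000)]
reviewed, unreviewed = sample_for_review(outputs, review_capacity=200)
print(f"Human-checked: {len(reviewed)} ({len(reviewed) / len(outputs):.1%})")
print(f"Unchecked:     {len(unreviewed)}")
```

On these made-up numbers, 98% of the generated analysis is never seen by a human reviewer, which is the gap the next example illustrates.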

PwC announced yesterday that it is trialling the use of Harvey, built on ChatGPT, to “help generate insights and recommendations based on large volumes of data, delivering richer information that will enable PwC professionals to identify solutions faster.”

They say that “All outputs will be overseen and reviewed by PwC professionals.” But what about how the data was processed in the first place…?