In Loving Memory of [Name]

It is with a heavy heart and a touch of irony that we share the news of [Name]’s passing on [Date]. [He/She/They] was a unique individual who left an indelible mark on the lives of those who had the privilege of knowing [him/her/them].

[Name] was a staunch advocate for authenticity and a fierce critic of anything that felt impersonal or detached. [He/She/They] valued the human touch in every aspect of life, from personal connections to the way [his/her/their] story was told. In an ironic twist, [Name] would have likely found it amusing that [his/her/their] obituary, crafted to honor [his/her/their] disdain for artificial intelligence, was indeed generated using ChatGPT.

In a world increasingly dominated by technology, [Name] stood firm in [his/her/their] belief that there are certain aspects of life that should remain deeply human. [He/She/They] believed in the power of personal narratives, the warmth of handwritten letters, and the irreplaceable touch of a heartfelt conversation.

[Name] lived a life guided by principles of authenticity and genuine connection, and the irony of this AI-generated tribute adds a touch of humour to the remembrance. [He/She/They] approached relationships with sincerity, always valuing the individuality and uniqueness of each person [he/she/they] encountered.

As we bid farewell to [Name], let us honour [his/her/their] memory with a wry smile, embracing the qualities [he/she/they] held dear – authenticity, connection, and the celebration of the human experience. [Name]’s departure leaves a void that cannot be filled, but [his/her/their] legacy of genuine, heartfelt living and the irony in this farewell will continue to inspire us all.

May [Name] rest in peace, free from the technological hum that [he/she/they] found so disconcerting. And may we, the ones left behind, strive to preserve the authenticity that [he/she/they] cherished so deeply, even if it means acknowledging the unexpected irony in this tribute. Inspired by [Name]’s values and with a touch of humour, we bid [him/her/them] farewell.

Exploring the validity of sentiment analysis in psychotherapy

What happens if you apply the Multilingual Language Model Toolkit for Sentiment Analysis (XLM-T), “a transformer-based NLP model derived from the Cross-lingual Language Model based on RoBERTa”, to psychotherapy transcripts? Eberhardt et al. (2024) investigate. Here’s a slightly simplified Table 1, showing correlations between positive and negative sentiment and a patient-reported emotions scale. Green cells give between-patient correlations and pink cells give within-patient correlations across sessions. Note the wide confidence intervals.

Eberhardt, S. T., Schaffrath, J., Moggia, D., Schwartz, B., Jaehde, M., Rubel, J. A., Baur, T., André, E., & Lutz, W. (2024). Decoding emotions: Exploring the validity of sentiment analysis in psychotherapy. Psychotherapy Research.
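
For readers curious what applying such a model actually involves, here is a minimal sketch of scoring transcript utterances with a publicly available XLM-T sentiment checkpoint via the Hugging Face transformers pipeline. The model ID and the toy utterances are assumptions for illustration; this is not the authors’ own pipeline.

```python
# Minimal sketch: score a few utterances with an XLM-T-style sentiment model.
# The checkpoint name below is an assumption (a public XLM-T sentiment model);
# the utterances are invented examples, not real transcript data.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-xlm-roberta-base-sentiment",
)

utterances = [
    "Ich habe mich diese Woche deutlich besser gefühlt.",  # multilingual input is the point of XLM-T
    "I keep worrying that nothing will ever change.",
]

# Each result is a dict with a sentiment label (negative/neutral/positive) and a score.
for text, result in zip(utterances, classifier(utterances)):
    print(f"{result['label']:>8}  {result['score']:.2f}  {text}")
```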

Accuracy

Be wary of evaluations of AI classifiers that focus only on predictive accuracy as a single percentage. As a bare minimum, we need separate quantification of how well the AI does when it says yes and how well it does when it says no.

Predictive accuracy is strongly determined by how common the true yes and no answers are. If a system is designed to diagnose a rare disease, the easiest way to increase accuracy is simply to always predict no, since that prediction will be correct for the vast majority of cases.

In the diagnosis example, we need to know how frequently the AI misses diseases that have a treatment and how often it over-diagnoses, leading to potentially harmful overtreatment.
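
To make this concrete, here is a small illustration with made-up numbers (a 1% prevalence is assumed purely for the example): a classifier that always answers “no” scores about 99% accuracy while missing every true case, which only becomes visible once sensitivity and specificity are reported separately.

```python
# Toy illustration of the accuracy paradox with rare positives (invented data).
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
y_true = rng.random(10_000) < 0.01   # assumed disease prevalence of 1%
y_pred = np.zeros_like(y_true)       # "always predict no"

accuracy = (y_true == y_pred).mean()
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # proportion of true cases the system catches
specificity = tn / (tn + fp)   # proportion of healthy cases it correctly clears

print(f"accuracy    {accuracy:.3f}")     # about 0.99, despite the system being useless
print(f"sensitivity {sensitivity:.3f}")  # 0.000: every disease case is missed
print(f"specificity {specificity:.3f}")  # 1.000
```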

Another application of AI that we have all been using for decades is spam detection. In this domain, we need to know how often a spam filter is exposing us to cryptocurrency grifters (for example) and how often it is quietly deleting important emails.

AI systems typically have a threshold that can be tweaked to push them towards under- or over-diagnosis. How you choose that threshold depends on what action will be taken as a result of the classification: will users merely see a little extra spam that they can swiftly delete, or will they be subjected to unnecessary chemotherapy?
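
As a sketch of that trade-off (toy scores rather than a real model; the score distributions below are invented): lowering the threshold catches more true cases at the cost of more false alarms, and raising it does the reverse.

```python
# Toy demonstration of how moving a decision threshold trades missed cases
# against false alarms. All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
y_true = rng.random(10_000) < 0.01   # rare positives (assumed 1%)
scores = np.where(
    y_true,
    rng.beta(5, 2, 10_000),  # assumed: true cases tend to score higher...
    rng.beta(2, 5, 10_000),  # ...but the two distributions overlap
)

for threshold in (0.3, 0.5, 0.7):
    y_pred = scores >= threshold
    missed = (y_true & ~y_pred).sum()        # false negatives: cases we fail to flag
    false_alarms = (~y_true & y_pred).sum()  # false positives: needless interventions
    print(f"threshold {threshold:.1f}: missed cases {missed}, false alarms {false_alarms}")
```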

AI in Cabinet Office

Some interesting information is beginning to appear, e.g., an Algorithmic Transparency Recording Standard (ATRS) record for automated document review to detect ROT (redundant, outdated, and trivial information) for deletion (see here – you may want to copy-paste it into Excel for ease of reading).

More information is also beginning to appear (see also the follow-up) on the Red Box Copilot, an approach to using AI to help prepare ministerial papers. Alex Burghart, Parliamentary Secretary (Cabinet Office), writes (24 January 2024):

“The Red Box Copilot has been made available to the Private Offices of Minister Burghart, of the Cabinet Secretary [Simon Case], and of the Chief Operating Officer of the Civil Service [Alex Chisholm], in which it is either currently or will shortly be going through more formal testing.”

“Oops! We Automated Bullshit.”

“AI systems like ChatGPT are trained with text from Twitter, Facebook, Reddit, and other huge archives of bullshit, alongside plenty of actual facts (including Wikipedia and text ripped off from professional writers). But there is no algorithm in ChatGPT to check which parts are true. The output is literally bullshit, exactly as defined by philosopher Harry Frankfurt…”

– Alan Blackwell (2023, Nov 9), Oops! We Automated Bullshit.

LLMs cannot do reasoning or planning

“To summarize, nothing that I have read, verified or done gives me any compelling reason to believe that LLMs do reasoning/planning as it is normally understood. What they do, armed with their web-scale training, is a form of universal approximate retrieval which, as we have argued, can sometimes be mistaken for reasoning capabilities. LLMs do excel in idea generation for any task – including those involving reasoning, and as I pointed out, this can be effectively leveraged to support reasoning/planning. In other words, LLMs already have enough amazing approximate retrieval abilities that we can gainfully leverage, that we don’t need to ascribe fake reasoning/planning capabilities to them.”

– Subbarao Kambhampati (2023, Sept 12). Can LLMs Really Reason and Plan? blog@CACM, Communications of the ACM.

Software crisis

History has valuable lessons for AI. Many of us are aware of the replication crisis in social science. Were you aware of the software crisis, first famously discussed at the NATO Conference on Software Engineering in Garmisch, October 1968? We have got used to software being buggy and to updates being required on a near-daily basis, often to fix security vulnerabilities – and, given the vast number of high-profile cyber attacks, often too late. People are now suggesting using large language models, trained on code people have dumped on the web, to write software. Software testing and static program analysis are going to be more important than ever, whether you’re evaluating internet-connected apps or statistical analysis code (see the sketch at the end of this section).

The original reports are available online. It’s worth having a browse around to see the issues. In 1968, hardware and software had a tiny fraction of the computational and political power they have now.
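
Returning to the point about testing: here is a small, hypothetical illustration of the kind of check that matters. A unit test compares a hand-rolled statistical helper against a reference implementation; the helper below (dividing by n rather than n - 1 for a sample variance) is the sort of plausible-looking code a language model might produce, and the test fails, exposing the bug.

```python
# Hypothetical example: a unit test catching a subtle bug in generated code.
import statistics

def sample_variance(xs):
    """Hypothetical LLM-suggested helper; contains a deliberate, subtle bug."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)  # bug: should divide by len(xs) - 1

def test_sample_variance_matches_reference():
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    # statistics.variance uses the n - 1 denominator, so this assertion
    # fails and flags the buggy helper.
    assert abs(sample_variance(data) - statistics.variance(data)) < 1e-9

if __name__ == "__main__":
    test_sample_variance_matches_reference()
```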