Ignorance of history in evaluation

“Despite occasional statements that program theory is a new approach, its roots go back more than fifty years. […] The history of program theory evaluation is not one of a steady increase in understanding. Instead, many of the key ideas have been well articulated and then ignored or forgotten in descriptions of the approach. It is not unusual to have statements that demonstrate a lack of knowledge of previous empirical and theoretical developments, such as a call for proposals from the Agency for Healthcare Research and Quality (2008) that claimed that “‘theory-based evaluation’ is a relatively new approach” (p. 14).”

Sue C. Funnell and Patricia J. Rogers (2011, pp. 15–16). Purposeful Program Theory: Effective Use of Theories of Change and Logic Models. Jossey-Bass.

A Mirror

A Mirror (Trafalgar Theatre, Feb 2024) is an immersive play exploring political theatre within oppressive regimes, drawing inspiration from the playwright’s holiday in North Korea. Our role in the (actual) audience was as a (fictional) audience attending an illegal performance in an unnamed “motherland”. The actors did a fantastic job with the material they were given; however, their impact was marred by a text riddled with cheap theatrical tactics to try to move and terrify the audience.

The play’s reliance on cliches, such as the power dynamics of the male boss flirting with his female subordinate and a predictable liaison between her and a dissident playwright, felt uninspired. Being charitable, perhaps these gendered dynamics are so common that they are drearily boring, and the play successfully captured that mundane quality. An early attempt at humour, at the expense of sex workers, added a distasteful note to the narrative, though it drew hearty laughter throughout the theatre.

The pivotal immersive event, a police raid, while genuinely frightening, resorted to muscled actors donning cop uniforms and balaclavas, stomping around the theatre brandishing batons in a simulated show of authority. This stunt, seemingly intended to evoke gratitude for allowing plays like A Mirror and inspire political protest if this freedom were ever challenged, came across as the most contrived. It is trivial to scare an audience if you fill a theatre with muscly (actors playing) thugs and it worked – I walked two miles before getting the tube home rather than hop on at the station ten minutes away. I needed to walk off the fear and fury I felt. But my fury was at the playwright, director, and thug actor, rather than the State (actual or imagined).

While there was an underlying message about freedom and the complicity of even apparently progressive civil servants in State oppression, it struggled to resonate amid the self-indulgence of the production. Think Toast of London but emotionally violent, with all the self-awareness stripped out. I was left struggling to hear the potentially profound political message above the theatrics.

Everything changes, by Bertolt Brecht

Everything changes. You can make
A fresh start with your final breath.
But what has happened has happened. And the water
You once poured into the wine cannot be
Drained off again.

What has happened has happened. The water
You once poured into the wine cannot be
Drained off again, but
Everything changes. You can make
A fresh start with your final breath.

Translated by John Willett


Be wary of evaluations of AI classifiers that focus only on predictive accuracy as a single percentage. Instead, as a bare minimum, we need to separate quantification of how well AI is doing when it says yes and when it says no.

Predictive accuracy is strongly determined by how common the true yes and no answers are. If a system is designed to try to diagnose a rare disease, the easiest way to increase accuracy is always to predict no, since that prediction will be correct for most cases.

Instead, we need to know how frequently AI is missing diseases that have a treatment and how often it is over-diagnosing and leading to potentially harmful overtreatment.
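To make the rare-disease point concrete, here is a minimal sketch in Python. The numbers are made up for illustration (1,000 people, 10 of whom have the disease), and the function names are my own. It scores a "classifier" that always says no, reporting accuracy alongside sensitivity (the share of true yes cases caught) and specificity (the share of true no cases correctly cleared).

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = yes, 0 = no)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def summarise(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity as a dictionary."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Of the people who have the disease, what share did we catch?
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Of the people who do not have it, what share did we correctly clear?
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical data: 1,000 people, 10 of whom have the disease.
y_true = [1] * 10 + [0] * 990
always_no = [0] * 1000  # a "classifier" that never diagnoses anyone

result = summarise(y_true, always_no)
print(result)
```

With these made-up numbers the always-no rule scores 99% accuracy while catching none of the ten cases (sensitivity of zero), which is exactly what a single accuracy percentage hides.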

Another application of AI that we have all been using for decades is spam detection. In this domain, we need to know how often a spam filter is exposing us to cryptocurrency grifters (for example) and how often it is quietly deleting important emails.

AI systems allow a threshold to be tweaked that pushes them towards under- or over-diagnosis. How you choose the threshold depends on what action you will take as a result of the classification, e.g., will users just be exposed to a little extra spam that they can swiftly delete or will they be subjected to unnecessary chemotherapy?
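The threshold trade-off can be sketched in a few lines of Python. Everything here is hypothetical: the scores stand in for whatever number a classifier outputs before a yes/no decision is made, and the labels and thresholds are invented to show the pattern rather than taken from any real system.

```python
def classify(scores, threshold):
    """Say yes (1) when the score reaches the threshold, otherwise no (0)."""
    return [1 if s >= threshold else 0 for s in scores]


def error_counts(y_true, y_pred):
    """Return (missed yes cases, false alarms)."""
    pairs = list(zip(y_true, y_pred))
    missed = sum(1 for t, p in pairs if t == 1 and p == 0)
    false_alarms = sum(1 for t, p in pairs if t == 0 and p == 1)
    return missed, false_alarms


# Hypothetical true answers and classifier scores for eight cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

results = {}
for threshold in (0.25, 0.5, 0.8):
    results[threshold] = error_counts(y_true, classify(scores, threshold))
    missed, false_alarms = results[threshold]
    print(f"threshold={threshold}: missed={missed}, false alarms={false_alarms}")
```

In this toy example, lowering the threshold removes the misses at the cost of two false alarms, and raising it does the reverse. Which end to favour depends on the cost of each kind of error, not on the classifier alone.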

Something other than gross self-indulgence

“More than anything, I’m excited. I’m excited to see how life is going to be different for the queer, trans, and even cis kids too, growing up in a world that has more language for gender variance. I’m excited to find out what sort of lives they will lead, from the genderqueer activists in the audience at my last reading to the barista with the orange mohawk who handed me the cup of tea I’m clutching for dear life as I write alone in this café, trying to believe that writing this piece is something other than gross self-indulgence.

“The barista is wearing two name badges. One says their name; the other one says, in thick chalk capitals, I am not a girl. My pronouns are They/Them.”

– Laurie Penny (2015), How To Be A Genderqueer Feminist

When You Are Old, by William Butler Yeats

When you are old and grey and full of sleep,
And nodding by the fire, take down this book,
And slowly read, and dream of the soft look
Your eyes had once, and of their shadows deep;

How many loved your moments of glad grace,
And loved your beauty with love false or true,
But one man loved the pilgrim soul in you,
And loved the sorrows of your changing face;

And bending down beside the glowing bars,
Murmur, a little sadly, how Love fled
And paced upon the mountains overhead
And hid his face amid a crowd of stars.

On the term “randomista”

Sophie Webber and Carolyn Prouse (2018, p. 169, footnote 1) write:

“Randomistas is a slang term used by critics to describe proponents of the RCT methodology. It is almost certainly a gendered, derogatory term intended to flippantly dismiss experimental economists and their success, particularly Esther Duflo, one of the most successful experts on randomization.” [Emphasis original.]


Sophie Webber and Carolyn Prouse (2018). The New Gold Standard: The Rise of Randomized Control Trials and Experimental Development. Economic Geography, 94(2), 166–187.

UK Google searches for “conscription”

There seems to be more press coverage suggesting that conscription will be likely in the UK in the coming years. This was seemingly prompted by Dutch Admiral Rob Bauer’s intervention on 17 Jan 2024, which was followed by other comments, e.g., by Gen Sir Patrick Sanders on the 23rd, a flutter of comment on Gen Z’s apparent reluctance to go to war, and a YouGov poll.

I was curious to discover what press coverage is doing to Google searches. Let’s have a look at Google Trends. First, zooming into the last seven days:

Searches began to pick up around 9pm on the 23rd, peaking at 11pm on the 24th before fading out again:

Hour Searches (relative, peak = 100)
2024-01-23T20 1
2024-01-23T21 2
2024-01-23T22 4
2024-01-23T23 6
2024-01-24T00 8
2024-01-24T01 8
2024-01-24T02 14
2024-01-24T03 25
2024-01-24T04 22
2024-01-24T05 28
2024-01-24T06 37
2024-01-24T07 43
2024-01-24T08 38
2024-01-24T09 35
2024-01-24T10 38
2024-01-24T11 46
2024-01-24T12 61
2024-01-24T13 71
2024-01-24T14 70
2024-01-24T15 69
2024-01-24T16 64
2024-01-24T17 75
2024-01-24T18 85
2024-01-24T19 82
2024-01-24T20 78
2024-01-24T21 72
2024-01-24T22 92
2024-01-24T23 100
2024-01-25T00 86
2024-01-25T01 81
2024-01-25T02 78
2024-01-25T03 71
2024-01-25T04 58
2024-01-25T05 57
2024-01-25T06 67
2024-01-25T07 75
2024-01-25T08 65
2024-01-25T09 55

That’s a notably bigger increase than over the past year:

Zooming out, this year’s spike is more pronounced than one in Feb 2022, coinciding with Russia’s full-scale invasion of Ukraine on the 24th.

Here’s all the data:

(Corrected final two graphs on 4 Feb 2024 to take account of all data in Jan 2024.)

Components and delivery formats of therapy for chronic insomnia

Interesting meta-analysis by Furukawa et al. (in press) of 241 trials, aiming to work out what components of therapy lead to better outcomes for people with chronic insomnia.


Furukawa Y., Sakata M., Yamamoto R., et al. Components and Delivery Formats of Cognitive Behavioral Therapy for Chronic Insomnia in Adults: A Systematic Review and Component Network Meta-Analysis. JAMA Psychiatry. Published online January 17, 2024.