“Here we found that crowd workers on MTurk widely use LLMs in a summarization task, which raises serious concerns about the gradual dilution of the ‘human factor’ in crowdsourced text data.”
Veselovsky, V., Ribeiro, M. H., & West, R. (2023). Artificial artificial artificial intelligence: Crowd workers widely use large language models for text production tasks.
LLM-driven text analysis is becoming the norm, letting people process volumes of text they could not otherwise handle. Although individual outputs can be checked, the sheer volume of inputs processed places fundamental limits on how comprehensively any analysis can be verified.
PwC announced yesterday that it is trialling the use of Harvey, built on ChatGPT, to “help generate insights and recommendations based on large volumes of data, delivering richer information that will enable PwC professionals to identify solutions faster.”
PwC says that “All outputs will be overseen and reviewed by PwC professionals.” But what about how the data was processed in the first place…?