In a recent study published in the journal Scientific Reports, researchers compared story retelling by humans with that of ChatGPT (Chat Generative Pre-trained Transformer), a large language model (LLM).
Humans can communicate past, projected, fictitious, or distant events to one another through short narratives and stories. Narrative communication allows people absent from the event to experience how the event would feel. Retelling events and stories is intrinsically connected to human civilization and culture.
Notably, audiences experience a narrative's events and emotional flow before any subsequent retelling or recall. The story therefore unfolds for them step by step, in a temporal order similar to that of personal experience. When retelling or recalling the story, people draw on both the original wording and their own experiences. This aspect of human retelling may differ from how LLMs like ChatGPT retell stories.
Study: Humans create more novelty than ChatGPT when asked to retell a story.
About the study
In the present study, researchers compared story retelling by humans and ChatGPT. Human participants recruited from Amazon Mechanical Turk were instructed to write a sad, mildly sad, happy, or mildly happy story while avoiding explicit emotion words (e.g., sad or happy). Subsequently, 348 individuals were asked to retell these stories.
They were given at least 40 seconds to read each story and were instructed to retell it in at least 60 characters, taking at least 30 seconds. Notably, there were no explicit instructions to focus on the story's affect, emotions, or other aspects. In addition, the team gave ChatGPT the same starting stories and asked it to retell them, using instructions similar to those given to the human participants.
Retellings were produced using different user accounts to ensure the original story and other versions were inaccessible to ChatGPT. Further, additional participants from Prolific were recruited to rate stories from one retelling iteration. The team analyzed word count, parts of speech, age of acquisition, affect, and concepts.
Findings
Human and ChatGPT retellings were shorter than the original story, but key elements were retained in the retellings. Nevertheless, human and ChatGPT retellings followed different patterns. Of note, ChatGPT retellings were markedly similar to one another, whereas human retellings morphed across iterations. Each human retelling offered a new, different interpretation of the previous one.
While humans and the chatbot both produced substantially shortened stories, the first ChatGPT retelling was significantly shorter than the first human retelling. Regression analyses indicated that both humans and ChatGPT significantly reduced word count over iterations. The variability in word count was significantly larger for humans than for ChatGPT. The reduction in word count across retellings was significantly shallower for ChatGPT than for humans.
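The word-count trend described above can be illustrated with a minimal sketch. It assumes a simple least-squares slope of word count against iteration number; the study's actual regression model is not detailed here, and the retelling chain below is invented for illustration, not taken from the study's data.

```python
from statistics import mean

def word_count(text):
    """Count whitespace-separated words in a retelling."""
    return len(text.split())

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator

# Hypothetical retelling chain (illustrative only, not study data).
chain = [
    "The old dog waited by the door every evening until the boy came home from school",
    "The dog waited by the door until the boy came home",
    "The dog waited for the boy",
]

counts = [word_count(text) for text in chain]
iterations = list(range(len(chain)))
slope = ols_slope(iterations, counts)

# A negative slope means the stories shrink with each retelling;
# a slope closer to zero corresponds to a "shallower" reduction.
print(counts, slope)
```

Comparing the slope for a human chain with the slope for a ChatGPT chain would show which one shortens its stories more steeply across iterations.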
Regarding the parts of speech, while minor changes were evident over iterations, the differences between humans and ChatGPT were apparent. Specifically, humans used more verbs, pronouns, and adverbs than ChatGPT and twice as many negations as the chatbot. In contrast, ChatGPT used more adjectives and nouns than humans did. Humans' relatively higher usage of adverbs and verbs suggested that people focused on emotions and actions.
In contrast, the chatbot's increased use of adjectives and nouns indicated its focus on objects and entities. Human retellings had a higher rate of negations than ChatGPT retellings despite negations requiring more cognitive effort. ChatGPT used fewer synsets (sets of synonymous words that express a single concept) than humans in the first retelling. However, it had a higher synset density than humans, and synsets were more likely to persist across retellings in ChatGPT. By contrast, humans created more novel synsets than the chatbot.
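The ideas of concept persistence and novelty across retellings can be sketched as follows. This is a rough illustration that assumes lowercase word types as a crude stand-in for synsets; the function names and example sentences are hypothetical and do not reproduce the study's method or data.

```python
import re

def concepts(text):
    """Approximate a retelling's concepts by its set of lowercase word types."""
    return set(re.findall(r"[a-z']+", text.lower()))

def novelty_and_persistence(previous, current):
    """Return (share of current concepts that are new,
    share of previous concepts that were kept)."""
    prev, curr = concepts(previous), concepts(current)
    novelty = len(curr - prev) / len(curr) if curr else 0.0
    persistence = len(curr & prev) / len(prev) if prev else 0.0
    return novelty, persistence

# Hypothetical retellings (illustrative only, not study data).
previous = "the dog waited for the boy"
varied_retelling = "a puppy waited for the child"
stable_retelling = "the dog waited for the boy patiently"

print(novelty_and_persistence(previous, varied_retelling))
print(novelty_and_persistence(previous, stable_retelling))
```

In this toy case, the varied retelling introduces half of its concepts anew, while the stable retelling keeps every earlier concept and adds only one. That mirrors, in a simplified way, the contrast the findings describe between human and ChatGPT retellings.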
Conclusions
Humans and ChatGPT exhibited distinct characteristics in story retelling. Human retellings were more creative and less stable across iterations than ChatGPT's. While the chatbot produced a substantially shortened, concise version of the original story, it made few changes over subsequent iterations, maintaining the core of the previous retelling.
Moreover, ChatGPT used fewer negations than humans. The differences in preference for parts of speech suggested that narrative and non-narrative texts might be distinct for humans but not for ChatGPT. Notably, ChatGPT displayed remarkable emotional stability, emphasizing the core situation, and after the first retelling, it mainly replaced words with synonyms in subsequent iterations.
Meanwhile, humans accumulated changes over successive retellings, and stories in human retellings changed between iterations with omissions, inventions, and adaptations. Nevertheless, the emotional core of stories was also preserved by humans. Overall, the findings reveal that human retelling involves creativity, anchored by emotions, beyond ChatGPT's default probabilistic wording.