Beyond the surface: stylometric analysis of GPT-4o’s capacity for literary style imitation
George Mikros

Digital Scholarship in the Humanities, Journal Year: 2025, Volume and Issue: unknown

Published: April 23, 2025

Abstract: This study aims to explore the ability of GPT-4o to imitate the literary style of renowned authors. Ernest Hemingway and Mary Shelley were selected due to their contrasting styles and overall impact on world literature. Using three distinct prompting strategies (zero-shot generation, zero-shot imitation, and in-context learning), we generated forty-five stylistic imitations and analyzed them alongside the authors’ original texts. To ensure thematic consistency, we constrained the texts to shared narrative themes derived from the authors’ works. We used a distance-based approach to authorship attribution, using the 1,000 most frequent words and cosine distance, to examine how the large language model’s texts are positioned in the multidimensional space. Moreover, we exploited a random forest classifier in a repeated task to analyze the distinctiveness of the GPT texts further. The classifier used a combination of Textual Complexity and Readability, Author Multilevel N-gram Profiles, Word Embeddings, and Linguistic Inquiry and Word Count features. t-SNE visualizations further evaluated the alignment between the GPT-generated texts and the authors’ original texts. The findings reveal that while GPT-4o captures some surface-level elements of the authors’ style, it struggles to fully replicate the depth and uniqueness of their stylometric signatures. Imitations generated via in-context learning showed improved alignment with the authors but still exhibited significant overlap with generic GPT outputs.
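The distance-based step mentioned in the abstract (most frequent words plus cosine distance) can be illustrated with a minimal sketch. This is not the study's implementation: the function names, the simple regex tokenizer, and the use of raw relative frequencies (rather than, say, z-scored frequencies) are all assumptions made for illustration.

```python
from collections import Counter
import math
import re


def tokenize(text):
    # Assumption: a crude lowercase word tokenizer; the study's tokenizer is not specified.
    return re.findall(r"[a-z']+", text.lower())


def mfw_vocabulary(texts, n=1000):
    # Collect the n most frequent words across the whole corpus.
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(n)]


def relative_frequencies(text, vocabulary):
    # Represent one text as relative frequencies of the vocabulary words.
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty texts
    return [counts[word] / total for word in vocabulary]


def cosine_distance(u, v):
    # 1 - cosine similarity; defined as 1.0 when either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical toy texts standing in for an original and an imitation.
    original = "He went to the river and he fished and the sun was hot."
    imitation = "The man walked to the river. The water was cold and clear."
    vocab = mfw_vocabulary([original, imitation], n=1000)
    d = cosine_distance(
        relative_frequencies(original, vocab),
        relative_frequencies(imitation, vocab),
    )
    print(f"cosine distance: {d:.3f}")
```

A smaller distance means the two texts use the shared high-frequency vocabulary in more similar proportions; with real data, each text would be compared against every author's originals rather than a single toy sentence.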

Language: English
