16 December 2009

Physicists uncover authors’ literary fingerprint

Cosmos Online

Authors subconsciously write in a mathematical pattern that can be as distinctive as a fingerprint, Swedish physicists say.

SYDNEY: Swedish physicists have developed a formula to identify individual writers by a unique ‘literary fingerprint’, which could help to prevent literary fraud and identify unknown authors.

Using the works of Thomas Hardy, D.H. Lawrence and Herman Melville, the researchers, from Umeå University in Sweden, measured the rate at which authors introduce new words into their writing.

They found that this rate decreased as the text length increased. They also found that the resulting curve was clearly unique for each author, and remained so regardless of the section of text used or its length.
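As a rough illustration of the kind of measurement described here, the sketch below counts how many distinct words have appeared after each word of a text, then compares how fast new words arrive early and late in the text. It is not the researchers' formula: the function names `vocabulary_growth` and `new_word_rate`, the simple word-splitting rule and the sample sentence are all assumptions made for the example.

```python
import re


def vocabulary_growth(text):
    """Return a list whose i-th entry is the number of distinct words
    seen in the first i + 1 words of the text."""
    # Assumed tokenisation: lower-case runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    seen = set()
    curve = []
    for word in words:
        seen.add(word)
        curve.append(len(seen))
    return curve


def new_word_rate(curve, start, end):
    """Average number of new words per word between word positions
    start and end (1-based, with start < end)."""
    return (curve[end - 1] - curve[start - 1]) / (end - start)


# Hypothetical sample text, far shorter than a novel.
sample = "the cat sat on the mat and the dog sat on the cat"
curve = vocabulary_growth(sample)

print(curve)                        # distinct-word count after each word
print(new_word_rate(curve, 1, 5))   # rate near the start of the text
print(new_word_rate(curve, 9, 13))  # rate near the end of the text
```

Even in this toy sentence the rate of new words is higher at the start than at the end. Comparing authors as the article describes would mean plotting such curves for much longer texts.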

Statistics of words used

“Our results tell us that earlier attempts to explain the word-frequency in a book … were wrong in principle. The statistics of the words, on the most basic level, do not depend on what you have written earlier in the book, it is a property independent of which page of the book you are writing,” said Sebastian Bernhardsson, physicist and lead author of the paper, published in the New Journal of Physics.

The researchers also found that the apparently universal and empirical law of word frequency proposed by George Kingsley Zipf over 75 years ago is not as universal as once thought.

Instead, a writer’s style and word-use depend more on text length and linguistic ability than on any over-arching rule.
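For readers unfamiliar with Zipf's law: in its simplest form it says that a word's frequency is roughly inversely proportional to its rank, so the second most common word appears about half as often as the most common one. The sketch below ranks the words of a text by frequency and compares each count with that simple prediction. The names `rank_frequency` and `zipf_prediction` and the sample sentence are assumptions made for the example, and it does not reproduce the analysis in the paper.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.
    Ties are broken alphabetically so the output is deterministic."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def zipf_prediction(top_count, rank):
    """Count predicted by the simplest form of Zipf's law, given the
    count of the most frequent word."""
    return top_count / rank


# Hypothetical sample text; real tests of the law use whole books.
sample_text = "the cat sat on the mat and the dog sat on the cat"
table = rank_frequency(sample_text)
top_count = table[0][2]

for rank, word, count in table:
    print(rank, word, count, zipf_prediction(top_count, rank))
```

A text this short cannot confirm or refute the law. The example only shows what a rank-frequency table looks like.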

These findings led the authors to consider the concept of a ‘meta book’ – a hypothetical ‘mother’ book which would provide an abstract representation of the ‘word-frequency curve’ characteristic of a particular author.

Subconscious ‘meta-book’

“Whenever writing a text, an author is effectively pulling a piece of text out of this meta book,” explained Bernhardsson.

“This is a new concept and we believe it raises interesting questions about how we express ourselves in writing, and how we access our verbal minds, on a very basic level,” he added. “What makes our mental pipeline always produce a flow of words which, although at each time entirely specific, seem to follow totally time-independent statistics?”

Tim Sinclair, a published author and the communications officer at the Australian Society of Authors in Sydney, was, however, a little sceptical about the need for such an overlap of physics and the arts.

“While this is an impressive technical achievement, and an appealing concept – the idea of a ‘meta-book’ ready and waiting to be accessed by the author over the course of their lifetime ties in nicely with the intangible concept of the Muse – it does seem a little redundant,” he said.

Computer program not needed

“Any good reader – and good readers are less common than people imagine – can tell if they are reading an Austen or a Hardy, a Winton or a Garner. Authors’ fingerprints are all over their work, and it doesn’t take a computer program to dust for them.

“Just look at how hard it is for previously-published authors to remain anonymous after they publish under a pseudonym,” he said, giving as an example The Bride Stripped Bare, a book supposedly written anonymously by a frustrated housewife which was later outed as the work of Australian author Nikki Gemmell.

