The Science of Writing book series reveals writing strategies of bestselling professional authors, and teaches you how to build upon those skills in your own writing. The first book in the series, Final Edit, The Final Hours of Your Final Draft, was launched on October 22, 2011. Although the subject of this first book is self-editing, many of the editing decisions derive from proven scientific data obtained through computational linguistics research. This is intentional. The series will gradually introduce you to an approach to writing that is focused on shaping your content into the form that will best communicate to your readers. Each succeeding book in the series will disclose more of the results of twelve years of research into bestsellers, both fiction and nonfiction, using proprietary software—an expert system that has undergone six years of development beyond the version that we released at


He said, "said she."

I have been drawn into an increasing number of discussions about the acceptability of inverted attributions; for example, the inversion of “he said” into “said he.” So much so that I put together a little program to check the matter and added it to FictionFixer. At the moment, it simply counts a close quote character followed by “said” versus the total occurrences of “said”; later, I will take direct versus indirect dialogue into consideration. Here are some of the results, randomly chosen, rounded to the nearest percent when possible.

Clive Barker, Coldheart Canyon: 19 out of 1525 = 1%
Dan Brown, The Da Vinci Code: 2 out of 536 = 0.4%
Sandra Brown, Heaven’s Price: 0 out of 291 = 0%
Sandra Brown, The Witness: 0 out of 400 = 0%
Sandra Brown, Exclusive: 9 out of 621 = 1%
Tom Clancy, Patriot Games: 10 out of 694 = 1%
Tom Clancy, Red Storm Rising: 25 out of 552 = 5%
Michael Crichton, Timeline: 3 out of 1675 = 0.2%
Michael Crichton, Jurassic Park: 8 out of 2038 = 0.4%
William Faulkner, Sanctuary: 0 out of 1245 = 0%
John Grisham, The Testament: 4 out of 543 = 0.4%
John Grisham, A Painted House: 11 out of 908 = 1%
Ernest Hemingway, For Whom the Bell Tolls: 132 out of 2644 = 5%
Jack Higgins, A Prayer for the Dying: 2 out of 591 = 0.3%
John Irving, A Widow for One Year: 0 out of 1018 = 0%
Stephen King, The Shining: 1 out of 809 = 0.1%
Dean Koontz, What the Night Knows: 5 out of 377 = 1%
Dean Koontz, Odd Thomas: 5 out of 378 = 1%
Barack Obama, Dreams from My Father: 1 out of 637 = 0.2%
Nora Roberts, Dance of the Piper: 0 out of 90 = 0%
Sidney Sheldon, Rage of Angels: 0 out of 615 = 0%

All but two of the results from this test group are 1% or less.

My gut feeling was that this usage might be evolving, so I tested a couple of older novels; they seemed to confirm this suspicion:

Charlotte Brontë, Jane Eyre: 161 out of 583 = about 28%
Jane Austen, Pride and Prejudice: 241 out of 401 = 60%

A similar feeling had me examine a couple of novels written for the youth market:

C.S. Lewis, The Lion, the Witch, and the Wardrobe: 417 out of 535 = 78%
J.K. Rowling, Harry Potter (book 1): 698 out of 794 = 88%

I tested the inverted “asked” form, too, but I will save those results for another post, along with the contemporary usage rules gleaned from the initial 21 results. Those rules are easy to spot when one examines the group of inverted forms as a whole.
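
For readers who would like to try the same count on their own manuscript, here is a minimal Python sketch of the idea described at the top of this post: count each “said” that directly follows a close quote character, and compare that with the total occurrences of “said.” This is not FictionFixer's code. The function names and the choice of which characters count as a close quote are assumptions made for illustration.

```python
import re

# Assumption: a close quote is the straight (") or curly (\u201d) double quote.
# Editions that set dialogue in single quotes would need those added here.
CLOSE_QUOTES = '"\u201d'

# "said" directly after a close quote, allowing whitespace in between.
# Caveat: the straight quote also opens dialogue, so a line of dialogue that
# begins with the word "Said" would be counted as well.
INVERTED = re.compile("[" + CLOSE_QUOTES + r"]\s*said\b", re.IGNORECASE)

# Every standalone occurrence of "said", inverted or not.
ANY_SAID = re.compile(r"\bsaid\b", re.IGNORECASE)


def inverted_said_counts(text):
    """Return (inverted, total) counts of "said" for the given text."""
    return len(INVERTED.findall(text)), len(ANY_SAID.findall(text))


def as_percent(inverted, total):
    """Express inverted/total as a percentage; 0.0 when there is no "said"."""
    return 100.0 * inverted / total if total else 0.0
```

Like the program described above, this sketch makes no distinction between direct and indirect dialogue, so it also counts forms such as “said Tom” alongside “said he.”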

Dialogue Can Support Character Gender

At a recent writers’ conference, I heard an interesting lecture about dialogue, including 12 tips for developing great dialogue.

One thing was missing: ensuring that the dialogue an author creates contains words likely to be said by someone of the speaker’s gender.

With Gender Genie, it is easy to determine whether a man or a woman produced a particular written text, and I have reason to believe that Moshe Koppel’s algorithm, which Gender Genie employs, serves dialogue just as well.

When Koppel’s algorithm appeared in 2003, I immediately incorporated it into one of FictionFixer’s modules in order to determine the degree of maleness or femaleness of the bestsellers I had been data-mining. It has been part of the software ever since. The Science of Writing book series relies on FictionFixer (a corpus linguistics/computational stylistics program) to establish the techniques of bestselling authors revealed in current and forthcoming books in the series.

The algorithm performed flawlessly, with one exception: the first 12,000 words or so of Stephen King’s Carrie (1974).

In the early 2000s, it was difficult to obtain full texts of bestsellers without scanning and OCR’ing them, a thankless task I ended up doing repeatedly during the initial development of FictionFixer. At that time, the only work of Stephen King’s I had in digital format was the opening of Carrie, and Moshe Koppel’s algorithm was identifying the author as distinctly female.

If you’ve read Carrie, you will recall that the opening section concerns the famous female locker room scene and includes an inordinate amount of female dialogue.

I took this as evidence that Koppel’s algorithm could identify female dialogue (believable female dialogue, that is) even when the author is male. Naturally, other hypotheses crossed my mind, some of which involved Stephen King’s wife Tabitha…

Later, I read the following in Stephen King’s On Writing: “The next night, when I came home from school, Tabby had the pages. She’d spied them while emptying my wastebasket, had shaken the cigarette ashes off the crumpled balls of paper, smoothed them out, and sat down to read them. She wanted me to go on with it, she said. She wanted to know the rest of the story. I told her I didn’t know…about writing high school girls. She said she’d help me with that part.”

A forthcoming book in the Science of Writing series includes much more about dialogue and gender, but in the meantime, you can paste your character’s dialogue into the Gender Genie site and determine whether the words might have been spoken by a character of that gender. When doing so, it is best to isolate all the dialogue of an individual character into a single file and paste it into Gender Genie en masse. If you find you need to tweak a character’s manner of speaking to increase his or her gender authenticity, the site provides a link to a paper co-authored by Moshe Koppel that explains the surprisingly simple determining factors.

Dialogue Attribution Verbs—Myths and Realities

[Excerpted from Final Edit, the Final Hours of Your Final Draft.]

A popular misconception among writers is that one should avoid “said” as much as possible and, in its place, substitute any sound-related term imaginable, the more obscure, the more desirable. I have seen novels in which the author did not use “said” even once because of this misconception, to the detriment of the work. Because I work in the field of computational linguistics using the corpus of bestselling novels for my research, I can state with certainty that this does not occur in books by bestselling authors.

To promote either of these misconceptions is tantamount to a music teacher promoting the notion that a performance should include wrong notes “in order to distinguish one’s performance from that of others performing the same work,” and the more wrong notes, the merrier!

The truth of the matter is that 80% or more of your attributions should use the verbs “said” or “asked.” “Said” is always invisible to the reader and “asked” shares that characteristic in most cases. The remaining 20% of attributions should be distributed among words that can actually refer to human speech! In other words, you cannot say, “she grinned” or “she grimaced” in place of “she said” because grinning and grimacing cannot produce human words.

Use “said” more than all the other attributions combined: at least twice as often as “asked” (2.33 times as many is a good ratio). The complete list of acceptable attributions follows, in order of preference by frequency of usage by bestselling authors (determined by FictionFixer).

NORMAL: said, asked

EXOTIC: insisted, shouted, answered, whispered, gasped, explained, demanded, cried, responded, lied, observed, murmured, stuttered, mumbled, snarled, screamed, protested, muttered, hissed, yelled, replied, groaned, begged, added, declared, confessed, railed, pleaded, conceded, whined, pointed out, and “signed” (if a character uses sign language)

You will notice that every verb on the list can actually refer to the act of human speech. In this text, we will use the term “exotic” to refer to all attribution verbs that are not “said” or “asked.”
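
If you keep a list of the attribution verbs in a chapter, the guidelines above can be checked with a few lines of Python. The sketch below is only an illustration and is not how FictionFixer works: it assumes you have already pulled the attribution verbs out of your text by hand, and the function name and the returned keys are hypothetical. The two verb sets are copied from the lists above.

```python
from collections import Counter

NORMAL = {"said", "asked"}

# The exotic verbs listed above, in the same order.
EXOTIC = {
    "insisted", "shouted", "answered", "whispered", "gasped", "explained",
    "demanded", "cried", "responded", "lied", "observed", "murmured",
    "stuttered", "mumbled", "snarled", "screamed", "protested", "muttered",
    "hissed", "yelled", "replied", "groaned", "begged", "added", "declared",
    "confessed", "railed", "pleaded", "conceded", "whined", "pointed out",
    "signed",
}


def attribution_profile(verbs):
    """Summarize a list of attribution verbs taken from a manuscript."""
    counts = Counter(v.lower() for v in verbs)
    total = sum(counts.values())
    normal = counts["said"] + counts["asked"]
    return {
        # Share of attributions that use "said" or "asked" (target: 0.80 or more).
        "normal_share": normal / total if total else 0.0,
        # How many times "said" appears per "asked" (target: at least 2).
        "said_to_asked": counts["said"] / counts["asked"] if counts["asked"] else None,
        # Verbs that appear on neither list, such as "grinned".
        "off_list": sorted(v for v in counts if v not in NORMAL and v not in EXOTIC),
    }
```

For example, a chapter with seven uses of “said,” three of “asked,” one “whispered,” and one “grinned” would report a normal share of about 83%, a said-to-asked ratio of about 2.33, and “grinned” as the only off-list verb.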

You may be wondering why “thought” is not on this list. Do not consider “thought” to be a dialogue attribution verb; rather, it is a thought attribution verb (like pondered, contemplated, reflected, speculated, and imagined). Thoughts may, in rare instances, require attributions, but the thoughts themselves are usually not placed in quotes. Thoughts may be in italics or not, but italicized “thoughts” do not require an attribution verb.


Here is the missing part of the puzzle. Attributions should be required in only 25% of your dialogue blocks. Note that the term “dialogue blocks” in the previous sentence does not mean dialogue sentences. A dialogue block represents any number of dialogue sentences, as few as one, or as many as you like.

Professional writers manage to leave about 75% of all dialogue blocks without attributions because they take great care in structuring their dialogue so the reader always knows who is speaking. You can accomplish this by indicating the speaker in the first two dialogue paragraphs, and then letting the reader assume that the two speakers alternate until another attribution indicates otherwise or until non-dialogue text is encountered.


We know that the best ratio between normal attribution verbs and exotic attribution verbs is 80% to 20%. So, if only around 25% of your dialogue blocks should have attributions, and 80% of those should be normal attributions, then 20% of your dialogue blocks (25% × 80%) should have the normal attribution verbs, “said” and “asked.” The remaining 5% or so of your dialogue blocks (25% × 20%) can take attribution verbs drawn from the exotic pool.
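
The arithmetic above is easy to turn into a rough budget for a manuscript of any size. The Python sketch below is only an illustration: the function name and the returned keys are hypothetical, and the default values are simply the guideline figures from this excerpt (25% of blocks attributed, an 80/20 split between normal and exotic verbs, and 2.33 uses of “said” for every “asked”).

```python
def attribution_budget(blocks, attributed=0.25, normal=0.80, said_to_asked=2.33):
    """Estimate how many dialogue blocks should carry each kind of attribution."""
    with_attribution = blocks * attributed           # about 25% of all blocks
    normal_blocks = with_attribution * normal        # about 20% of all blocks
    exotic_blocks = with_attribution - normal_blocks # about 5% of all blocks
    # Split the normal blocks so that said / asked equals said_to_asked.
    asked = normal_blocks / (1 + said_to_asked)
    said = normal_blocks - asked
    return {
        "unattributed": blocks - with_attribution,
        "said": said,
        "asked": asked,
        "exotic": exotic_blocks,
    }
```

With 1,000 dialogue blocks, this works out to 750 blocks with no attribution, roughly 140 with “said,” roughly 60 with “asked,” and 50 with an exotic verb. Note that “said” still outnumbers “asked” and the exotic verbs combined, in line with the guideline given earlier.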
