The Science of Writing book series reveals writing strategies of bestselling professional authors, and teaches you how to build upon those skills in your own writing. The first book in the series, Final Edit, The Final Hours of Your Final Draft, was launched on October 22, 2011. Although the subject of this first book is self-editing, many of the editing decisions derive from proven scientific data obtained through computational linguistics research. This is intentional. The series will gradually introduce you to an approach to writing that is focused on shaping your content into the form that will best communicate to your readers. Each succeeding book in the series will disclose more of the results of twelve years of research into bestsellers, both fiction and nonfiction, using proprietary software—an expert system that has undergone six years of development beyond the version that we released at


He said, "said she."

I have been drawn into an increasing number of discussions about the acceptability of inverted attributions; for example, the inversion of “he said,” into “said he.” So much so, that I put together a little program to check out the matter and added it to FictionFixer. At the moment, it simply counts a close quote character followed by “said” versus the total occurrences of “said”; later, I will take direct versus indirect dialog into consideration. Here are some of the results, randomly chosen, rounded up to the nearest percent when possible.

Clive Barker, Coldheart Mountain: 19 out of 1525 = 1%
Dan Brown,
The Da Vinci Code: 2 out of 536 = 0.4%
Sandra Brown,
Heaven’s Price: 0 out of 291 = 0%
Sandra Brown,
The Witness: 0 out of 400 = 0%
Sandra Brown,
Exclusive: 9 out of 621 = 1%
Tom Clancy,
Patriot Games: 10 out of 694 = 1%
Tom Clancy,
Red Storm Rising: 25 out of 552 = 5%
Michael Crichton,
Timeline: 3 out of 1675 = 0.2%
Michael Crichton,
Jurassic Park: 8 out of 2038 = 0.4%
William Faulkner,
Sanctuary: 0 out of 1245 = 0%
John Grisham,
Testament: 4 out of 543 = 0.4%
John Grisham,
The Painted House: 11 out of 908 = 1%
Earnest Hemingway,
For Whom the Bell Tools: 132 out of 2644= 5%
Jack Higgins, Prayer for the Dying: 2 out of 591 = 0.3%
John Irving, A Widow for One Year: 0 out of 1018 = 0%
Stephen King,
The Shining: 1 out of 809 = 0.1%
Dean Koontz, What the Night Knows: 5 out of 377 = 1%
Dean Koontz,
Odd Thomas: 5 out of 378 = 1%
Barack Obama,
Dreams from my Father: 1 out of 637 = 0.2%
Nora Roberts, Dance of the Piper: 0 out of 90 = 0%
Sidney Sheldon,
Rage of Angels: 0 out of 615 = 0%

All results are 1% or less, with only two from this test group exceeding that limit.

My gut feeling was that this usage might be evolving, so I tested a couple older novels; they seemed to confirm this suspicion:

Charlotte Bronte, Jane Eyre: 161 out of 583 = about 28%
Jane Austin,
Pride and Prejudice: 241 out of 401 = 60%.

A similar feeling had me examine a couple novels written for the youth market:

C.S. Lewis, The Lion, the Witch, and the Wardrobe: 417 out of 535 = 78%
J.K. Rowling,
Harry Potter (book 1): 698 out of 794 = 88%

I tested the inverted “asked” form, too, but I will save those results for another post. Likewise, the contemporary usage rules gleaned from the initial 21 results. These are easy to spot when one examines the group of inverted forms as a whole.

Dialogue Can Support Character Gender

At a recent writers conference, I heard an interesting lecture about dialogue, including 12 tips for developing great dialog.

One thing was missing: assuring that dialogue an author creates contains words that might likely be said by someone of the gender of the person who is speaking.

Gender Genie, it is easy to determine whether a man or a woman produced a particular written text, and I have reason to believe that Moshe Koppel’s algorithm employed by Gender Genie serves dialogue just as well.

When Koppel’s algorithm appeared in 2003, I immediately incorporated it into one of
FictionFixer’s modules in order to determine the degree of maleness or femaleness of the bestsellers I had been data-mining. It has been part of the software ever since. The Science of Writing book series relies on FictionFixer (a corpus linguistics/computational stylistics program) to establish the techniques of bestselling authors revealed in current and forthcoming books in the series.

The algorithm performed flawlessly, with one exception: the first 12,000 words or so of Stephen King’s Carrie(1974).

In the early 2000s, it was difficult to obtain full texts of bestsellers without scanning and OCR’ing them, a thankless task I ended up doing repeatedly during the initial development of
FictionFixer. At that time, the only work of Stephen King’s I had in digital format was the opening of Carrie, and Moshe Koppel’s algorithm was identifying the author as distinctly female.

If you’ve read Carrie, you will recall that the opening section concerns the famous female locker room scene and includes an inordinate amount of female dialogue.

I concluded this to be proof that the Koppel’s algorithm could identify female dialogue—believable female dialogue, that is—even when the author is male. Naturally, other hypotheses crossed my mind, some of which involved Stephen King’s wife Tabitha…

Later, I read the following in Stephen King’s
On Writing, “The next night, when I came home from school, Tabby had the pages. She’d spied them while emptying my wastebasket, had shaken the cigarette ashes off the crumpled balls of paper, smoothed them out, and sat down to read them. She wanted me to go on with it, she said. She wanted to know the rest of the story. I told her I didn’t know…about writing high school girls. She said she’d help me with that part.”

A forthcoming book in the Science of Writing series includes much more about dialogue and gender, but in the meantime, you can paste your character’s dialogue into the
Gender Genie site and determine whether the words might have been spoken by a character of that gender. When doing so, it is best to isolate all the dialogue of an individual character into a single file and paste it into the Gender Genie en masse. If you find you need to tweak a character’s manner of speaking to increase his or her gender authenticity, the site provides a link to a paper co-authored by Moshe Koppel that explains the surprisingly simple determining factors.

Quick Reference for Final Edit Now Available!

Science of Writing is pleased to announce the availability of a Quick Reference for Final Edit (Word edition). This 8.5 by 11 inch (folded 17x11), four-page, glossy card-stock Quick Reference consolidates all the essentials of Final Edit: 6 important tables, 23 searches (out of 80 total), and 9 of the 13 word lists from the book. The price is $5.95.

More info in the sidebar at the right.

If you’re wearing your editor hat, you may have noticed the parenthetical “of 80 total” in the previous paragraph, and you may have wondered where the extra 30 searches came from; the only number on the cover of the book is “50.” The extra searches include 23 variations on the main searches and 7 final “Use Your Imagination” searches.

We offer special combination pricing for one copy of the book, Final Edit, The Final Hours of Your Final Draft (normally $15.95) PLUS the Quick Reference for the book (normally $5.95). For the special combo price of $18.95 use the lowest “Buy Now” button in the sidebar at the right.

Remember: Free Shipping Until May 31, 2012 June 15, 2013 !

Elements of Bestsellers (Patterns-01)

If you’ve read Final Edit, The Final Hours of Your Final Draft, you know that I advocate an approach to self-editing that includes making certain types of decisions and revisions after considering what bestselling authors have done at similar junctures. Future books in the Science of Writing Series will extend this type of decision-making to other stages of the writing process.

When you make an edit decision based upon a proven strategy evidenced in known successful works, you use a method employed by many artists, composers, authors, choreographers, and playwrights. Igor Stravinsky once said, “Good composers borrow; great composers steal,” but he was referring to something more akin to the quotation of identifiable material, rather than the application of a proven method or set of decision-making constraints.

Science of Writing books has been able to establish such decision-making constraints as a result of more than a decade of data-mining bestsellers with proprietary software, a small subset of which is accessible at

FictionFixer recognizes patterns on many levels of bestselling works, including levels that often go unmentioned and unthought-of. One purpose of this blog is to bridge the gap between using such information for final edit decisions, and applying similarly obtained principles to your writing at a far earlier stage than the final edit. With this in mind, the following is the first of several previews from a forthcoming book in the Science of Writing series (working title: The Form and Analysis of Writing).

Question: Do bestsellers exhibit structural patterns in the flow of dialog paragraphs versus non-dialog paragraphs?

This is one of the simplest patterns revealed by FictionFixer. The software encodes every paragraph of books using a proprietary functional coding that enables the detection of structural organization on many levels. At one stage of analysis, the software reduces paragraphs to a single letter: “D” or “N” for dialog or non-dialog. For books written in the first person “F” is added to identify between material written in the first person.

The dialog and non-dialog designations are further broken down into “d,” “D,” “n,” and “N,” the lowercase versions indicating paragraphs that consist of a single sentence.

Notice how the following patterns gradually phase in and out of the two main types of paragraphs draw from blockbuster bestsellers.

Paragraph patterns observed in the novel Testament, by John Grisham         Paragraph patterns observed in the novel Testament, by John Grisham

Paragraph patterns observed in the novel Testament, by John Grisham              

Both first-person dialog and non-dialog paragraphs gradually decrease as third-person dialog increases:

Paragraph patterns observed in the novel The Da Vinci Code, by Dan Brown

In the following example, we are distanced from the dialog by increasing numbers of non-dialog paragraphs, until suddenly, a one-sentence paragraph in first-person (green “f”) triggers a very striking plot-point.

Paragraph patterns observed in the novel Congo, by Michael Chrichton

This “dissipation” patterns often appears after the climax.

Paragraph patterns observed in the novel The Da Vinci Code, by Dan Brown

Automate Style Assignment During Scrivener to Word Conversion

[Excerpted from Final Edit, the Final Hours of Your Final Draft. This section “Tips for Scrivener Users” is from Chapter Eleven and first appeared in the second printing of the book.]

Scrivener is a program developed by Literature and Latte specifically for writers. See the links to various versions of the program in Appendix I “Link List”: links to the educational version at, the Macintosh version by way of, and the Windows version by way of

A common scenario among writers who use Scrivener is to write their books in Scrivener and then compile the manuscript into a file that can be brought into Word for its
Final Edit. That is precisely how this book was written.

Having just touched upon the subject of how to “Use Find and Replace to Automate Style Assignment,” I feel it is important to provide those readers who use Scrivener with some tips to leverage Scriveners “presets” so that the resulting compiled document can undergo automatic style assignment in Word.

The easiest way to do this is to make all your Scrivener presets unique in font size, font family, or both. Let’s use font size as an example. In Scrivener (before compiling) assign a unique size to some text in each preset and then use Format Formatting Redefine Preset From Selection to assign that size to all occurrences of that preset.

Having done this, now you can use Word’s Find & Replace box to convert everything of a specified font size to a designated Word style. Under the Find & Replace box’s “Format” menu choose “Font…” then, with the cursor in the empty “Find what” field, specify the font size of a particular preset. Move the cursor to the “Replace with” field and choose the appropriate style from the “Style…” item at the bottom of the “Format” menu. Again, the field will be empty. Now press the “Replace All” button and everything of the chosen font size will be assigned to the same Word style.

It’s a good idea keep the same font size that was used to identify the Scrivener preset as the font size of the destination Word style. After you have converted all your Scrivener presets to Word styles in this manner, it’s a simple task to change the font sizes of the Word styles to whatever you prefer