Links to Analysis Regarding

Because I am, apparently, incapable of keeping my mouth shut when it comes to certain things.

Like math.

Like bad math.

Like people using bad math to support their pet Cause when the data do not support those conclusions.

If you don’t know what I’m talking about: self-publishing evangelist Hugh Howey and a silent partner went and scraped a bunch of Amazon data.  That’s fine.  That could be cool, even.  But then they made a bunch of pretty charts and used them to bang their pro-self-publishing / anti-trade publishing drum, and wrote a whole lot of paragraphs next to the pretty charts as if they were Conclusions, when, in fact, those paragraphs were not in any way implied by the data collected.

This pains me in my mathematician heart.  And it makes me angry when people misinform aspiring authors this way.  Mr. Howey touts himself as an author advocate, but that’s not what this is.  These data do not support his “conclusions.”  To be fair, they don’t disprove his ideas, either; they just don’t really say much of anything.  And when Howey pretends that they do support him, he’s giving authors bad information.

I’m not saying all this because I’m anti-self-publishing (I’m not!  I’m doing it myself, in fact!).  But science isn’t about “sides.”  When talking about science or math, there shouldn’t be sides; there’s no “teach the controversy” or “we’ll let the people who believe Earth is flat have equal air time.”  Or there shouldn’t be.  There’s just what the data imply, and what they don’t.  And there’s absolutely no shame in saying, “I firmly believe in XYZ.  And I just collected a lot of data in the field . . . but unfortunately those data don’t support XYZ.  They don’t contradict it, either, but there are just too many limitations here, and too much we don’t know.  That said, I still believe my ideas on XYZ are right and that the data will bear them out eventually!”

There’s no shame in that.

But that’s not what Howey did.  He used the numbers to pretty up a dog-and-pony show that pretends to support his preconceived notions with data, and he posted a piece that is actively detrimental to anyone trying to cut through the obfuscation and agendas and learn about publishing.

Now, who wants MATH?  Have some links![1]

How (Not) to Lie With Statistics.  “[The authors of the report] make claims that the data cannot possibly support […] they do a lot of inferring that is analytically indefensible.” (emphasis in the original)  I highly suggest reading the whole thing.  It’s a very detailed and well-written analysis by someone trained in research and sociological methods, and it concludes, as I did, that these data do not imply anything like what Howey claims.

Some Thoughts on Author Earnings. “The failure to compare the model’s results to actual measurements before making pronouncements is a huge problem.”  Courtney Milan is an extremely successful self-publisher, so obviously she’s pro-self-publishing.  She’s also clearly incredibly knowledgeable about data analysis, and she points out a myriad of problems with the way these findings are presented, as well as some possible discrepancies in the raw data.

The Missionary Impulse. “Sorry, Hugh.  There is absolutely nothing in your blog post that justifies that conclusion.  This is not the same as saying that your conclusion is wrong.  Maybe it’s right.  But if it’s right, it’s not because of anything — anything! — in your blog post.”  This makes many excellent points and comes with a context of a lot of details of the publishing industry (the author is a literary agent).  Once again, the conclusion is that the data do not actually allow Howey to make any of the extreme claims he’s making.

Digital Book World: Analyzing the Author Earnings Data Using Basic Analytics.  “For myself and others, I wish I had more optimistic findings that showed we could all share in an incredible gold rush, but the data are the data.”  This article makes a case that the data are actually entirely consistent with the site’s own (far more pessimistic) prior survey, and can’t be used to prove anything more extreme.  (Obviously it’s possible there’s a bias there, and I can’t comment on the DBW survey as I haven’t seen the full thing, but I think what’s said here is valuable and knowledgeable regardless, and I note that the author is exceptionally qualified at data analysis.)

Some Quick Thoughts On That Report on Author Earnings. “[W]hile the report gives the illusion of providing hard data, it appears to be as built on guesswork as anything else we’ve had.”  Steve Mosby also makes excellent points about the unique path a published book takes, and that this can’t be repeated with hindsight.

Edited to add: Comparing self-publishing to being published is tricky and most of the data you need to do it right is not available.  “Unfortunately, Hugh’s latest business inspiration — a call to arms suggesting to independent authors that they should just eschew traditional publishing or demand it pay them like indie publishing — is potentially much more toxic to consume.”  Mike Shatzkin weighs in with a long list of other variables Howey’s report does not take into account.


Look, you can’t list a lot of numbers and a lot of pretty charts and then list “conclusions” next to them and say one follows from the other because they happen to be next to each other on the page.  Science doesn’t work that way.

The poor way these data have been presented only serves to feed the adversarial “us vs. them” mentality that (some) self-publishers and (some) trade published writers are for some strange reason so invested in.  Personally, I want to see that attitude go away forever.  It’s not productive.  It’s not helpful.  I wish to all that is holy that Howey had come out with this spreadsheet in a more professional way, an invitation to other people in the writing/publishing world to analyze the data and see what we might be able to learn.  That might’ve been nifty, a positive addition to the knowledge base.  Instead, by presenting it as part of such a massive load of bad math and misinformation, he’s only clouded the discussion even more.

That’s not good for anyone.  And speaking as a self-publisher, it embarrasses me.  False conclusions that are unsupported by data, written up in something that pretends to be a study but is anything but—it just looks desperate.  Self-publishing is all grown up now, and the people most responsible for stigmatizing us in the eyes of other writers and publishers are the self-publishers themselves who pull stunts like this one.


Comments are closed, as I don’t have time to babysit the blog right now, and from what I’m seeing elsewhere this subject can be rather contentious.  I may reopen them later.  If you have something you feel would be a valuable addition to this post, feel free to send me the comment through the Contact page and I will post it here.  Be warned that I am only likely to post contributions of the dry academic variety on this one.

  1. Note that this list is, in order, a researcher who doesn’t write fiction, a successful self-publisher, a literary agent, a data analytics professional whose research is in digitization, and a trade published writer.  And I’m a math nerd who is self-publishing my fiction books.  The biases we’d be expected to have are all over the map, but like I said, science doesn’t take sides.