The Author Earnings team are attempting to do something which hasn’t been done before, and their work can’t be refined and improved unless there is some intelligent criticism of their approach and findings.
Today I’ve invited Phoenix Sullivan to blog on the topic. I’ve known Phoenix for a few years now, and if there’s a smarter person in publishing, I haven’t heard of them.
KBoards regulars will already know that Phoenix understands the inner workings of the Kindle Store better than anyone outside Amazon.
. . . .
I set aside some time recently to dive into the Author Earnings raw data for the May 1, 2015 Report. The irksome thing about the scraped data is how much of the puzzle that is Amazon’s ebook sales is missing and/or open to interpretative analysis. It isn’t the data’s fault or even the fault of the collection method. It’s simply that the data made public is limited, which in turn means a lot of creative interpretation goes into even so simple a task as coming up with the number of ebooks sold in a day. While the raw data itself isn’t changeable, different tools and assumptions applied to the data can yield different results, thereby opening up the analysis to differing interpretations.
My goal was to apply a set of tools and assumptions that update and possibly correct those being used by the Author Earnings team. The environment has changed dramatically in the 15 months since the first report came out, yet the analytical tools, in my opinion, haven’t necessarily kept up with the times. That in itself does not mean the results are wrong, but without a challenge to them, we’ll never know, right?
. . . .
By far the biggest assumptive correction I’ve made is two-fold: The first part is applying a new set of sales:rank calculations to the dataset and the second part is applying calculations to maintain ranks rather than using the multipliers needed to hit a rank. Let’s be clear that these multipliers are observed only, and best guesses across a lot of observations. However, I do believe the multipliers currently being used by AE 1) are outdated, and 2) don’t reflect the actual number of sales happening for the majority of books that are maintaining rank in the store and not seeing huge rank swings on a day-to-day basis.
. . . .
Amazon’s algorithms take historical sales – among other variables, such as velocity – into consideration when calculating rank. The longer a title remains around a given rank, the fewer sales it takes to maintain that rank. Observably, anywhere from 10-50% fewer sales. That means the multipliers for hitting ranks are not good indicators of unit sales numbers for the majority of books in the dataset. Here is my observed chart for average sales to maintain rank, along with the old and new numbers for hitting rank. More work needs to be done to fill in the upper brackets on the maintain side. I used the same numbers from my Sales to Hit chart when I felt I didn’t have enough data points on the Maintain side to chart new numbers in, but the safe assertion is that the Top 500 in my own data is over-reporting by a conservative 10%.
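To make the hit-versus-maintain distinction concrete, here is a minimal Python sketch. It assumes the observed 10-50% range can be applied as a flat discount to a sales-to-hit figure, which is a simplification. The function name and the 100-sales-a-day input are hypothetical and are not values taken from the chart.

```python
def maintain_sales_range(hit_sales, min_discount=0.10, max_discount=0.50):
    """Estimate the daily sales needed to hold a rank.

    hit_sales: daily sales needed to first hit the rank (hypothetical input).
    The discounts reflect the observed 10-50% range; treating them as a
    flat multiplier is an assumption made for illustration only.
    """
    low = hit_sales * (1 - max_discount)   # long-settled title
    high = hit_sales * (1 - min_discount)  # recently arrived title
    return low, high


# Hypothetical rank that takes 100 sales a day to hit
low, high = maintain_sales_range(100)
print(f"{low:.0f} to {high:.0f} sales a day to maintain")
```

Under those assumptions, a book that has sat at a rank for a while would need roughly half to nine-tenths of the sales a newcomer needs to reach it, which is why hit-rank multipliers can overstate units for settled titles.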
. . . .
Integrating KU into the reporting back in July dialed the difficulty of analyzing the data up into the stratosphere. Unread – and therefore unpaid – borrows influence rank across all titles. There’s no way to know how many borrows eventually become paid reads. And there’s no way to calculate how many units moved on any given title were at full price and how many were borrows, either paid or unpaid. Self-reported numbers suggest the split of paid sales to paid borrows is about 50:50 (which still doesn’t account for the unread borrows that inflate rank), which is what the AE Reports use as well. Using the Maintain chart above, I rejiggered all the numbers. The adjusted royalties may well still be inflated, but are, I think, a closer approximation. The difference for the dataset is a statistically significant 21.4% spread in dollars (or the $400 million difference between $1.81 and $1.42 billion per year):
- $4,957,365 – original AE result for all earnings
- $4,848,116 – AE results with the new modeling applied
- $3,895,691 – my adjusted estimate
and for the KU amounts specifically:
- $167,687 – AE results for borrows with the new modeling applied
- $144,201 – my estimate
- 252,161 – AE estimate for total number of KU units sold/borrowed using the Maintain calculations for Indies + Uncategorized
- 216,410 – my estimate
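The percentage spread and the per-year figures can be rechecked from the daily totals listed above. This is a rough sketch: the function names are illustrative, and the annual figures assume the one-day snapshot is simply multiplied by 365, which matches the $1.81 and $1.42 billion quoted but ignores any seasonality.

```python
DAYS_PER_YEAR = 365  # assumption: one-day snapshot scaled flat to a year


def spread(original, adjusted):
    """Fractional drop from the original figure to the adjusted one."""
    return (original - adjusted) / original


def annualize(daily_dollars):
    """Scale a one-day dollar total to a year (flat, no seasonality)."""
    return daily_dollars * DAYS_PER_YEAR


ae_original = 4_957_365   # original AE result for all earnings, per day
adjusted = 3_895_691      # adjusted estimate, per day

print(f"spread: {spread(ae_original, adjusted):.1%}")       # spread: 21.4%
print(f"AE annual: ${annualize(ae_original) / 1e9:.2f}B")   # AE annual: $1.81B
print(f"adjusted annual: ${annualize(adjusted) / 1e9:.2f}B")  # adjusted annual: $1.42B

# The same spread calculation applied to the KU dollar and unit figures
ku_dollar_spread = spread(167_687, 144_201)
ku_unit_spread = spread(252_161, 216_410)
print(f"KU dollars: {ku_dollar_spread:.1%}, KU units: {ku_unit_spread:.1%}")
```

The gap between the two annual figures comes to a little under $390 million, which rounds to the $400 million cited.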
. . . .
Since the AE Report looks at aggregated totals over individual sales and positions itself as one factor for authors to consider when deciding which path to publishing to pursue, I decided to see what each book averaged in each publishing path. There are pie charts below, but let’s also use words to be sure the picture is clear either way it’s expressed. If we look at gross sales, we see that the Big 5 had only about 50% of the number of titles available in the dataset that indies had. Big 5 books sold about 78% of the number of books indies sold and made more than twice as much. A lot of that goes into Publisher and Amazon pockets, but what does that really mean? The charts show that indie authors in aggregate earned about 25% more than Big 5 authors. In other words, it took almost 50% more available indie books to earn their authors 25% more than Big 5 authors.
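The per-title comparison can be sketched from the ratios quoted above. This is only an illustration: it uses the rounded "about 50%", "about 78%" and "about 25%" figures rather than the underlying dataset, the function name is hypothetical, and the result is sensitive to the title-count ratio chosen.

```python
def per_title_ratio(aggregate_ratio, title_ratio):
    """Big 5 per-title figure relative to indie per-title figure.

    aggregate_ratio: Big 5 aggregate total / indie aggregate total
    title_ratio:     Big 5 title count / indie title count
    Both inputs are rounded ratios quoted in the text, not raw data.
    """
    return aggregate_ratio / title_ratio


titles = 0.50                 # Big 5 had about 50% of the indie title count
units = 0.78                  # Big 5 sold about 78% of the indie unit count
author_earnings = 1 / 1.25    # indies earned about 25% more in aggregate

units_per_title = per_title_ratio(units, titles)
earnings_per_title = per_title_ratio(author_earnings, titles)

print(f"units per title, Big 5 vs indie: {units_per_title:.2f}x")
print(f"author earnings per title, Big 5 vs indie: {earnings_per_title:.2f}x")
```

With these rounded inputs, the average Big 5 title comes out ahead of the average indie title on both units and author earnings, even though indies lead in aggregate.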
. . . .
From the above, we can say that while market share may have eroded for the Big 5, gross sales plateaued between Jan and May. Losing market share is not the same as bleeding money. Besides, the ebook market – discrete from the general publishing market – is relatively new. The Big 5 were never part of that market until it became lucrative enough to play in, and only once indies were invited into the market did it start to burgeon. Not because of indies, but the timing is inseparable. Big 5 never dominated the market, and a few deviation points here and there don’t mean it’s losing the market. And while percentage charts are pretty to look at, they don’t always paint an accurate picture. Ebooks, for instance, have lured a certain percentage of customers away from the used-book market. The Big 5 were not in the used-book market before and their models don’t include that market now.