New Author Earnings Report

From Dean Wesley Smith:

It’s got a ton of stuff in it. And takes a vast amount of time to go through. But worth it.

And their new side-business they are only offering to big publishers is flat scary.

Horrifying, actually if they are doing it wrong.

And I got a hunch that unless they are pulling names, that new business is setting them up for more lawsuits than I want to think about. Because from my understanding, they are releasing personal sales numbers of writers to businesses who can get past their paywall.

. . . .

Data Guy, Hugh, tell me I am wrong here… Please?

I know in the free report you released some names and blocked some information and other names. I hope that every bit of data you release attached to a name is permission granted. Please, please tell me you are doing that…

Because behind that stupid paywall of needing ten million in sales, any of my pen names, my name, Kris’s name, or our numbers better not be out in public there. And how will I know? Let me think… I have been around this industry for forty years and have a lot of friends who will be glad to send me information about myself they buy from you.

Link to the rest at Dean Wesley Smith

PG is an admirer of Dean. And Dataguy and Hugh.

Each of them has contributed a great deal of information (without compensation) that has benefited all authors and indie authors in particular.

PG is a big believer in personal and business privacy for those who don’t affirmatively spread their personal business everywhere.

However, the first question which came to PG’s mind after reading Dean’s comments, having earlier seen the information Dataguy made available from his new data analysis, was this:

Who owns the information about how many books Dean (or any other author) sells and the prices at which those books are sold and the royalty rates Amazon and others pay to authors who self-publish with Amazon, Kobo, etc.?

Does Dean own the information because he wrote the books and owns the copyrights?

Does Amazon own the information because it reflects the number of books it sells on its own website?

Does Dataguy own the information because he gathers and aggregates it in ways that Amazon/Kobo, etc., don’t?

PG took a quick blast through the KDP Terms & Conditions which Dean and any other author who sells via KDP has agreed to. He did not find anything in the Ts&Cs that directly addresses the question of who owns the data respecting the sales and pricing of ebooks that Amazon sells. (PG suspects we can look for some clarifications on this topic in a future edition of the Ts&Cs.)

PG did find some terms that tangentially address data ownership, however. (This is from the version last updated on September 1, 2016.)

Here’s a first section – Customer Prices:

5.3.4 Customer Prices. To the extent not prohibited by applicable laws, we have sole and complete discretion to set the retail customer price at which your Digital Books are sold through the Program. We are solely responsible for processing payments, payment collection, requests for refunds and related customer service, and will have sole ownership and control of all data obtained from customers and prospective customers in connection with the Program.

The question here is, if Amazon has sole ownership and control of “all data” obtained from customers “in connection with the Program,” what does “all data” cover?

The prices the customers paid for ebooks and the numbers of ebooks they purchased would seem to be part of the data obtained from customers in connection to purchases of books offered for sale via the KDP program. If so, the author doesn’t own that data.

Is there an implied license permitting Amazon to aggregate and categorize this data? Those activities, although not expressly mentioned, are required if Amazon is to calculate and create royalty reports for the purposes of paying authors. Providing authors access to this type of information would also seem to be required under Paragraph 5.4.2 which says Amazon “will make available to you an online report detailing sales of Digital Books and corresponding Royalties.”

Since this is so exciting, let’s move on to 5.5 Grant of Rights.

You grant to each Amazon party, throughout the term of this Agreement, a nonexclusive, irrevocable, right and license to distribute Digital Books, directly and through third-party distributors,

. . . .

(e) use, reproduce, adapt, modify, and distribute, as we determine appropriate, in our sole discretion, any metadata that you provide in connection with Digital Books;

. . . .

In addition, you agree that we may permit our affiliates and independent contractors, and our affiliates’ independent contractors, to exercise the rights that you grant to us in this Agreement. “Amazon Properties” means any web site, application or online point of presence, on any platform, that is owned or operated by or under license by Amazon or co-branded with Amazon, and any web site, application, device or online point of presence through which any Amazon Properties or products available for sale on them are syndicated, offered, merchandised, advertised or described.

Metadata is not expressly defined in the KDP Ts&Cs. Paragraph 5.1.2. requires the author to make certain she/he provides correct metadata. Under Subparagraph (e) of 5.5 above the author permits Amazon the right to use, reproduce, adapt, modify, and distribute . . . any metadata it receives from the author.

Is pricing of the book metadata?

One definition of metadata outside of the KDP docs is “a set of data that describes and gives information about other data.”

PG has a hard time seeing that the price of the book is not metadata – it describes and gives information about what the royalty rate will be under KDP and provides a basis upon which the total royalty payable to the author will be calculated.

If pricing is metadata, Amazon can do almost anything it wants to do with pricing information, including distributing it to others besides the author.

One last KDP paragraph:

7 Confidentiality. You will not, without our express, prior written permission: (a) issue any press release or make any other public disclosures regarding this Agreement or its terms; (b) disclose Amazon Confidential Information (as defined below) to any third party or to any employee other than an employee who needs to know the information; or (c) use Amazon Confidential Information for any purpose other than the performance of this Agreement.

. . . .

“Amazon Confidential Information” means (1) any information regarding Amazon, its affiliates, and their businesses, including, without limitation, information relating to our technology, customers, business plans, promotional and marketing activities, finances and other business affairs, (2) the nature, content and existence of any communications between you and us, and (3) any sales data relating to the sale of Digital Books or other information we provide or make available to you in connection with the Program. Amazon Confidential Information does not include information that (A) is or becomes publicly available without breach of this Agreement, (B) you can show by documentation to have been known to you at the time you receive it from us, (C) you receive from a third party who did not acquire or disclose such information by a wrongful or tortious act, or (D) you can show by documentation that you have independently developed without reference to any Amazon Confidential Information.

PG didn’t find a specific provision where the author agreed that Amazon Confidential Information is the sole property of Amazon, but reaching that conclusion from the language above is a short step.

This provision limits what the author can do with information the author may receive from Amazon. The author is prohibited from disclosing any information regarding Amazon:

  1. Finances
  2. Business Affairs
  3. Communications the author receives from Amazon
  4. Sales Data related to the sale of Digital Books by Amazon
  5. Any other information Amazon provides the author with respect to the KDP program

This covers a lot of ground.

Taken according to its terms, the author is not permitted to disclose:

  1. Finances – Do sales and royalty reports the author receives disclose information about Amazon’s finances if the author shares them with others?
  2. Business Affairs – Is there anything Amazon does that isn’t covered by this term?
  3. Content of Communications between Amazon and the author – Are the contents of sales and royalty reports made available to the author communications? PG thinks so.
  4. Sales Data related to the sale of Digital Books – If Content of Communications doesn’t cover sales and royalty reports provided to the author, Sales Data certainly does. If Sales Data is Confidential Information the author can’t disclose, does that mean sales data regarding number of books sold, prices for those books, etc., is owned by Amazon? Again, there is not a specific agreement with respect to ownership of this data in the Ts&Cs, but, as between Amazon and the author, the author will have a hard time arguing he/she is the owner of the sales data.
  5. Any other information Amazon provides the author about the KDP Program – This provision is a catch-all for just about anything Amazon provides the author.

For any visitors to TPV experiencing shortness of breath, PG will point out the boilerplate exceptions to the definition of Amazon Confidential Information. Anything that falls into these baskets is not Amazon Confidential Information even if it’s described in the first part of Paragraph 7 as Amazon Confidential Information (aren’t contracts wonderful?).

  1. information that (A) is or becomes publicly available without breach of this Agreement,
  2. information that (B) you can show by documentation to have been known to you at the time you receive it from us,
  3. information that (C) you receive from a third party who did not acquire or disclose such information by a wrongful or tortious act, or
  4. information that (D) you can show by documentation that you have independently developed without reference to any Amazon Confidential Information

If anything in complex Ts&Cs is straightforward, the exceptions to Confidential Information are.

  • If other people know about it without you telling them, it’s not confidential.
  • If you knew it before Amazon told you, it’s not confidential.
  • If somebody besides Amazon told you and that person got the information without committing a bad act, it’s not confidential.
  • If you figured out on your own something that Amazon also told you, it’s not confidential.

So where does PG end up on this issue?

First, the standard disclaimers –

  • This is not legal advice; you obtain legal advice by hiring a lawyer (and hopefully paying a lawyer) and not by reading a blog.
  • PG could totally be wrong about this.
  • PG spends more time before he provides legal advice than he does before he makes a blog post.
  • PG might have missed a piece or lots of pieces of the KDP Ts&Cs that totally obliterates his reasoning.
  • PG typed this post without reading it.
  • PG could be high on Coke Zero and out of his mind.
  • Those monkeys in the corner of PG’s office might not be real.
  • Ditto for the aliens looking in through PG’s office window, one of whom looks like Jeff Bezos in disguise.

With the standard disclaimers firmly before you, PG thinks:

  1. Amazon probably owns and controls the data related to the ebooks (and other books) it licenses and sells to its customers.
  2. This data includes how many books it sold that are written by a particular author and how much money it paid to the author.
  3. If Amazon owns the data, it could release the same information that Data Guy publishes to the whole world if it wanted to do so.
  4. If Amazon owns the data, it can share as much of the data as it wants to with third parties, including Data Guy, subject to whatever limitations it places on Data Guy’s use of the information.
  5. In their disclosures of information, both Amazon and Data Guy should be sensitive to the privacy issues of authors even if they are not contractually required to do so.

One of the aliens just brought a pizza into PG’s office as a sign that aliens want only peace. There is no spinach or canned tuna on top of the pizza, so PG will close for now.

114 thoughts on “New Author Earnings Report”

  1. This is an example of the power of data aggregation, big data if you want. I may be wrong, but I believe Data Guy only collects public data, data that is in the public domain and freely available to anyone. He takes this data, massages it together, and performs calculations that reveal facts that in another era could only be found in private account books, or were impossible to obtain at all.

    This can be scary. We’ve probably all heard the story about how Target predicted a pregnancy from a customer’s buying patterns and embarrassed a young girl by sending her ads for diapers and bassinets before she told her parents.

    Data Guy’s information is an example of the same thing. If you don’t want your information captured, quit selling through outlets like Amazon who reveal your data through stuff like rankings and product placement. Don’t complain if somebody like Data Guy can figure it out. If you like the way Amazon sells your books, you have to accept that the information is public.

    Of course, if Data Guy has actually dipped into private data, he is liable, but if he is just making clever inferences based on computing power and public data, he has done nothing wrong.

    I realize this is tough, but we’re in the 21st Century now and things have changed. Discomfort with the data that aggregation reveals is understandable, but you have to take the good with the bad.

      • If I were DG, I would worry a lot more about Amazon trying to get him for hacking than authors out to get him for invasion of privacy. I think DG has said that Amazon has no issue with what he does.

        • But he wasn’t selling a derivative of that data. And scraping the rankings and metadata a couple times a year is different from a realtime/daily feed.

          I tend to wonder how long Amazon will consent via inaction to this. If the currently annoyed authors take their grievances to Amazon there might be some quick clarification.

          Decidedly interesting questions at issue.

          • I agree that DG has changed the game by offering a paid service and more frequent scrapes. If Amazon decides to try to stop DG, it will be interesting. The legality of web scraping is still murky.

    • Wouldn’t the information be just as useful using numbers instead of names? That is, not listing the author names AT ALL, but the rank of 1-50 or 1-25, etc, and whether they were indie/small press/Big Five. That’s the useful part, no? Would that satisfy the authors who feel they have had privacy violated or who see ethical issues?

      • Numbers would not be as useful if you are trying to negotiate with a particular author or publisher where author identifiable sales history could be of considerable help in determining the value of rights being discussed.

  2. I understand why people are upset, but it does seem to be the world we live in now with the data out there to be discovered by anyone with an interest. Anyone who is in the public eye gets their income published online. This is like the Forbes list. (And could actually result in some indie authors ending up *in* the Forbes list for authors.)

  3. You do not have any “right to privacy”. People talk about it, but no such right exists. The best these offended parties can hope for is a tort for “disclosure of information”, but even there business concerns (which a self-publisher is) receive less protection and public figures (which most of these authors are) even less.

    • A right to privacy is recognized in the UN Declaration of Human Rights and in other international agreements and treaties, but not explicitly in the U.S. The U.S. constitution interdicts unauthorized government search and seizure, but that is not what most people think of as privacy today. Slander and libel laws are intended to prevent malicious broadcast of information, but are hard to prove, especially, as you point out, for public persons, because truth is a defense and malicious intent must be proven.

      In case my excess verbiage is confusing, I agree with DaveMich. 🙂

      • What about the legalese of, “Expectation of Privacy”? I’ve heard that used before to mean that if you take steps to keep things private, and you have a reasonable expectation that it would be, then it is private. That is just supposition on my part. Any thoughts?

  4. Based on that analysis it seems that the ones in breach of a contract are the ones who shared their sales/borrow data with Data Guy to then allow him to accurately calculate how the publicly available rank information correlates to number of sales and borrows and then to revenue. No way to correlate ranks to an estimated number of sales without someone providing that information based on their own experience.

    But now that that cat is out of the bag…

    I don’t see how Data Guy can be sued for aggregating what is public and selling that aggregated data. It seems to me Amazon is the only one in a position to go after the authors who shared their sales data with Data Guy and they have no apparent inclination to do so. Unless the authors who shared their data did so under a contract with Data Guy that would cover this? (Which I highly doubt.)

    • If the big publishers are on-side with Data Guy, I suspect they are supplying at least some of their sales data (which should make estimates off sales rank a lot more accurate).

      Big publishers won’t be on the same terms as KDP for data usage, and should have access to something a lot more substantial than an indie’s author dashboard, probably with a dedicated data interface rather than scraping web pages.

  5. I don’t think Amazon or DG (why are we still protecting his name?) are going to be able to just wave the TOC at this and make it go away.

    This is just getting started.

  6. Everyone has the same access to the data. I could manually scan each Amazon book page and capture all the raw data. There is no law against it. It is all right there in the HTML code. What DG does is analyze that data based upon informed assumption. Again not against the law.

    It would be different if he was obtaining private data, but he isn’t. He is obtaining public information.

    It is the same as if I put the phone book (remember those?) into a database and used that information to draw conclusions.

    • At what point does data, even exploited data, become business information?

      And when does a TOC that reads “We own everything” cross the line? Does Apple give all of our facial ID’s to the NSA for a price because they declare they can in the TOC? Does 23&Me sell our DNA to the FBI for a price under the cover of a TOC they wrote themselves?

      If DG can exploit book data and sell it for a price that’s out of reach of 99% of the people he is obtaining it from, is this an unfair business practice?

      If Randy/Penguin gets my data for a price, can I get theirs for a proportional cost based on my income vs theirs?

  7. As I understand the situation the TOS is essentially between Amazon and the Author/Publisher. Whilst it may establish that the Author has given the rights concerned to Amazon and so has no cause of action in respect of its use, it does not establish the same for DG. DG seems to have used public information combined with information contributed voluntarily. It seems that all or at least the vast majority of the information used in the report is available in various places on Amazon’s websites. The software DG uses simply gathers and aggregates this information. It would seem to be an almost impossible task for Amazon to prevent this whilst maintaining the existing level of functionality and convenience for authors and customers.

    I am also a fan of Dean and Kris and of DG and Hugh. The fact that the AE report has reportedly been changed online to blur further names and possibly sales figures may well mean that DG is responding to requests for privacy at least in the publicly available service. As for the paid private service? Credit reporting agencies seem to report such personal financial information with impunity every day.

    I expect Dean’s anger to cool and the matter to be resolved amicably without legal action, despite such action likely being futile on Dean’s part.

  8. Thanks, PG for the great look at this issue. Very well reasoned, even with aliens looking over your shoulder. (grin)

    I do understand the information is “public” sort of. And I have always admired Data Guy’s ability to go get this information to help us all. I have praised him many times.

    And as a business person, I have no issue with a person trying to figure out a way to make a buck on their hard work and their knowledge. I admire that, actually.

    And as I have said a number of times today, just because a person can do something (like scrape this information) doesn’t mean they should. But still, used correctly, I have no issue and have been surprised that up until now Amazon has had no issue with this. (They might now since it is their publishing arm’s numbers and their authors whose numbers are being released to their competition.)

    Where my problem with this comes in has to do with the fact that each indie author is a business. Yes, I know a lot of indie authors don’t want to think of themselves as a business, but we all are.

    And business information, especially financial, is a very important aspect of any business and staying in business, from making deals to getting loans to attracting new clients. And that kind of business information is owned by each business and released as needed only to parties that need the information.

    Data Guy, even if what he is doing is legal on the face, has decided to take all of our personal business information that he has the ability to find and release it to companies who can afford to pay.

    To me that crosses a line from trying to help out publishing (and make a buck) to selling business secrets. This information will actively hurt many, many indie and small press writers in so many ways.

    I do not want any of my information sold, or any information about my publishing companies or my pen names. Yet that is what Data Guy is going to do without my permission. He is selling all my information and all of yours as well.

    Passive Guy. I know your wife is a top romance author. How does she feel about all her sales and income information suddenly being available to anyone with a few bucks?

    So it is no wonder so many writers who understand business are up in arms about this. Got a hunch, as someone said, this is far from over.

    But Data Guy and Hugh, it is not too late to pull back and at least give all of us who don’t want information out there a chance to opt out. Use our general information with no names attached.

    What is funny about me being sort of out front on this is that my books per title don’t sell well on Amazon. I make most of my money from my books in all the places Data Guy can’t get to, such as movie options, overseas sales, secondary markets, kickstarter projects, book funnel sales, sales through eBay, sales through our own B&M stores, and so on and so on. But I still don’t want that information on my 400 titles out there.

    So thanks, PG on the detailed look at this. Wonderful.

    • And that kind of business information is owned by each business and released as needed only to parties that need the information.

      Looks like DG used Amazon business information that Amazon made public.

    • I do not want any of my information sold, or any information about my publishing companies or my pen names. Yet that is what Data Guy is going to do without my permission. He is selling all my information and all of yours as well.

      It is not your information. It is Data Guy’s information about you. Just because it is about you does not mean you own it. Data Guy owns it, and he can sell it if he likes. It’s called “market research” and “competitive analysis” and it’s all perfectly legal. Deal with it.

    • Dean,
      Thanks for your leadership on this. I agree with you completely. They had no right to publish our names and data without our permission or to sell and profit from our data, especially when we have reason to believe he’s WAY off on a few things. Just because you CAN doesn’t mean you SHOULD. Even businesses have a right to keep their proprietary information private from competitors.

    • Thanks for your comments, Dean.

      Privacy in an online era has been and will continue to be contentious for quite a while. There are discussions and fights about this topic happening in many countries throughout the world. Revenge porn is one of the more recent developments in this area.

      The twist for authors is that their names become highly important marketing tools. A lot of people want to read the next Dean Wesley Smith or Marie Force book as soon as it’s available.

      The titles of the books change, but the author’s name is the constant for marketing and promotion purposes.

      Some might suggest using a pen name to obscure an author’s real identity. Done with a reasonable level of care, such a strategy can work for a new author with a strong desire for privacy.

      However, for mid-career authors, this is more difficult. For one thing, a lot of their books exist in their real name. For another, using a new pen name without disclosing your identity puts you back at the beginning of your career and wastes the cumulative effect of all of the hard work you spent building your brand to reach mid-career success.

      The right to be forgotten is a theme in some privacy discussions and legal bases have been developed for it in the EU. However, that right is essentially to have public references to an unfair connection to a terrible past event removed from some databases.
      A teenager who was involved in a highly-publicized crime but who has grown into an exemplary adult with 30 years of sterling behavior is one example of a potential beneficiary from a right to be forgotten. The idea is that the public record will be cleansed of information connecting his name to the crime.

      However, in this instance, you don’t want to be forgotten and you do want to be connected to the books you’ve created during your career. You don’t want to be connected with the income you’ve acquired and continue to acquire from your books.

      I wish I had a good answer, but I don’t at the moment.

      • I guess Forbes Magazine needs to stop giving us the annual lists of the richest people in the world, then.

        And of course, every bestseller list must go.

        Heck, let’s just redact every author’s name, everywhere. I’m sure people will get used to looking for your books under Author #2678456671.

        • People who aren’t affected tend to have a careless ‘so what’ attitude. But what happens if or when it DOES affect you?

          First they came for the Socialists, and I did not speak out— Because I was not a Socialist.

          Then they came for the Trade Unionists, and I did not speak out — Because I was not a Trade Unionist.

          Then they came for the Jews, and I did not speak out— Because I was not a Jew.

          Then they came for me—and there was no one left to speak for me.

  9. Thanks PG, this was my favorite part: “PG could be high on Coke Zero and out of his mind.
    Those monkeys in the corner of PG’s office might not be real.
    Ditto for the aliens looking in through PG’s office window, one of whom looks like Jeff Bezos in disguise.”

    On another note, I don’t think one can predict how this all will shake out, from ANY side. As someone upstream said, it’s the beginning.

    What I heard/read/saw firsthand was that a lot of work went into this report, and that in the time since it was ‘partially’ released recently, certain names and sales amounts that weren’t originally blurred out were blurred out, until, to my mind, that part of the report gave no info of use to publishers or to authors who are interested in who/what numbers are ahead of them, genre of those authors, etc.

    I saw that this hard work has also had the boon of becoming ‘a product’ to sell to publishers and perhaps agents for a price, no longer sharing the entire reports with authors and other interested parties who have been early supporters with praise and encouragement.

    I saw something that we’ve seen countless times before wherein a company that was home grown, wants to/ is invited to vault into a bigger pond. There is no blame to that at all; all power to that roll of the dice.

    But there is often a sacrifice of openness and sometimes personal relationship too, in order to make ‘the product’ seem/be more valuable to the targets who are no longer the grass roots. Has happened with softwares, apps, access to libraries online, aggregations like OED online, journals, and literally thousands of other once open, easily affordable, and free products.

    ‘The leap to larger’ is a path that often has sinkholes. Going from serving the grassroots, the friendships, the support, validations, dense feedback to help shape the product, and creative support of early users/supporters: all those can be left in the dust in even a thoughtful process of maximizing one’s now better formed, or well formed, ‘product.’

    Medical records under our current law are confidential unless the patient signs a release for specific releases to specific agencies. Privacies of certain kinds are guaranteed by law. Including communications between licensed helping professionals and clients. And while it is true, that a lot of the current cold cocked culture plays fast and loose with many things, including people’s privacy and that not all aspects are legally protected, since forever, nonetheless: ::::ethics has been higher than the law::::.

    It just depends on what persons want for human relationships vs money opptys. Sometimes both can be accommodated well. But it would mean not demoting either. It’s a hard path to follow for it often takes as much time to support friendships and allies as it does to create a project… esp considering that once one unveils a good project that is a go, soon others who see the $ outcome gear up to rapidly compete with same and similar, only better, more insider, more accurate, faster and cheaper.

    I wish everyone luck, and Dean and Kris, to my .02, ought have their preferences considered and brought to equitable conclusion, and others who wish same, if one is trying to balance support from early helpers, valued relationships– with oppty to make new $.

    That info may be public if one wants to pick the meats out of the walnut piece by piece, or make a crawler/scraper to do so, to me isn’t the point.

    Out here, we say no matter which stallions’ ‘contributions’ sell for the most $, if you kick the wranglers that helped to breed, raise, groom, heal and feed those horses, you’ll soon have no one to help you in good will with the next foaling and raising. This means that the ranch/ feed/ vet/ vast community of relationships are not just between the humans from the different layers of a successful ranch, but also deeply tied in heart to the horses, and their greatest potentials, as friends and also as working horses.

    Who is the horse and who is the human? I think most of us are both… in need of others, and also wild creative beings

  10. My first thought when I heard of the new business was: cool.
    The second was: how are the BPHs going to use the data they buy?
    Will they identify rising Indies and target them to try to draw them in?
    Will they Identify Popular Books And Series And Dissect Them For their “the same but different” acquisitions?

    Hollywood has from time to time to time replicated successful movies from abroad. Sometimes directly and paying but on occassion less directly and without paying.

    The BPHs’ track record does not reassure me.

    Interesting Issues, as I said.

  11. I think the opt out would help us too. We consider sales numbers, topics that sell best, and outlets we partner with, proprietary info that helps us to continue and to create/expand AWAY from the bigger entities, which we already have challenging relationships with over past publishing projects.

    Dean WS put it well: At our little publishing company, we’d concur. It’s back to that idea of scraping in order to feed the very machine we want to not be a part of nor give any advantage [again] over us.

    “Where my problem with this comes in has to do with the fact that each indie author is a business. Yes, I know a lot of indie authors don’t want to think of themselves as a business, but we all are.

    “And business information, especially financial, is a very important aspect of any business and staying in business, from making deals to getting loans to attracting new clients. And that kind of business information is owned by each business and released as needed only to parties that need the information.

    “Data Guy, even if what he is doing is legal on the face, has decided to take all of our personal business information that he has the ability to find and release it to companies who can afford to pay.

    “To me that is over a line of trying to help out publishing (and make a buck) to selling business secrets. This information will actively hurt many, many indie and small press writers in so many ways.

    “I do not want any of my information sold, or any information about my publishing companies or my pen names. Yet that is what Data Guy is going to do without my permission. He is selling all my information and all of yours as well.

    “…So it is no wonder so many writers who understand business are up in arms about this. Got a hunch, as someone said, this is far from over.

    “But Data Guy and Hugh, it is not too late to pull back and at least give all of us who don’t want information out there a chance to opt out. Use our general information with no names attached.”

    • Hmmm, I guess one could ask Amazon to remove their book rankings and remove their books from rankings if they don’t want people to see how well/poorly their books are doing – but that might make them harder to find (I remember seeing something about a writer being upset that Amazon had removed their rankings …)

  12. PG, given this analysis, what do you and your monkeys have to say about Data Guy selling this data, at the author-identifiable level? (And that’s all of us, folks, not just the top 50.) (h/t to kboards for identifying)

    And to my fellow authors, how do you feel about this information being available only to companies with annual revenues of $10M or more? (e.g. not indies)

    • If you scroll down to the bottom right of bookstat, the links to Privacy Policy and Terms & Conditions lead nowhere.

      While the rest of the site/web page that sells the data is mighty fancy/detailed. That shows what the priorities are.

      Question: How was it determined that bookstat = Data Guy?

    • Debora, I liked the ideas you shared in a comment on Dean’s blog:

      To be clear, I think Data Guy has the right to monetize this. Carefully.


      There’s incredibly useful analysis that could come out of this data, particularly drilling into genre, price, seasonal and daily sales fluctuations, optimal release strategies etc etc etc. In aggregate, and available to everyone to buy…

      Yes. And combine that with Dean’s suggestion—

      …let indie authors opt out of using their names. Use the data, sure, as part of the overall patterns which is valuable…

      —and then the product becomes something that is useful, respectful, monetized, and ethical.

      • Thanks, J.M. I’m the biggest fan of data there is. I’ve used Data Guy’s data to make key business decisions. I deeply understand the value of the data, but there are some really important access, ethics, and transparency questions that need to be answered here.

        I suspect, however, that this may be taken out of indie hands very quickly. I’ll be shocked if Amazon and/or some of the bigger publishers don’t move on this – and that may lose us this data source altogether. Which would be a really crappy outcome.

        • My worry exactly, Debora. And I also agree with much of USAF said.

          This might be taken care of very quickly in other ways since this is exposing authors in Disney and the Amazon imprints, and all their personal information as well. Not two companies I would want to go directly against if I had my druthers.

          Everyone is saying this is technically legal because it’s all out there. I am not disagreeing with that. And I could sit on the street outside a house and tap into the house’s wireless and get the occupants’ personal information as well. All legal because I would be on a public street. And I suppose as many do, I could sell it and hurt the people in the house.

          I just thought, and still do, that Data Guy and Hugh were above doing that. They could make great money in this without ever exposing anyone to damage. This is just the wrong direction in a number of ways and I hope they make a quick shift.

            • Randall, from the Bookstat website. *Emphasis* mine.

              “From the largest Big Five trade publishers down to the scrappiest garage micropresses, *to sales from Amazon’s in-house publishing imprints* and format-dominating Audible Studios to J.K. Rowling’s Pottermore — data that you’ll find nowhere else — even the sales of individual self-published authors: it’s all right there, live at your fingertips, ready for you to ask it the questions that drive your business.”

                • Go to the bookstat website. Enlarge the screenshot of the dashboard. It shows top publisher income for romance (US only), Apr-Sep. Of the top 10 publishers visible, two are Amazon imprints.

                • I’m trying to determine if he’s doing this with Amazon’s blessing. If he is indeed showing the data from their imprints, that makes me feel that he is not.

                  So, how is Amazon going to feel about him selling their data, pulled from their website, to their competitors?

                  Technically legal? Okay. But I bet he’ll get a response from Amazon if he hasn’t reached some form of agreement already.

                • My suspicion is that Amazon was well aware of his actions as Author Earnings. But that was all aggregated and free. This is both highly specific and identifiable information, and not free. That crosses into entirely different galaxies of concern.

                  I just can’t fathom any universe in which Amazon is happy about this. And they don’t need for it to be illegal to shut him down.

                  I also imagine this might not be all that healthy for him as an Amazon author, although the relative earnings from his books and from selling our data might mean that the financial implications are minimal. It’s possible that having agreed to the Amazon TOS as an author restricts what he can legally do as Bookstat, however. I will wait for the real lawyers to weigh in on that one.

          • I agree with you, Dean. Also, technically legal and something Amazon is willing to let continue are two really different things…

            I also think the Bookstat team may end up having a very large problem because they are representing this as “live sales data”. They claim to track 96% of ebook purchases (and compare to two services that I believe do track actual sales, if incompletely.) They claim to track ebook sales, from Amazon, for example. They make statements like this: “Bookstat is the only industry data service that tracks all online book purchases at the retail point of sale regardless of publisher type.”

            That sounds like actual sales data, but that’s not what their data really is. It’s a guess. A very sophisticated one, but a guess. But by *representing it* in language that makes it sound very much like actual sales data… I’m not a lawyer. But I’m guessing a lawyer could turn that into a problem pretty damn quickly. Both because that data actually is private (or at least, if I understand PG correctly, owned by Amazon), and because claiming you have sales data when you don’t is misleading – and it’s misleading with some large and potentially annoyed brand names involved.

            • Both because that data actually is private (or at least, if I understand PG correctly, owned by Amazon),

              Two critical data elements for this type of analysis are retail price and rank. Amazon posts both to the whole world.

              It doesn’t matter if both pieces of data are private or owned by Amazon. When Amazon broadcasts them to the world, they become public information. Amazon makes no attempt to keep either confidential.

              • Terrence, I don’t think I’d want to be on the receiving end of a flotilla of Amazon lawyers making a different argument ;).

                But in the end, it doesn’t matter what a bunch of armchair lawyers think is legal. There are other ways to kill Bookstat’s data collection dead.

          • Didn’t somebody (Bookbub?) get in trouble for using metadata from Amazon’s pages to sell books on their own site? Would selling data based on info posted on Amazon’s pages be any different?

            I’m not sure that authors have a legal claim to it, but it sure seems like Amazon would. And just like any blog post or photo, although it’s publicly available, that doesn’t mean others are free to monetize it. Even though it’s aggregated, it’s still derived from their information.

            And as you mentioned, if Disney and such take issue with this, I can imagine that a cease-and-desist letter from Amazon would soon follow.

          • I disagree with your analogy. It is against the law to enter someone else’s computer and obtain information. It is not against the law to go to their web page and capture information about them. The information has been broadcast by Amazon. That is very different than breaking into my computer.

      • I’m not a fan of opt-outs.
        The uninformed invariably end up on the wrong side of those!
        Opt-ins are preferable for personal data.

        I see nothing wrong with publishers seeing the performance of their titles and imprints. Or any competitor willing to open the kimono.

        Or the general trends and eddies of the industry or its various sectors. But tracking somebody else’s day to day performance? That’s what the Russians and Chinese do. And those are considered hostile powers.

    • I think Amazon owns the sales data, Debora, and has the legal right to permit or not permit others to access that data from Amazon’s public-facing site or its non-public data repository.

      • Very interesting, thanks PG! So that sounds like Bookstat will require Amazon’s permission to do what they’re doing.

  13. So, if I’m understanding this correctly, DG has data on every book and every publishing house…for a price.

    How much for all data from every Amazon imprint?

    Or is that not available for some reason?

  14. Anyone can crawl Amazon’s public pages and make a good stab at back-working that creator’s profits using one’s own profit per category/author rank. I could do it now for competitive intelligence purposes. I’ve written crawlers that chunk Amazon; the real problem is that Amazon dislikes you doing it and makes it very hard to do (and changes code regularly to break your crawl). They can also ban your IP addresses if the fancy takes them and kill your business dead. If J.B. decides someone making his secret sales sauce transparent is not good for Amazon, it’s game over.
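
    To make the "back-working" idea above concrete, here is a minimal sketch of how a public sales rank and list price could be turned into an estimated daily income. Everything in it is an assumption for illustration: the calibration points are invented, the log-log interpolation is one plausible choice rather than Data Guy's or Bookstat's actual method, and the function names and royalty rate are hypothetical.

```python
import math

# Hypothetical calibration points: (sales rank, known daily unit sales).
# These numbers are invented for illustration; a real estimate would need
# actual sales figures from authors who know their own rank and sales.
CALIBRATION = [(1, 5000.0), (100, 500.0), (10_000, 50.0), (1_000_000, 0.5)]


def estimate_daily_sales(rank):
    """Estimate daily unit sales by interpolating linearly in log-log space."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    points = sorted(CALIBRATION)
    # Clamp ranks outside the calibrated range to the nearest known point.
    if rank <= points[0][0]:
        return points[0][1]
    if rank >= points[-1][0]:
        return points[-1][1]
    for (r0, s0), (r1, s1) in zip(points, points[1:]):
        if r0 <= rank <= r1:
            t = (math.log(rank) - math.log(r0)) / (math.log(r1) - math.log(r0))
            return math.exp(math.log(s0) + t * (math.log(s1) - math.log(s0)))


def estimate_daily_revenue(rank, price, royalty_rate):
    """Estimated author income per day: units * list price * royalty share."""
    return estimate_daily_sales(rank) * price * royalty_rate


if __name__ == "__main__":
    # A book at rank 1,000 priced at 4.99 with an assumed 70% royalty share.
    print(round(estimate_daily_revenue(1000, 4.99, 0.70), 2))
```

    The result is only as good as the calibration points, which is why such figures are estimates rather than actual sales data.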

  15. Yeah, I can’t see how scraping information that is publicly available from major retailers, extrapolating it, and presenting your findings would constitute a breach of either business ethics or the law. Reports like this are standard in the business world. We get offers to buy reports that purport to inform us of our competitors’ numbers all the time in my day job industry. It would only be illegal if the company putting together the report hacked into the computers of our competitors to get it.

    Now, publishing something like that on a blog post, in a community that you are intimately involved with, and whose members gave you the information (illegally) to start this whole data scraping thing in the first place? Yeah, I can see how that pissed some people off, and I can see why DG chose to redact that information (however ineffectively) from the Author Earnings website.

    But the fact that he is monetizing his scraping data, and giving specifics? Too bad. That’s life. The only thing he couldn’t do would be to give the actual sales data he received from specific authors to tune his algorithm. Those authors may have a legitimate legal action if their names and numbers are included in his report.

    • But Data Guy’s not getting publicly available information, he’s got bots going through every inch of Amazon finding out stuff that Amazon basically tells us we aren’t allowed to know. Geesh, it’s not like he’s looking at every book’s sales page, taking the ranking at the moment and putting that before those earning more than $10 million. It’s a lot more complex.

      I personally want Amazon to bite the head off this snake and end it all, soonest. This kind of thing makes me sick, and to see people defending public release of people’s earnings without their written permission is beyond the pale. We do sign contracts saying we won’t discuss or release any data about earnings.

      • “he’s got bots going through every inch of Amazon finding out stuff that Amazon basically tells us we aren’t allowed to know.”

        Except, you know, he’s not. The only thing the spiders (bots) are doing is collecting the ranking and the book’s price. That Amazon publicly posts. The rest is an extrapolation based on information that authors give him. He’s not hacking into Amazon’s servers and accessing private data. He’s guessing based on the information Amazon gives out for free, and his own knowledge of how the system works, using data authors (and now big publishing companies) freely give him to extrapolate the rest.

        Calling that a violation of privacy is like trying to complain that someone figured out you went to the beach over the weekend because you got a tan and are complaining out loud about sand in places it doesn’t belong. People are free to put together whatever information they can about you based on things they and others can observe about you in public spaces. Then, they are free to do what they like with that information, within the confines of libel and slander laws.

      • “We do sign contracts saying we won’t discuss or release any data about earnings.”

        And on that note, you have a really good point. Indie authors who give their data over to DG may have violated their Amazon terms of service. But that doesn’t mean DG violated anything.

  16. I don’t really see how, in an internet age where someone can type your name into Google, likely find your address, phone numbers, and click on an overhead map of your house and backyard, not to mention run a credit report on you, etc., that someone making available how many books you sell (publicly) is the bridge too far.

    Is it ethical? Yeah, I think so. If, for some reason, you want to keep all your sales info secret, you can sell all your books through your own website and directly to shops. But providing data on how your sales compare to other sales on Amazon, and compiling that with other public info, seems legitimate.

    Is it in the public interest? If it’s accurate, I believe truth wins out. Yes, only big companies will be able to pay for some of the information, but much of it will leak out. It’s relevant to other writers, and even the public, as to what’s popular. I personally get more upset by writers who dishonestly try to inflate their numbers and this might discourage that.

    What if it isn’t accurate? Well, if you’re sure it isn’t accurate, then say so and explain why. I buy what Dean is saying about how the info on his books might not reflect all his potential income, but okay, it still doesn’t mean that their stats on what is selling big across various platforms aren’t correct. Or close enough to be of value to those who buy it. In this case, it’s mostly up to the buyers of this information to worry about its accuracy. If it is consistently inaccurate it won’t last very long.

    As for whether Data Guy or Hugh should know better because they’re the “good guys,” if someone can do it, someone will. So better that “good guys” are doing it than “bad guys.” (Cause the good guys are sharing some valuable info for free.) But if this works, there will be others who do it too who might not bother to share information or worry about blurring anything.

    • “But if this works, there will be others who do it too who might not bother to share information or worry about blurring anything.”

      That’s just it. He *is* sharing it without blurring anything. And he’s *not* sharing it with indie authors (minimum of $10M in annual revenues required to be a client of his corporate data service.) We’re getting some very limited looks at a huge data set. His paying clients are getting a dashboard that lets them drill down into the whole thing. And he’s selling it using language and branding that makes it very much sound like actual sales data.

      What the ethics of this are, are up for debate, along with the legalities and what Amazon might decide to do in response. But the facts seem pretty clear at this point. He is selling individually identified author data on *all* of us that we can’t see, can’t buy, can’t verify, and can’t ask to have removed.

      Dear fellow authors – how do you feel about that?

      • He is sharing it with us. He’s said he will continue to do Author Earnings reports twice a year. An individual author can already track their own daily sales. As for trends, the twice a year report sounds good. I think he shouldn’t share names though, even if it is legal.

  17. I’m not sure what the fuss is. It looks like a market analysis service is being provided by DG, using publicly obtainable data. Every mature industry in the world uses market analyses. The degree of granularity may be alarming, but if it’s public, it’s public. You might as well complain about your sales rank being made public.

    • I think you hit the nail on the head right there, Gene. The keyword is “mature” and the self-pubbing industry clearly is not. Yet.
      If anyone thinks that DG is the only person with a spider getting information out of Amazon and/or other marketplaces, I don’t know what to say to them. Any business worth its salt will have people (entire departments) dedicated to studying their competition. That’s called market analysis and it’s usually done using public data. So, any publisher who considers the self-pubbers competition to them will have people working at it already. They will have estimates of who is earning how much. Those estimates might be worse or better than what DG provides, but the point is they know. Because, yes, the sales ranks are public.

      People might as well take their books off the Zon and sell them on their sites. That could give them a bit more privacy.

  18. I tend to disagree with DWS about most things, but on this, I’m 100% with him.

    Data Guy established himself as a friendly in the indie community. We didn’t mind him collecting information and sharing it with us as a goodwill gesture to help the indie community out. But now, he’s turned around and sold that information–in greater detail, it appears, than he’s sharing it with us, and specific information about individual authors rather than keeping it anonymous–to our direct competition, companies that make up a part of the industry that would cheerfully eradicate us (in a business sense) if they could, and they fully intend to use DG’s data to attempt to do this. And as if that weren’t enough, DG has hidden most of that data behind a paywall so huge that no indies can access any of it, so it becomes the sole weapon of our competition.

    Whether or not he can legally get away with it is entirely beside the point, PG. DG has been a friend to the indie community, as far as most indies are concerned, and now he’s literally working for the enemy, selling specific information about us, without our permission, to people who want to destroy us. Why would anyone wonder why so many indies are upset? It’s not about legality. (Well, it could be, but that’s not my primary beef, and not the only issue.) It’s about trust.

    • The qig5 don’t want to ‘destroy’ you, they just want to jump on your bandwagon if you look like the next ‘best seller’. And we keep seeing writers offer their works to publishers so they can claim they are really ‘published’ (unlike those indie/self publisher types that are merely making good money on their offerings).

      What can they actually ‘do’ to you? They can try to offer you a contract – which you can accept/counter/ignore. They can try to pick out what makes your offerings sell and try to get their pen of writers to write the same type of stories – and be late to the party yet again.

      They might dilute the market you’re selling in, but they can’t ‘destroy’ you.

      • By “us” I didn’t mean individual authors. I meant the indie publishing industry. They very much want the indie publishing industry gone and them back in full (or almost full) control of what books get published and bought.

        • But that’s just it, their sword can’t cut this blizzard – the individual snowflakes simply pass around it. To kill the indie writer (or at least keep them from reaching their readers) they’d have to kill the internet (or create some mighty tall walled gardens.)

          Without new laws on the books – laws that would actually hurt them more than it would the indie – they can’t. If they could stop Amazon (and other sellers of indie) they would have by now. They can stick out their mittens and see if any of those individual snowflakes will stick to them, but they can’t grab too many at a time (and the tighter they squeeze the more that will get away!)

          Nothing is stopping trad-pub from doing their own look-ups – or paying someone else to do it. Indie writers are turning hobbies into profit, so is DataGuy, there’s no real difference. As far as naming names, is it any different than Amazon naming their top 100 books ‘and’ their authors?

          If someone doesn’t want their name showing up when their books do well then I hope they were selling it under a pen name …

    • Not sure if all this might fall under the price-fixing/anti-trust laws rather than data privacy. I used to work for a large software firm and we had to be very careful about what data we obtained about our competitors’ wages and financials not found in annual report filings.

      • Not sure if all this might fall under the price-fixing/anti-trust laws…

        I’m wondering about this as well.

        If I, as a private individual, go around to all the grocery stores in my area making notes on the prices of the items I buy, so that I can shop cost effectively, that’s fine.

        But I believe that if an employee of, say, Safeway were to do the same thing, with the intention of the recorded data being used by the local Safeway to set their prices, that would be illegal.

        (Note that I am a layperson with no expert knowledge on the above. But I do wonder.)

        • It’s called “mystery shopper” and it’s been a thing since the 40’s.

          Companies supplying the shoppers proclaim their service is mostly for companies to self-evaluate. The reality is the big money is in competitor assessment.

          Odds are Whole Foods is seeing a boost in sales from all the mystery shoppers flooding their stores. 😉

          • It is, by the way, a vehicle for a pernicious scam.

            You are recruited as a Mystery Shopper and assigned to go to a Western Union desk and evaluate their performance. You are given a large check to send somewhere. You keep some attractive percentage of it. Some time after your evaluation you discover that the check is bad and you are screwed.

            Do not accept money from someone only to forward it elsewhere, no matter what the circumstances. It is invariably a way to part you from your money.

            • Most “no skill needed” jobs are scams.
              Most honest jobs require at least minimal training/experience. And for the few that don’t…
              … “The robots are coming!” ; )

  19. I’m an author myself, a huge privacy advocate (to the point where my friends consider me a conspiracy theorist), and I think authors need to take a deep breath, calm down, and step back from this outrage against DataGuy.

    First, this is the problem with the masses. When it comes to privacy issues, everyone gets outraged only when it’s staring them in the face and suddenly they realize it DOES affect them personally. In fact, why anyone still thinks any info online is private is beyond me. But we’ll let that pass too.

    The thing to realize is that what DG is doing is not new. Data gathering is being done by a lot of companies, government orgs, and God only knows who else. Market research companies do this. DG isn’t even the only book-related company that does this. K-lytics sells the same info on books.

    For people getting outraged, I understand where you’re coming from, but think twice about this: DG is our friend, has always been. While that may change, at least so far evidently his intent is to help indies. If you all shut him down, it won’t stop your data from being extrapolated and released because it’s already been done, can be done, and somebody out there (big publishers) wants it. So if DG is shut down, someone else will come in and fill the hole. And the next guy won’t be so friendly. The next guy probably won’t talk to any of us. Why invite the ire and deal with us? We don’t even pay him.

    So take a deep breath. Express your concern to DG, WORK WITH HIM! I’d much rather he be the one to sort this out, than for him to go away. We’ll never get any info again, and certainly not for free when corporations are willing to pay tens of thousands. And then the next guy would be doing it and we won’t even know it. Much like the NSA.

    Or maybe that would make you all feel better because what you don’t know won’t hurt you?

    • It can stand to be said again.

      data ABOUT you is not YOURS if it was not obtained FROM you.

      just because it is ABOUT you does not make it YOURS.

      • So, you’d be fine with your medical records being placed online as long as they were not obtained from your computer? How about your browsing history? Your kids’ failing grades? Your arrest record?

        That info is all ABOUT you or them, so as long as it comes from a third party it’s all good?

        Privacy is a thing. It’s going to be a thing for some time because technology is always a few steps ahead of the law and regulations. This is a developing issue, and like I said earlier, most of the laws haven’t been made yet.

        • Before HIPAA, what you suggest was in fact legal, and people did object, which is why HIPAA was conceived of and passed.

            • I wouldn’t argue that. Certainly people are upset about this, but what I’m saying is that running around yelling about suing for invasion of privacy isn’t going to get them anywhere when no such restrictions exist.

            • I would agree, except this has been happening in other industries for decades. Estimated market reports based on crawled data are not exactly a new phenomenon. Self-published authors are not employees of Amazon. They are individual businesses. Partners of Amazon and other companies.

        • If you blog about your medical conditions, doctor, medications, you have just made your medical records public. If you tweet or blog or FB about your kids’ failing grades: public. I’ve looked up tax records and arrest records online–a lot is public. It’s amazing what people say online (I have said things I wish I could go scrub). Anyone reading my online info can find my medical conditions, medications, when I went to the hospital, when my family members died and from what. People divulge cancer and Alzheimer’s and other diagnoses all the time in my social media connections.

          Medical records themselves are protected BY LAW. Hospitals and medical offices pay people to take care of them and keep them private as required by law. Forms are required for disclosure. Breaching privacy can bring repercussions. The people who handle medical records have to be trained (HI Managers have to take classes on privacy laws/confidentiality, as do coders/billers generally).

          But if you disclose all your med-goodies online, that info stops being private. You revealed it. If you answer questions about it in an interview that someone else posts, it stops being private.

          If an author participates in a site–say a retailer–that gives out information, isn’t that sorta the same as if they agreed to make it public by participating with that retailer, perhaps? As someone mentioned, an author can sell via their own sites or sites that don’t offer ranking data.

          This Author Earnings thing is murkier. It’s not the author disclosing it and it’s not Amazon overtly giving/selling reports. It would be interesting to see more attorneys weighing in on this for us non-attorneys, as PG has.
