AI reveals potential Amazon, Facebook GDPR problems to regulators


From c/net:

AI [artificial intelligence] software reportedly uncovered suspected GDPR breaches by Alphabet, Amazon and Facebook.

The software — created by EU Institute researchers and a consumer group — looked at the privacy policies of 14 major technology businesses in June, the month after the EU’s new data privacy laws went into effect, according to Bloomberg.

Researchers named the software “Claudette” — short for automated clause detecter — and Alphabet (Google’s parent company), Amazon and Facebook were among the companies whose policies were under the AI microscope.

It found that a third of the clauses within the policies were “potentially problematic” or contained “insufficient information,” while a further 11 percent of the policies’ sentences used unclear language, the academics noted.

The software also noted that some policies failed to identify third parties that the company could share data with.

. . . .

Despite the software’s findings, researchers admitted that the results of the automated scan “are not 100 percent accurate” since the software has only viewed a small number of policies.

Google insisted that its policy is compliant and highlighted that the updated version doesn’t expand or make any changes to how it collects or processes users’ information.

. . . .

The EU has been enforcing the General Data Protection Regulation since May 25 and the law requires the companies adopt greater openness about data they have on EU residents, as well as with whom they share the data.

Link to the rest at c/net

PG suspects that AI might not be the best solution for reviewing 14 terms of use or similar documents today (he suspects it took more time and effort to create the artificial intelligence application than having lawyers or paralegals simply review the 14 documents would have required).

However, over the longer term, he thinks it quite likely that AI apps will become common tools for creating and reviewing legal documents.

From the MIT Technology Review:

Meticulous research, deep study of case law, and intricate argument-building—lawyers have used similar methods to ply their trade for hundreds of years. But they’d better watch out, because artificial intelligence is moving in on the field.

As of 2016, there were over 1,300,000 licensed lawyers and 200,000 paralegals in the U.S. Consultancy group McKinsey estimates that 22 percent of a lawyer’s job and 35 percent of a law clerk’s job can be automated, which means that while humanity won’t be completely overtaken, major businesses and career adjustments aren’t far off (see “Is Technology About to Decimate White-Collar Work?”). In some cases, they’re already here.

“If I was the parent of a law student, I would be concerned a bit,” says Todd Solomon, a partner at the law firm McDermott Will & Emery, based in Chicago. “There are fewer opportunities for young lawyers to get trained, and that’s the case outside of AI already. But if you add AI onto that, there are ways that is advancement, and there are ways it is hurting us as well.”

. . . .

So far, AI-powered document discovery tools have had the biggest impact on the field. By training on millions of existing documents, case files, and legal briefs, a machine-learning algorithm can learn to flag the appropriate sources a lawyer needs to craft a case, often more successfully than humans. For example, JPMorgan announced earlier this year that it is using software called Contract Intelligence, or COIN, which can in seconds perform document review tasks that took legal aides 360,000 hours.

These programs are, simply put, changing the way legal research is carried out. Workers used to have to trudge through stacks of dusty law books and case files to find relevant information.

. . . .

People fresh out of law school won’t be spared the impact of automation either. Document-based grunt work is typically a key training ground for first-year associate lawyers, and AI-based products are already stepping in. CaseMine, a legal technology company based in India, builds on document discovery software with what it calls its “virtual associate,” CaseIQ. The system takes an uploaded brief and suggests changes to make it more authoritative, while providing additional documents that can strengthen a lawyer’s arguments.

“I think it will help make [entry-level lawyers] better lawyers faster. Make them more prolific,” says CaseMine’s founder, Aniruddha Yadav. “If they are handling a couple cases at a time, they will learn the law faster.”

. . . .

Other legal tech startups with AI at their core have been gaining steam as well. Kira Systems, which makes a contract review platform, counts four of the top 10 American law firms, as well as several international firms, as clients. Meanwhile, investors plowed $96 million into Zapproved, a startup that makes a cloud-based electronic discovery tool. Overall, it’s been a banner year for new legal tech companies, with funding up 43 percent in the first three quarters of 2017 compared with the same time last year, according to a report by the research firm CB Insights.

. . . .

There are, however, still obstacles to further adoption of AI in the legal profession. Chief among them is a lack of accessible data to use in training the software. Take the contract analysis company Legal Robot. In order to train its program, a team of developers built their own database of terms and conditions by collecting examples from major websites. But that wasn’t enough—the company also had to strike deals with law firms to gain access to their private repositories. In total, they compiled over five million contracts.

Link to the rest at the MIT Technology Review

PG notes that the MIT article is not the only one about the legal profession which is primarily based upon the methods of practice followed by large American law firms. He doesn’t blame the authors of the article or the firms profiled therein because most of the large sales opportunities for these technical products or services will be in major law firms.

However, only a small percentage of practicing attorneys work for major law firms in the US. About 85% of American attorneys work in firms of 50 lawyers or fewer and about half of all American attorneys are sole practitioners. The legal “market” is really fragmented into business organizations that may have less in common than non-experts first assume.

In PG’s experience, the most technically savvy lawyers are either in very small firms (many with a single attorney) or in small groups within very large firms (fewer than 5 lawyers in a firm of 500 attorneys, for example).

Tech companies and venture capital firms typically misjudge the true size of the legal market for advanced tech products. The average gross revenue (not profit) per lawyer (not including paralegal, secretary, other support staff, etc.) for the 100 largest law firms in the United States was somewhat less than $1 million in 2017 and about $850K in 2016.

Additionally, the management structure of many major firms is not well-suited for supporting the acquisition of major technology products via a capital spending budget. A typical industrial firm will calculate profits after including all costs, including salaries and bonuses.

Generally speaking, a law firm’s “profits” don’t include all salaries and bonuses the firm routinely pays year in and year out. These law firm “profits” are divided among the long-term partners or major shareholders of law firms and distributed each year rather than retained or accumulated to fund long-term growth.

It is not unusual for the controlling partners/shareholders to take relatively small salaries or draws against distributions based on equity compared to their end-of-year distributions. Using money to acquire new technology products or services typically means that each of the firm’s big shots takes home less money at the end of the year.

The bottom line is that, as a group, major law firms are not big-time purchasers of new technology products and services because of the immediate hit to the partners’ income. For example, it took many years for the original computer-assisted legal research services, Lexis and Westlaw, which had a clear and compelling value proposition for lawyers, to achieve any sort of respectable penetration of and associated revenue from these firms.

10 thoughts on “AI reveals potential Amazon, Facebook GDPR problems to regulators”

  1. I recently looked at the GDPR compliance info. put out by a newsletter client called AWeber. In it, the company makes a clear distinction between the individuals who use their client [to send out newsletters to /their/ subscribers], and the client itself. Thus, while individuals may be GDPR compliant, AWeber continues to harvest information in the background and send it to the US.

    This information is not about AWeber’s subscribers per se but about the people who subscribe to the /newsletters/.

    This is an important point because the subscribers to the newsletters believe their privacy is being protected. That is /why/ many of them decide to opt-in. They do not know that they are also agreeing to AWeber harvesting their data…and using it exactly as before. To track, to share, to aggregate with other information from other sources, to analyse and ultimately to use /against/ those very subscribers.

    I’ve read countless articles describing exactly how all this ‘unidentifiable data’ can be used to identify individuals. It’s ridiculously easy, especially with the types of software available to do just that.

    After harvesting this ‘unidentifiable data’, tech companies send it to the US under the rules of the Privacy Shield:

    https://www.privacyshield.gov/article?id=OVERVIEW

    Given that US companies ‘self certify’ to the Privacy Shield, this basically means business as usual.

    The fact that the algorithm in the article has already found breaches is no surprise at all, because effectively, nothing has changed.

    The onus is still on the user to navigate the labyrinth of legalese behind the Privacy Policies of all the major tech companies. Then, even assuming that the user has read and understood said legalese, they are still left with just one stark choice – agree, or pi$$ off.

    Many people justify this ‘do what I say or I’ll take my ball and go home’ attitude as the price we pay for ‘free’. Unfortunately, the belief that data mining is justified has become so prevalent amongst tech companies that they continue the surveillance even when you do pay.

    The GDPR is flawed and doesn’t go anywhere near far enough. But at least it is attempting to free EU citizens from this ubiquitous commercial espionage. Unfortunately, the US probably won’t attempt to free its own citizens from this covert surveillance until some high profile citizens are publicly exposed for going to online porn sites or paedophilia sites or a million other unsavoury destinations on the internet. Because it will happen, eventually.

    • Could you imagine the workload if a few thousand EU people suddenly demand to know what a small one-person site may or may not have on them? As most little sites can’t handle that type of load this makes a great way to close down the ones you don’t like.

      Trust me, I get the ‘don’t sell my info’, but forcing sites to have to disclose what they might have can overload the smaller ones – and backfire if someone claims to be someone else when demanding that info.

      Have you noticed you now have to add your data to each post here? That’s the site no longer linking who it’s seen claiming to be you hitting the site again. Though by the GDPR you can still waste PG’s time demanding to know what he might have saved about you and your visits. If everyone did that it would cut into his time managing this site and possibly his job (and you don’t want to get between him and him seeing his grandkids! 😉 )

  2. Is the legal industry one of those ripe for disruption, but can never be disrupted because it is special?

  3. If A.I. was to the point it could read/understand contracts then it could also write them (and identify best selling books and songs etc. …)

    The proper response to this and the rest of the GDPR as it’s currently written is to have your website open to a page that states that as per the GDPR your site does not track EU clients. Then have a yes/no button pair that asks if they are in/part of the EU. Since you don’t track any EU data, the no button takes them to your site while the yes button does nothing … (it is not the site’s fault if it tracks someone who lied about being in the EU. 😉 )

    • The proper response (for an American) is to tell the EU to shove off. If I wanted to follow their laws I’d move there. They can bite me. This stupid law was designed as an income source, nothing more. It is impossible for 90% of American businesses that use mailing lists to be compliant… Even if we did have to follow it (which we don’t). Now if you’ll excuse me I’m going to go find some imported tea to dump in the ocean.

      • Oh, come on! You can have all sorts of fun with this! 😉

        ‘This site is non-GDPR-compliant, any/all EU members must delete their accounts and log off at this time.’

        ‘We would love to not track EU customers – but then we wouldn’t be able to remember/process your orders. Please make up your minds and get back to us.’

        • That would be my reaction.

          “This web site is not based in the EU nor aimed at any of its citizens so I do not recognize any direct or indirect EU authority in this venue.

          “I do not particularly care about anybody’s location, citizenship, or private data but there is no telling which government’s agents might be tracking you so EU visitors are warned that they proceed at their own risk. It’s not as if this site is all that important anyway.

          “You have been warned.”

  4. Interesting points.
    Sounds like that tech would be more effectively commercialized as a subscription service à la Lexis/Westlaw or, better yet, as an added-value extension to those two.

    Or, maybe as a subscription service under AWS, Azure, Office365, or Watson, all of which are expanding heavily into “AI” services.
