How does a lawyer versed in software intellectual property rights view the class action lawsuit over the code completion AI "GitHub Copilot"?

In November 2022, a class action lawsuit was filed against GitHub, Microsoft, and OpenAI, the companies involved in developing "GitHub Copilot," a code completion AI service trained on code hosted on GitHub. Kate Downing, a lawyer who specializes in intellectual property rights in the software industry, explains the key points of the lawsuit and the plaintiffs' odds of winning.

An Open Source Lawyer's View on the Copilot Class Action Lawsuit – Law Offices of Kate Downing

GitHub Copilot is an AI service jointly developed by Microsoft, which owns the software development platform GitHub, and OpenAI, an artificial intelligence development organization. However, GitHub Copilot was trained on GitHub's public repositories, and concerns have been raised that it "outputs copyrighted code" and "destroys the open source community."

On November 3, 2022, a class action lawsuit was filed in the United States District Court for the Northern District of California, led by the Joseph Saveri Law Firm, which has offices in California and New York. It is described as the first lawsuit over a service that generates output from what an AI has learned.

AI 'GitHub Copilot,' trained on GitHub code, finally faces a class action - GIGAZINE

Downing, who specializes in intellectual property rights related to open source software, has written an analysis of this class action lawsuit. According to Downing, the claims the plaintiffs make in the complaint are as follows:

・Breach of contract (not copyright infringement) related to the open source licenses of individual GitHub repositories
・Tortious interference with contractual relationships (by failing to provide GitHub Copilot users with the license information they need to comply with open source licenses)
・Fraud (related to statements in the Terms of Service and Privacy Policy that code on GitHub would not be used outside of GitHub)
・Reverse passing off under federal trademark law (by leading GitHub Copilot users to believe that the output code was generated by GitHub Copilot itself)
・Unjust enrichment
・Unfair competition
・Breach of the GitHub Terms of Service and Privacy Policy related to the handling of personal data
・Violation of the California Consumer Privacy Act (CCPA) regarding personal data
・Negligence in handling personal data
・Civil conspiracy

What Downing finds most interesting here is that the plaintiffs do not claim copyright infringement. In the debate over AI and copyright, the argument that "using copyrighted content for AI training constitutes fair use" is often raised. The plaintiffs appear to have anticipated this defense and are trying to avoid any discussion of copyright infringement altogether.

In addition, although this is a class action with GitHub users as plaintiffs, most people who publish code on GitHub have not formally registered a copyright in their code. A copyright infringement claim would require finding plaintiffs with registered copyrights, so the class would be expected to shrink by more than 99%. Downing also believes that by avoiding allegations of copyright infringement, the plaintiffs can keep this from becoming the definitive case on machine learning and copyright.

Downing said that while it is interesting that the plaintiffs built their claims to avoid copyright infringement, they appear to misread GitHub's Terms of Service. Under those terms, GitHub users grant GitHub the right to use their content to run and improve the "service," so using the code for GitHub Copilot itself is not a problem.

In addition, Downing believes that short snippets of code are not subject to copyright protection, that the claims about presenting attribution information are therefore unlikely to apply, and that the claims about personal data will be difficult to sustain.

Downing also points out that even if the lawsuit forced GitHub Copilot to present license information for every suggestion, it is doubtful whether that would serve developers' interests. Moreover, a GitHub Copilot suggestion is likely to draw on multiple sources, which raises the question of which source should be credited, or whether every source should be.

Downing explains that GitHub Copilot reproduces code from its training data verbatim only about 1% of the time, and that once the uncopyrightable portions are excluded, the suggestions that would actually require the attribution the plaintiffs are asking for amount to less than 1% of the total. She commented, "Certainly, the copyright holders affected have rights, but this is not 'impact litigation.' It looks like trolling by people demanding to be paid."

in Software, Web Service, Posted by log1h_ik