GitHub Copilot and the Legality of Generative Content
By Adam Pease
Microsoft is leading the charge to throw out a court case that could shape the future of generative AI.
Alongside its subsidiary GitHub, and close partner OpenAI, Microsoft has submitted filings to a San Francisco court to have a class action lawsuit about the legality of AI coding technology dismissed. This blog discusses the news and its implications.
Why Is Microsoft Being Sued?
Microsoft and GitHub’s popular tool, GitHub Copilot, works as a virtual assistant for programmers, helping them by automatically completing lines of code, and explaining the meaning of code. In order to build Copilot, GitHub trained the AI model using large volumes of code stored in GitHub repositories.
The news concerns OpenAI as well. The company recently received a large investment from Microsoft, and its own AI coding model, Codex, will likely be folded into the Copilot offering in the near future.
Plaintiffs in the case argue that Microsoft illegally scraped copyright-protected code from GitHub without consulting or compensating its creators.
Some critics go a step further, arguing that Copilot actively encourages the plagiarism of code by producing snippets that, at times, look worryingly similar to copyright-protected material. They also argue that GitHub's approach violates the principles of open-source software.
The Future of Copyright for Generative Content
The jury is still out on how copyright law will be affected by the rise of generative content. Cases like this one will set the stage for a new era of litigation that extends beyond computer programming to include art, music, and entertainment media in general.
Many in the tech industry are aware of the complications. Google even decided not to release its recent music generation model, in part due to copyright concerns. It remains to be seen how these legal cases will develop, and whether Microsoft and its allies will succeed in having the case against GitHub Copilot dismissed.
On the one hand, AI models do not store or contain the original data they were trained on, so it cannot be said that these tools are actually ‘copying’ content. On the other hand, there has been widespread social backlash against AI companies using text and content from the Internet to train models without consulting the creators.
Bottom Line
The future of generative content could be shaped deeply by the outcomes of cases like this.
Restrictions on the use of copyrighted material in model training could slow AI progress, but they could also provide compensation for the content creators whose materials helped enable that progress in the first place.
One thing is for sure: this court case is just the beginning, and we should expect to see many legal battles in this emerging market.
See Adam LIVE on Tuesday, February 28, 2023 at 10 AM PT / 1 PM ET!
Leverage the Latest Generative Content Tools and Trends
Discussion of the generative content market has exploded in the past year as tools like ChatGPT and Stable Diffusion reveal the power artificial intelligence has to automate critical processes for content creation and business communication.
Emerging AI models make it possible to generate text, voice, images, code, and more, as many exciting developments in open source and emerging SaaS products have the market moving at a dizzying speed.
Join Aragon Analyst Adam Pease on Tuesday, February 28, 2023, when he will discuss how to leverage the latest tools and trends in generative content.
This webinar with Adam will:
- Bring you up to date on the latest developments in generative content
- Give you tools that your business can leverage right now
- Cover recent research about the trends that are shaping the direction of the market