> By comparison, Copilot is even more obviously fair use.
Not sure I see it that way.
If I take your hard work that you clearly marked with a GPL license and then make money from it, not quite directly, but very closely, how is that fair use? Or legal?
Copying and storing a book isn't recreating another book from it. Copilot is creating new stuff from the contents of the "books" in this case.
And someone accepts it. Even if suggesting derivatives of licensed code is not a license infringement, Copilot sure is a vector for mass license infringement by the people clicking "Accept suggestion". And those people are unable to know (without doing extensive investigation that completely nullifies the point of the tool) whether that suggestion is potentially a verbatim copy of some existing work under an incompatible license.
If I suggest whole lines of dialogue to you, the screenwriter, did I write those lines or you? If you change names in those lines of dialogue to fit your story, do you now gain credit for writing those lines?
In some situations, the question is whether the mishmashes Copilot produces are 'fair use'.
But the other, more direct question is ... what about the instances where Copilot doesn't come up with a learned mishmash result? What happens when Copilot just gives you a straight-up answer from its learning data, verbatim?
Then you, as a dev, end up with a bunch of code that is effectively copied, via a 'copying tool', which is GPL'd?
It's that specific case that to me sticks out as the 'most concerning part'.
For your specific case, “take your hard work that you clearly marked with a GPL license and then make money from it”, you don’t even need to rely on fair use. As long as you comply with the terms of the GPL, making money with the code is perfectly acceptable, and the FSF even endorses the practice. [1] Red Hat is but one billion-dollar example.
I understand the concept of fair use (I think) but I can't see how it applies to Copilot.
Google didn't create new books from the contents of existing ones (whether you agree that they should have been allowed to store the books or not) but Copilot is creating new code/apps from existing ones.
Edit: I guess my understanding of fair use was wrong. I stand corrected.
It would help the transformativeness, but it would substantially change the effect upon the market. By creating competing products with the copyrighted material, you get a higher degree of transformativeness, but you also end up disrupting the marketplace.
I don't know how a court would decide this, but I do think the facts in future GPT-3 cases are sufficiently different from Authors Guild that I could see it going any way. Plus, I think the prevalence of GPT-3 and the ramifications of the ruling one way or another could lead some future case to be heard by the Supreme Court. A similar case could come up in California, or another state where the 2nd Circuit Authors Guild case isn't binding precedent.
However, where does one draw the line between fair use and derivative works?
Creating something based on other stuff (Google creating AI books from the existing ones, for example) would possibly be fair use, I think, but would it not also be a derivative work?
There's no clear line and there can never be, because the world is too complex. We leave the determination up to the court system.
Google Books is considered fair use because they got sued and successfully used fair use as a defense. Until someone sues over Copilot, everyone is an armchair lawyer.
This is the clearest display yet that moderation on HN has absolutely nothing to do with your purported values like constructive criticism, and has everything to do with whether dang agrees with you or not.
I actually have no idea what you were arguing about, nor which side you were on, nor what your argument was. I haven't paid enough attention to know those things, because (a) I don't want to, (b) I don't need to, and (c) not doing it leaves me in the desirable state of being incapable of agreeing or disagreeing.
It's a happy fact that figuring out people's arguments is often unnecessary for moderating the threads, especially in cases where people are breaking the site guidelines. Everyone needs to follow the site guidelines regardless of what the topic is, what their argument is, and how right they are or feel they are. Please stick to the rules when posting here.
Fair use is a defense for cases of copyright infringement, which means you're starting off from a case of copyright infringement, which sort of mucks up the whole "innocent until proven guilty" thing. And considering it's a weighted test, it's hardly very cut-and-dried at that.
If you view GPL code with your browser would that mean that your browser now has to be GPL as well? In the sense that copilot is not much different than a browser for Stack Overflow with some automation, why would it need to be GPLed? Your own code on the other hand…
For the sake of discussion, it would be clearer to split the copilot code (not derived from GPL'd works) from the actual weights of the neural network at the heart of copilot (derived from GPL'd works via algorithmic means).
For your browser analogy, that would mean that the "browser" is the copilot code, while the weights would be some data derived from GPL'd works, perhaps a screenshot of the browser showing the code.
I'd think that the weights/screenshot in this analogy would have to abide by the GPL license. In a vacuum, I would not think that the copilot code had to be licensed under GPL, but it might be different in this case since the copilot code is necessary to make use of the weights.
But then again, the weights are sitting on some server, so GPL might not apply anyway. Not sure about AGPL and other licenses though. There is likely some illegal incompatibility between licenses in there.
As I understand it, the thing copilot tries to do is automate the loop of “Google your problem, find a Stack Overflow answer, paste the code from there into my editor”. In that sense, the burden of checking whether the license of the code being copy-pasted is acceptable is on the person who answered the SO question and on me. If this literally was what copilot did, nobody would bat an eye that some code it produced was GPL or any other license because it wouldn’t be copilot’s problem.
Now let’s substitute a different database for the code, one that isn’t SO. It doesn’t really matter if that database is a literal RDBMS, a giant git repo or is encoded as a neural net. All copilot is going to do is perform a search in that database, find a result and paste it in. The burden of licensing is still on me to not use GPL code and possibly on the person hosting the database.
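To make the "search a database and paste" model concrete, here is a toy sketch in Python. Everything in it (the `Snippet` type, the two stored snippets, the `ALLOWED` set) is made up for illustration; it only shows the case where each result carries a license label that the person pasting can check, and it says nothing about how copilot actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    code: str
    license: str  # license label attached to the stored snippet (hypothetical data)


# Hypothetical toy "database": could equally be an RDBMS or a git repo.
DATABASE = [
    Snippet("def add(a, b):\n    return a + b", "MIT"),
    Snippet("def add_all(xs):\n    return sum(xs)", "GPL-3.0-only"),
]


def search(query: str) -> list[Snippet]:
    """Return every stored snippet whose code contains the query string."""
    return [s for s in DATABASE if query in s.code]


def acceptable(snippet: Snippet, allowed_licenses: set[str]) -> bool:
    """The check that falls on the person pasting the code."""
    return snippet.license in allowed_licenses


# Licenses this (imaginary) project is willing to accept.
ALLOWED = {"MIT", "Apache-2.0"}

hits = search("def add")
usable = [s for s in hits if acceptable(s, ALLOWED)]
```

With a plain database like this, the license travels with the result, so the filter is a single line on the user's side.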
The gotcha here is that copilot’s database is a neural network. If you take GPL code and feed it, along with non-GPL code, as training data to a neural network to create what is essentially a lookup table, did you just create a derived work? It is unclear to me whether you did or not. In particular, can the neural network itself be considered “source code”?
> If you view GPL code with your browser would that mean that your browser now has to be GPL as well?
Some good responses in sibling comments already, but I don't see the narrow answer here, which is: No, because no distribution of the browser took place.
If you created a weird version of the browser in which a specific URL is hardcoded to show the GPL'd code instead of the result of an HTTP request, and you then distributed that browser to others, then I believe that yes, you'd have to do so under the GPL. (You might get away with it under fair use if the amount of GPL'd code is small, etc.)
Or if you simply read GPL code and learn something from it - or bits of the code are retained verbatim in your memory, are you (as a person) now GPL'd? Obviously not.
That probably depends on how large and how significant the bits you remember are. Otherwise one could take a person with photographic memory and circumvent all GPL licenses easily, by making that person type what they remember.
> Or if you simply read GPL code and learn something from it - or bits of the code are retained verbatim in your memory, are you (as a person) now GPL'd? Obviously not.
> If I take your hard work that you clearly marked with a GPL license and then make money from it, not quite directly, but very closely, how is that fair use? Or legal?
If I'm Google, and I scan your code and return a link to it when people ask to find code like that (but show an ad next to that link for someone else's code that might solve their problem too), that's fair use and legal. My search engine has probably stored your code in a partial format, and that's fine.
>If I take your hard work that you clearly marked with a GPL license and then make money from it, not quite directly, but very closely, how is that fair use? Or legal?
You can wipe your ass with the GPL license if your use of the product falls within Fair Use.
You can actually take snippets from commercial movies and post them onto YouTube if your YouTube video is transformative enough for your usage to be considered fair use. Well, theoretically at least - in reality YouTube might automatically copyright strike it.
>Copying and storing a book isn't recreating another book from it.
That doesn't mean that GitHub has to redistribute Copilot under GPL. However, the end user could potentially have to if they use Copilot to generate new code that happens to copy GPL code verbatim.
> You can wipe your ass with the GPL license if your use of the product falls within Fair Use.
Is Copilot fair use? It's reading code, generating other code (some verbatim) and making money from it all while not having to release its source code to the world?
> That doesn't mean that GitHub has to redistribute Copilot under GPL
I wasn't saying that was the case: some of the code that Copilot used may not allow redistribution under GPL.
But let's say that all of the code it scanned was GPL for the sake of argument. Why would they not have to distribute their Copilot source, yet if I use it to generate some code, I'd have to distribute mine?
> Is Copilot fair use? It's reading code, generating other code (some verbatim) and making money from it all while not having to release its source code to the world?
Again, fair use is an exception to copyright protection. If something is fair use, the license does not apply. The fact that Copilot does not release its source code is related only to a specific term of a specific license, which does not apply if Copilot is indeed fair use.
Not sure I see it that way.
If I take your hard work that you clearly marked with a GPL license and then make money from it, not quite directly, but very closely, how is that fair use? Or legal?
Copying and storing a book isn't recreating another book from it. Copilot is creating new stuff from the contents of the "books" in this case.
Edit: I misunderstood fair use as it turns out...