I see a lot of people trying to compare Copilot's "machine learning" to human learning.
Let's use this thought experiment: imagine that GitHub's Copilot were just a massive array of all the lines of code from every GitHub project, with some (magical, automated, whatever) tagging and indexing on each function, and a search engine on top of that.
Now imagine that Copilot simply finds the closest search result and, when you press a button, inserts that line from the array; press it again and you get the next line, and so on.
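Just to make the thought experiment concrete, here is a minimal sketch of that hypothetical lookup system. The corpus, the similarity scoring, and the `suggest`/`best_match_index` functions are all invented for illustration; nothing here reflects how Copilot is actually implemented:

```python
import difflib

# Pretend this flat list holds every line of code from every GitHub
# project, stored in original file order. (Tiny toy corpus for illustration.)
CORPUS = [
    "def quicksort(items):",
    "    if len(items) <= 1:",
    "        return items",
    "    pivot = items[len(items) // 2]",
    "    smaller = [x for x in items if x < pivot]",
]

def best_match_index(context: str) -> int:
    """Index of the stored line most similar to what the user typed."""
    ratios = [
        difflib.SequenceMatcher(None, context, line).ratio()
        for line in CORPUS
    ]
    return max(range(len(CORPUS)), key=ratios.__getitem__)

def suggest(context: str, presses: int = 0) -> str:
    """First press: the closest stored line. Each further press: the
    next line, verbatim, from the same stored project."""
    return CORPUS[best_match_index(context) + presses]

# suggest("def quicksort(") returns the matched line, and presses=1, 2, ...
# walk through the original file line by line, reproducing it exactly.
```

The point of the sketch is that every suggestion is a verbatim copy of a stored line: the system never generates anything, it only retrieves.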
Now hopefully nobody here thinks such a system would fulfil either the spirit or the letter of any even half-restrictive license. Yet that is a perfectly valid implementation of Copilot's aim - and it sounds like it's not that far from what actually happens, maybe with a bit of variable-name munging.
So my question is this: imagine you could draw a line between the system I describe above and human learning, where a human internalizes the patterns and can genuinely produce novel structures, patterns, and even programming languages they have never seen before.
At what point along that line would you say that Copilot is close enough to the human end that it is no longer violating licenses that require attribution?
I don't think it matters where Copilot is on that line. A skilled human programmer at the far end of that line, fully capable of producing novel programs they have never seen before, would still be violating copyright if they reproduced a program they had seen before.