My question is what GitHub is going to do when people start sending them DMCA takedown notices over their code being distributed through this system.
Currently, if you claim to be a copyright owner, GitHub can respond to a DMCA takedown by removing the repository. With a trained model there is no single repository to remove, so complying might require them to retrain the entire model.
One option for GitHub might be to maintain a blocklist of various code snippets, and if there is a substring match, just don't make the suggestion.
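To make the blocklist idea concrete, here is a rough sketch of what such a filter could look like. Everything in it is my own assumption for illustration: the names `BLOCKLIST`, `normalize`, `is_blocked` and `filter_suggestions` are hypothetical, the example entry is made up, and nothing here reflects how GitHub actually does or would do it.

```python
import re

# Hypothetical blocklist; in this sketch the entries would be snippets
# named in takedown notices.
BLOCKLIST = [
    "float Q_rsqrt( float number )",
]

def normalize(code: str) -> str:
    """Collapse every run of whitespace to one space and trim the ends."""
    return re.sub(r"\s+", " ", code).strip()

def is_blocked(suggestion: str, blocklist=BLOCKLIST) -> bool:
    """Return True if any non-empty blocklisted snippet is a substring of the suggestion."""
    normalized = normalize(suggestion)
    for snippet in blocklist:
        needle = normalize(snippet)
        # Skip empty entries: an empty string is a substring of everything.
        if needle and needle in normalized:
            return True
    return False

def filter_suggestions(suggestions, blocklist=BLOCKLIST):
    """Keep only the suggestions that contain no blocklisted snippet."""
    return [s for s in suggestions if not is_blocked(s, blocklist)]
```

Collapsing whitespace only protects against reformatting. A plain substring match like this would still miss a snippet whose variables were renamed or whose lines were reordered, so it is at best a partial answer.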