I've seen this idea or variants of it in a few different forums. As someone with zero legal background, I don't know if there's weight to asking GitHub why they didn't train on Microsoft's proprietary code...
It was probably just a pragmatic decision. Microsoft is a large company, and large companies are slow to act. GitHub would have had to work with the highest levels of the company. By using open source code, they can release a product much sooner. Perhaps they are planning on adding Microsoft code as training data later.
Not to mention that Microsoft code almost certainly does not have the breadth needed to train an AI. They likely would have had to use other open source code anyway.
Oh, for sure. Call me cynical, but I see GitHub "getting away" with this for sure. I wonder what precedents this will set for open source licenses etc.
At least we see the real reason behind the purchasing of GitHub. They purchased GitHub to use as their private code training data. They purchased exclusive access to OpenAI's GPT-3 source, likely others going forward as well.
I don't know if I've ever seen a more blatant example of "If you're not paying, you're the product, not the consumer."
Microsoft only loves open source insofar as they can exploit it. Add this to my list of "Things Microsoft must release under the GPL before I think they've changed for the better."
Accidentally posted to ~tech first. Fixed.
I'm curious to see how this plays out on the licenses front. I know several people have brought it up already, and I'm a FOSS guy, but a part of me also recognizes that MSFT / GitHub have a huge legal team that thinks this is likely to be okay...
Speaking of, here's Eevee's tweets about it:
github copilot has, by their own admission, been trained on mountains of gpl code, so i'm unclear on how it's not a form of laundering open source code into commercial works. the handwave of "it usually doesn't reproduce exact chunks" is not very satisfying
copyright does not only cover copying and pasting; it covers derivative works. github copilot was trained on open source code and the sum total of everything it knows was drawn from that code. there is no possible interpretation of "derivative" that does not include this
i'm really tired of the tech industry treating neural networks like magic black boxes that spit out something completely novel, and taking free software for granted while paying out $150k salaries for writing ad delivery systems. the two have finally fused and it sucks
previous """AI""" generation has been trained on public text and photos, which are harder to make copyright claims on, but this is drawn from large bodies of work with very explicit court-tested licenses, so i look forward to the inevitable /massive/ class action suits over this
"but eevee, humans also learn by reading open source code, so isn't that the same thing"
- no
- humans are capable of abstract understanding and have a breadth of other knowledge to draw from
- statistical models do not
- you have fallen for marketing
Microsoft's lawyers no doubt did their due diligence, but the law isn't really black & white, especially regarding IP and software licenses. Just because some corporate lawyers gave the OK doesn't mean that lawyers willing to represent the class, or a judge, will agree with them.
It's guaranteed any corporate law team worth their salt can prove incontrovertibly that code copying is both rampant and accepted in the community.
I would be curious to know what kind of legal precedent that would end up setting. It would be very nice (or terrible) to be able to easily nullify laws by having a sufficiently large community violate them.
I don’t know very much about it, but I recall a bit of jargon about “transformative use” and I think I’d want to understand whether it applies. Anyone actually curious about the law might want to look into opinions from legal experts.
But it doesn’t seem particularly likely that cautious corporate lawyers would want their employees using this tool, at least until it’s better understood. It’s easier to say “no” and stay out of trouble. So those lawsuits might not happen?
This could probably be useful for very common tasks (e.g. validate a form) but I'm a bit skeptical it would be if you work in a more specialized domain (e.g. scientific computing) where there's not many examples of what you want to do to draw from.
To play devil's advocate, they totally didn't need to buy GitHub to have access to the open-source training data.