23 votes

GNU and the AI reimplementations

Posted March 8 by skybrian

Tags: law, copyright, artificial intelligence, open source, history, development.software, language models.large, vibecoding, source.antirez

https://antirez.com/news/162

3 comments

skybrian (OP)
March 8

From the article:

Stallman [...] was well versed in the copyright nuances. He asked the other programmers to reimplement the UNIX userspace in a specific way. A way that would make each tool unique, recognizable, compared to the original copy. Either faster, or more feature rich, or scriptable; qualities that would serve two different goals: to make GNU Hurd better and, at the same time, to provide a protective layer against litigations. If somebody would claim that the GNU implementations were not limited to copying ideas and behaviours (which is legal), but “protected expressions” (that is, the source code verbatim), the added features and the deliberate push towards certain design directions would provide a counter argument that judges could understand.

He also asked to always reimplement the behavior itself, avoiding watching the actual implementation, using specifications and the real world mechanic of the tool, as tested manually by executing it. Still, it is fair to guess that many of the people working at the GNU project likely were exposed or had access to the UNIX source code.

When Linus reimplemented UNIX, writing the Linux kernel, the situation was somewhat more complicated, with an additional layer of indirection. He was exposed to UNIX just as a user, but, apparently, had no access to the source code of UNIX. On the other hand, he was massively exposed to the Minix source code (an implementation of UNIX, but using a microkernel), and to the book describing such implementation as well. But, in turn, when Tanenbaum wrote Minix, he did so after being massively exposed to the UNIX source code. So, SCO (during the IBM litigation) had a hard time trying to claim that Linux contained any protected expressions. Yet, when Linus used Minix as an inspiration, not only was he very familiar with something (Minix) implemented with knowledge of the UNIX code, but (more interestingly) the license of Minix was restrictive, it became open source only in 2000. Still, even in such a setup, Tanenbaum protested about the architecture (in the famous exchange), not about copyright infringement. So, we could reasonably assume Tanenbaum considered rewrites fair, even if Linus was exposed to Minix (and having himself followed a similar process when writing Minix).

[...]

So, reimplementations were always possible. What changes, now, is the fact they are brutally faster and cheaper to accomplish. In the past, you had to hire developers, or to be enthusiastic and passionate enough to create a reimplementation yourself, because of business aspirations or because you wanted to share it with the world at large.

[...]

One thing that allowed software to evolve much faster than most other human fields is the fact the discipline is less anchored to patents and protections (and this, in turn, is likely as it is because of a sharing culture around the software). If the copyright law were more stringent, we could likely not have what we have today. Is the protection of single individuals' interests and companies more important than the general evolution of human culture? I don’t think so, and, besides, the copyright law is a common playfield: the rules are the same for all. Moreover, it is not a stretch to say that despite a more relaxed approach, software remains one of the fields where it is simpler to make money; it does not look like the business side was impacted by the ability to reimplement things. Probably, the contrary is true: think of how many businesses were made possible by an open source software stack (not that OSS is mostly made of copies, but it definitely inherited many ideas about past systems). I believe, even with AI, those fundamental tensions remain all valid. Reimplementations are cheap to make, but this is the new playfield for all of us, and just reimplementing things in an automated fashion, without putting something novel inside, in terms of ideas, engineering, functionalities, will have modest value in the long run. What will matter is the exact way you create something: Is it well designed, interesting to use, supported, somewhat novel, fast, documented and useful? Moreover, this time the imbalance of force is in the right direction: big corporations always had the ability to spend obscene amounts of money in order to copy systems, provide them in a way that is irresistible for users (free, for many years, for instance, to later switch model) and position themselves as leaders of ideas they didn’t really invent. Now, small groups of individuals can do the same to big companies' software systems: they can compete on ideas now that a synthetic workforce is cheaper for many.

[...]

There is another fundamental idea that we all need to internalize. Software is created and evolved as an incremental continuous process, where each new innovation is building on what somebody else invented before us. We are all very quick to build something and believe we “own” it, which is correct, if we stop at the exact code we wrote. But we build things on top of work and ideas already done, and given that the current development of IT is due to the fundamental paradigm that makes ideas and behaviors not covered by copyright, we need to accept that reimplementations are a fair process. If they don’t contain any novelty, maybe they are a lazy effort? That’s possible, yet: they are fair, and nobody is violating anything. Yet, if we want to be good citizens of the ecosystem, we should try, when replicating some work, to also evolve it, invent something new: to specialize the implementation for a lower memory footprint, or to make it more useful in certain contexts, or less buggy: the Stallman way.

7 votes
TonesTones
March 8

Thank you for posting this article; I feel like it put words to the underlying feeling I’ve had with all these disputes around AI and the legally dubious theft of existing work. I’m unconvinced that copyright law will be an effective bulwark against “bad actors” going after open-source code (or art, music, or language, for that matter). In the courtroom, having the pockets to afford good lawyers generally means more than anything else.

It’s clear that the open source community needs to reckon with the technology, as it is, and figure out if it can be utilized to do good. Copyright protections will likely not hold for very long.

7 votes
bme
March 9

I don't like anything about what is going on with AI, but I do think either it's as good as everyone says, which means we can use it to improve user freedom by making better software more cheaply in the open, or it isn't, in which case all the usual economics hold and these forks will fail the way forks fail today.

Obviously there will be a spectrum, and humanity will probably find a way to thread the needle to the worst-of-every-world compromise (AI removes software engineering as a dedicated profession because the market dies for long enough as software quality declines even further under the weight of so much AI-driven vibing, and the knowledge of how to do better is gone in a generation or two).

1 vote