Linux forks

This isn’t a reference to open source cutlery.

Linux forks, a term often applied to the varieties of Linux distributions, are poorly understood even in the Linux community. Are all the various distributions available on the internet true forks of the kernel? Kevin Morgan argues that they are not. In his Enterprise OpenSource Magazine (formerly LinuxWorld Magazine) article “Linux Technology Leadership and the Forking Issue,” he explores the concept of forking in detail, explains the advantages of having unique variants for distinct applications, and explains why variants are not forks.

We should remind our readers that Linux, strictly speaking, refers only to the kernel; the name GNU/Linux is often used for the complete operating system built around it. The kernel forms the core of all distributions. While there are versions of the kernel, there aren’t variations of it. As Kevin Morgan mentions in his introduction:

Linux is formally a trademark, owned by Linus Torvalds. The trademark is associated with a body of source code hosted at the www.kernel.org/ Web site.

So while there is only one kernel, there are hundreds of distributions built around that kernel. Are these variations true forks of the kernel?

First of all, let’s define a fork of Linux.

In its most distilled form, a fork is any version of source code that is different (modified) from the master copy. By this definition, every time anyone, anywhere, makes a copy of Linux and begins to make modifications, they’re creating a fork. In this sense, Linux is forking literally thousands of times a day.

Pragmatically however, this is not the common interpretation of the word ‘fork’ as it applies to software in general and to operating system software in particular. A fork is generally construed as:

* A long-lived derivative of the master version of source code, and
* A derivative that’s not intended to ‘resynchronize’ with the master source code, and
* A derivative of some significance in the marketplace.

Hence, short-lived derivatives that precede the submission of the changes back to kernel.org aren’t forks. A derived version that a commercial company develops while embedding Linux inside a product, with code corrections and/or enhancements that the company doesn’t submit back to kernel.org, is not a fork either. Such a derivative has no significance to the broader operating systems marketplace; it’s a private version.

Here’s a scenario of a real fork:

Linux is developed with a first-order focus on server and desktop computing. Requirements from more specialized areas of computing aren’t given the highest levels of priority. Such areas include consumer products, telecommunications equipment, mobile handsets, and mil-aero. When implementation or optimization choices conflict with requirements from different application segments, servers and desktops typically (and arguably, appropriately) win out.

Given this backdrop, it’s possible that an individual or a group (for example, an industry consortium) could build, market, and maintain a separate, specialized version of Linux. Let’s say, for example, several large mil-aero firms decided to cooperate in developing and maintaining a “mil-aero Linux.” To do this they’d choose a starting-point set of Linux code from kernel.org and publish this code independently of kernel.org (say, www.mi-aero-linux.org). They’d then begin the process of evolving this code base to better meet the requirements of the mil-aero equipment market. The stated intention would be to make long-term design, content, and management decisions to optimize the code for those requirements. In doing so, the decision would be to diverge henceforth from the kernel.org source base due to the very different design objectives and specific code evolution being planned.

If such a development occurred, and the project moved forward successfully and the resulting software gathered significant adoption and use…THAT would be a fork of the Linux kernel.
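Morgan’s three criteria, and the examples given so far, can be restated as a small predicate. The Python sketch below is only an illustration: the class, field, and function names are hypothetical, and the true/false values assigned to each example are our reading of the article, not anything Morgan published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivative:
    """A modified copy of the kernel.org source (hypothetical model)."""
    name: str
    long_lived: bool          # criterion 1: a long-lived derivative
    intends_to_resync: bool   # criterion 2: does it plan to resynchronize?
    market_significant: bool  # criterion 3: significance in the marketplace


def is_fork(d: Derivative) -> bool:
    # All three criteria must hold for a derivative to count as a fork.
    return d.long_lived and not d.intends_to_resync and d.market_significant


def classify(d: Derivative) -> str:
    return "fork" if is_fork(d) else "variant"


# Assumed readings of the examples in the text:
patch_branch = Derivative("short-lived patch branch", False, True, False)
private_embedded = Derivative("private embedded version", True, False, False)
mil_aero = Derivative("hypothetical mil-aero Linux", True, False, True)

for d in (patch_branch, private_embedded, mil_aero):
    print(f"{d.name}: {classify(d)}")
```

Only the mil-aero scenario satisfies all three conditions. The private embedded version fails on market significance alone, which is the distinction the quoted passage draws.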

So are the various Linux distributions forks or not?

To date, no true Linux fork has arisen. So with respect to the Linux variants that aren’t forks of the Linux kernel, are there legitimate concerns about them having “differences” from the kernel.org source code?

The answer is both yes and no. The first area of concern is “compatibility.” By definition, if there are new functions (via new or different APIs) in a Linux variant then use of those functions creates “incompatible” software relative to the current kernel.org version of Linux. Before concluding that this is a concern, however, we have to answer a couple of fundamental questions: Why did a commercial vendor add this feature or API to the system? And why is a commercial customer of that product considering using it?

The answers should be obvious by now: the features and API were added so Linux could be used in an application where it otherwise couldn’t be used. And the only legitimate reason for a customer of such a product to use that extended API would be because its requirements can’t be serviced without it. So while it’s true that the resulting application software won’t be compatible with the Linux available from kernel.org, consider the alternative: a product development team (and possibly an entire application or market segment) denied the benefits of Linux because it won’t meet their requirements.

A second concern is vendor lock-in: Does the inclusion of extensions and their use by customers re-create a proprietary lock-in situation similar to the “bad old days” of proprietary operating systems? The answer is definitely no.

The reason is that the source code for all the API extensions is included in the distribution; the customer that requires the APIs and uses them has the full implementation of those APIs and can look to other alternatives for maintenance and support of that code in the future if the current vendor fails to satisfy.

A second reason is that (for MontaVista’s kernel technology and in general for all extensions of market significance) the extensions are independently available as Open Source technology. Typically, this means that the enabling source code is available as patches to the standard kernel.org kernel. So not only is there no lock-in, the switching costs can be very low from the perspective of getting the required APIs integrated into new or different versions of Linux.

Many of the extensions for a particular market segment are becoming de facto standards and are available from multiple commercial vendors. Examples include XFS for streaming media applications, OpenAIS and OpenIPMI for telecommunications equipment, and dynamic power management for mobile handsets.

Morgan concludes:

The concerns about commercial vendors selling products that “fork the Linux kernel” are not just overblown; analysis shows them to be a red herring. All significant Linux distributions are Linux variants, and none of them are or have become Linux forks. We have shown how even basic commercial support activities require an independently maintained copy (a “variant”) of the Linux kernel. Those value-add components (features, internal “-ability” enhancements, and quality improvements) are frequently the critical factor in the purchasing decision and without those values a Linux-based product may not be feasible. So if and when a vendor comes calling, describing competitors’ products as “a fork of Linux,” ask these hard questions:

* Is YOUR Linux distribution a 100% copy of kernel.org without any changes in source code whatsoever?
* If so, why should I buy it from you?
* If so, how will you maintain it and support me? How can you provide me with integrated defect corrections, without changing the code?
* If code changes to fix defects must first be submitted, accepted, and published by kernel.org, what am I to do in the meantime? That process can be lengthy.
* If your Linux distribution is not a 100% copy of kernel.org, why is it NOT a fork of the Linux kernel and yet the competitor’s product IS a fork?

These questions will quickly reveal the truth: any viable Linux distributor, by the very nature of the product and services it is offering, MUST offer a Linux variant. The issue isn’t who is or isn’t “forking Linux.” In fact, it can be argued that any vendor who suggests that a competitor’s product is a fork of the Linux kernel has such a low understanding of Linux and system software in general that he should be thrown out immediately in any serious vendor evaluation. These kinds of statements indicate either deep ignorance or deep misinformation offered under the assumption that the customer is deeply ignorant. Either is a legitimate reason to keep searching for vendors of quality.