A Fork of Kubernetes: 6 Challenges
In a relatively short time, the open-source ecosystem has evolved from a set of niche projects with limited corporate backing into the de facto way to build software. Today, companies small and large adopt open-source software (OSS) to accelerate product development and innovation.
As the adoption of Kubernetes continues to grow and mature, a lot of consolidation in the market has followed. Kubernetes now spans a wide range of offerings—some of which aren't as free and flexible as others—with different restrictions and licensing in place. Navigating these solutions to pick the right one is often challenging because there is no one-size-fits-all option.
Fork of Kubernetes
A fork of Kubernetes is a version of the open-source project that is developed along a separate workstream from the main trunk. Forking occurs when a part of the development community (or a third party not related to the project) makes a copy of the upstream project and modifies it to start a completely independent line of development. There are many reasons for forking Kubernetes, including differences of opinion (whether technical or personal), stagnation in the upstream project's development, and a desire to create different functionality or to steer the project in a new direction.
To be considered a fork of Kubernetes, a project should have a new project name, a branch of the software, a parallel infrastructure, and a new developer community (disjoint from the original project). This can happen in both open-source and proprietary environments.
When a fork of Kubernetes in an open-source environment improves the original source code, other forks may take advantage of it. Because the code is freely available to use, the other forks can merge the code into their fork to better meet the needs of both the developers and end users.
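In git terms, this exchange of improvements is ordinary fetching and merging between repositories. The sketch below simulates it with two throwaway local repositories; the paths, file names, and commit messages are purely illustrative:

```shell
#!/bin/sh
# Two throwaway local repositories standing in for an upstream project
# and a fork of it. All paths and contents are illustrative.
set -e
demo=/tmp/fork-demo
rm -rf "$demo" && mkdir -p "$demo"

# Stand-in for the upstream project.
git init -q -b main "$demo/upstream"
git -C "$demo/upstream" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"

# A fork: a full copy that starts its own line of development.
git clone -q "$demo/upstream" "$demo/fork"

# Upstream lands a fix after the fork was created...
echo "patched" > "$demo/upstream/fix.txt"
git -C "$demo/upstream" add fix.txt
git -C "$demo/upstream" -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "fix: close an upstream bug"

# ...and because the source is freely available, the fork merges it in.
git -C "$demo/fork" fetch -q origin
git -C "$demo/fork" merge -q origin/main
ls "$demo/fork/fix.txt"    # the fix is now part of the fork
```

The same mechanism works between sibling forks: add the other fork as a remote, fetch, and merge the commits you want.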
Conversely, in the case of a fork of Kubernetes in a proprietary environment, a vendor or cloud provider (typically) will modify the source code to meet its specific needs, repackage the software, and offer it to customers as a proprietary distribution. Alternatively, it may modify the add-ons needed to run Kubernetes in production.
Key Challenges of Forking Kubernetes
Deploying and managing Kubernetes at scale within an enterprise can be a challenging endeavor. Many organizations turn to proprietary distributions to get enterprise support for their container platforms. As the Kubernetes ecosystem has evolved and matured, however, the pace of innovation within the upstream Kubernetes community has accelerated significantly.
The result is a growing divergence between pure open-source Kubernetes and proprietary Kubernetes deployments. Proprietary distributions of Kubernetes face a number of challenges in keeping up with this pace of innovation, running the following risks:
Delays in Getting Upstream Features and Bug Fixes
Every time you merge in upstream changes, it becomes more difficult to reconcile them with your custom distribution. This process is slow, error-prone, and costly. By the time a new feature comes out, your custom distribution is a few releases behind the latest. That's why many vendors that fork Kubernetes ship an older version of the Kubernetes API: it can take a vendor six months or longer to pick up improvements and bug fixes from the upstream.
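The maintenance chore behind that delay can be seen in miniature with git alone. In the sketch below, a hypothetical distribution carries one custom patch on top of a shared file; when upstream later touches the same line, the merge stops on a conflict that needs human attention. A real distribution carries hundreds of such patches (all paths and contents here are illustrative):

```shell
#!/bin/sh
# A hypothetical distribution carrying one custom patch on top of a file
# that upstream later changes. All paths and contents are illustrative.
set -e
demo=/tmp/merge-demo
rm -rf "$demo"

git init -q -b main "$demo/upstream"
echo "replicas: 1" > "$demo/upstream/config.yaml"
git -C "$demo/upstream" add config.yaml
git -C "$demo/upstream" -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "initial config"

# The vendor forks and patches the shared file...
git clone -q "$demo/upstream" "$demo/distro"
echo "replicas: 5" > "$demo/distro/config.yaml"
git -C "$demo/distro" add config.yaml
git -C "$demo/distro" -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "vendor: raise default replicas"

# ...while upstream moves on and touches the same line.
echo "replicas: 3" > "$demo/upstream/config.yaml"
git -C "$demo/upstream" add config.yaml
git -C "$demo/upstream" -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "upstream: new default replicas"

# Pulling upstream now requires manual resolution; multiply by every patch carried.
git -C "$demo/distro" fetch -q origin
git -C "$demo/distro" merge -q origin/main || echo "merge conflict: manual resolution required"
```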
Lack of Flexibility
A proprietary fork of Kubernetes creates vendor lock-in—keeping the customer dependent on a vendor for products and services. Even if the source code is open source, vendors may wrap their forked version of Kubernetes in many features that, while making it easier to manage, also make it harder to migrate to other platforms without incurring additional costs and excess resource allocation. And because most custom distributions entangle cluster configuration with vendor-specific tooling, you can't switch vendors without re-architecting your whole stack. As such, a proprietary fork of Kubernetes does not give you the flexibility to move your applications and data seamlessly between public, private, and on-premises services. It also limits your options as your organization continues to grow.
Lack of Functionality
A forked version of Kubernetes runs the risk of breaking application functionality. Some custom distributions rely on proprietary APIs and CLIs to get full functionality—yet another way that customers get locked in to a single vendor. And if the custom distribution only runs on the vendor's custom Linux kernel, that too creates lock-in. As time goes on, the fork becomes increasingly hard to maintain. Merging the latest upstream patches into it won't be possible without major additional work to keep patches and features compatible.
Code Security Risks
A fork of Kubernetes can potentially run less secure code. If a vulnerability in the open-source code is found and fixed by the community upstream, a forked version of the code may not benefit from this fix.
Lack of Interoperability
Vendors may modify code for their custom distributions or for the supporting applications you need to make Kubernetes run in production. While a modified version of Kubernetes will work with a particular vendor's application stack and management tools, these proprietary modifications tie you to versions or customized builds of components that prevent you from integrating with other upstream open-source projects. And if the vendor's stack is composed of multiple products, interoperability becomes very hard to achieve. Forking and lack of interoperability will cause a lot of downstream issues as you scale.
Technical Debt
It's incredibly difficult to merge back a fork that has, over the years, taken on a life of its own and diverged drastically from the upstream. This is known as technical debt: the accumulated cost of maintaining source code that has deviated from the main branch where joint development happens. The more changes made to forked code, the more it costs in time and money to rebase the fork onto the upstream project.
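That growing cost can at least be measured. The sketch below builds two tiny throwaway repositories and uses `git rev-list --count` to show the two numbers a long-lived fork watches climb: how far behind upstream it has fallen, and how many fork-only commits it must carry through every rebase (all paths and commits are illustrative):

```shell
#!/bin/sh
# Measuring how far a fork has drifted from upstream with plain git.
# Repositories, paths, and commit counts are illustrative.
set -e
demo=/tmp/debt-demo
rm -rf "$demo"

commit() { git -C "$1" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "$2"; }

git init -q -b main "$demo/upstream"
commit "$demo/upstream" "initial commit"
git clone -q "$demo/upstream" "$demo/fork"

# Years of independent development, in miniature:
for i in 1 2 3;     do commit "$demo/upstream" "upstream change $i"; done
for i in 1 2 3 4 5; do commit "$demo/fork"     "fork-only change $i"; done

git -C "$demo/fork" fetch -q origin
# Both numbers keep climbing until someone pays down the debt with a rebase.
git -C "$demo/fork" rev-list --count HEAD..origin/main   # behind upstream: prints 3
git -C "$demo/fork" rev-list --count origin/main..HEAD   # fork-only commits: prints 5
```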
Upstream Kubernetes
Upstream Kubernetes is an OSS version of Kubernetes that is hosted and maintained by the Cloud Native Computing Foundation (CNCF). It consists of core Kubernetes components (often referred to as "plain-vanilla Kubernetes") that are needed to orchestrate containers without any add-on applications. These core Kubernetes components are distributed with their source code, making them publicly accessible for inspection, modification, and redistribution.
When features and patches are accepted upstream, every project and product based on the upstream can benefit from that work when it picks up a future release or merges upstream patches. By doing work upstream first, you have the opportunity to share ideas with the larger community. From there, you can work together to try to get new features and releases accepted upstream.