For StackHPC's clients, like so many others, Kubernetes is a key workload for OpenStack open infrastructure. Linux, OpenStack and Kubernetes Infrastructure (LOKI) combine to create an open infrastructure software stack that dominates the cloud-native landscape.
Among OpenStack users, the Magnum project is one of the most popular methods for deploying Kubernetes.
Innovation in the Magnum Project
Kubernetes is a fast-developing ecosystem, and Kubernetes clusters are deployed far and wide. OpenStack hosts only a small proportion of the Kubernetes installed base.
The Cluster API project focuses on the interface between Kubernetes clusters and the various types of cloud infrastructure (including OpenStack) on which a cluster is deployed. The wide range of infrastructures it supports has given Cluster API a vibrant developer community and helped to make it a de facto standard for Kubernetes deployment and operations on cloud infrastructure.
Bringing Cluster API's advantages to OpenStack infrastructure (and to StackHPC's clients) presented a compelling opportunity.
This is a story about the Four Opens.
- 20 October 2021
- The concept of a Cluster API driver is introduced in the Yoga cycle Magnum Project Teams Gathering (PTG).
- 26 October 2021
- A skeleton proof-of-concept Cluster API driver is submitted by John Garbutt to the Magnum codebase, generating a lot of community interest. Work proceeds (albeit slowly) through 2022.
- 12 January 2022
- To follow the open design process, a spec is drafted and uploaded for review. The community reviews it, but we all have busy lives and in some cases months pass between reviews and responses. It takes a year to get approved.
- 19 October 2022
- Further discussion on the driver's (slow) progress at the Antelope PTG.
- 12 December 2022
- The team at VexxHost announce an independently-developed Cluster API driver, released as a third-party, out-of-tree Magnum driver.
- 6 January 2023
- The team from StackHPC combines with the team from VexxHost to submit a joint presentation for the OpenInfra Summit in Vancouver - and it is accepted!
- 14 June 2023
- The presentation is delivered at the Vancouver summit by Matt Pryor from StackHPC and Mohammed Naser from VexxHost, covering implementation differences and similarities between the two drivers. Still two drivers, developed largely as separate efforts - could they be reconciled?
- October 2023
- Further discussion among the Magnum team at the Caracal PTG. Discussion at this stage focuses on how to reconcile two alternative Cluster API driver implementations, and how to avoid conflict between them.
- January 2024
- After much deliberation and effort at reducing driver conflict within the Magnum project, spanning several PTG sessions, it is decided to merge the in-tree driver whose development was led by StackHPC into the Magnum codebase. This will happen in the Dalmatian release cycle, landing in Q4 2024.
Anatomy of a Cluster API driver
Historically, Magnum has deployed Kubernetes clusters in a bespoke manner, with cluster infrastructure provisioned using OpenStack Heat and then configured using custom scripts. Over time this home-brewed approach has become a maintenance burden for the open source developers of Magnum, and as a result Magnum can be slow to support new Kubernetes versions.
Cluster API, on the other hand, is an active Kubernetes project supported by the Cluster Lifecycle SIG with a significant and broad community. Cluster API comprises a set of Kubernetes operators that provide declarative APIs for provisioning, upgrading and operating Kubernetes clusters on a variety of infrastructures, of which OpenStack is one. This is accomplished by having a core API for defining clusters that calls out to infrastructure providers to provision cloud infrastructure such as networks, routers, load balancers and machines. Once provisioned by the infrastructure provider, the machines themselves are turned into a Kubernetes cluster using kubeadm, the standard upstream tool for creating Kubernetes clusters. Auto-healing and auto-scaling are also supported using infrastructure-agnostic code that is maintained as part of the upstream project.
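To make that concrete, the sketch below shows (in simplified form) how a cluster is represented in Cluster API: a core Cluster resource references an infrastructure-specific resource (here an OpenStackCluster) and a kubeadm-based control plane. The names are illustrative and the exact API versions, particularly for the OpenStack provider, vary between releases.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: example                  # illustrative cluster name
spec:
  # Infrastructure-specific resource managed by the OpenStack provider
  # (the provider's API version differs between releases)
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: OpenStackCluster
    name: example
  # Control plane provisioned and upgraded using kubeadm
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: example-control-plane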
StackHPC, with contributions from Catalyst Cloud, have created a Magnum driver that provisions Kubernetes clusters by creating and updating Cluster API resources as an alternative to the standard Heat-based driver. Using Cluster API via Magnum in this way has two major advantages:
- Magnum becomes a wrapper around deployment tools that are officially supported and maintained upstream. This reduces the amount of Magnum-specific code that needs to be maintained and allows us to easily support new Kubernetes versions as they become available.
- Users can benefit from the features of Cluster API without having to change their existing Magnum workflows - for instance, the OpenStack CLI and Magnum Terraform modules will continue to work with the new driver.
Azimuth's Cluster API Engine
The new Cluster API driver reuses components developed to provide Kubernetes support in Azimuth, an open-source portal providing self-service HPC and AI platforms for OpenStack clouds, where they have been battle-tested for a number of years.
Providing a managed Kubernetes service is much more than just provisioning Kubernetes cluster nodes - clusters must have addons installed, such as a CNI, OpenStack integrations and the metrics server.
Whilst Cluster API is excellent at provisioning and operating Kubernetes clusters, it has limited capability for managing cluster addons. To address this, StackHPC developed an addon provider for Cluster API - another Kubernetes operator that provides a declarative API for specifying which Helm charts and additional manifests should be installed on a Cluster API cluster.
For example, this resource specifies that the NGINX ingress controller should be installed onto the Cluster API cluster example using values from an inline template (values from ConfigMaps and Secrets are also supported):
apiVersion: addons.stackhpc.com/v1alpha1
kind: HelmRelease
metadata:
  name: ingress-nginx
spec:
  # The name of the target Cluster API cluster
  clusterName: example
  # The namespace on the cluster to install the Helm release in
  targetNamespace: ingress-nginx
  # The name of the Helm release on the target cluster
  releaseName: ingress-nginx
  # Details of the Helm chart to use
  chart:
    repo: https://kubernetes.github.io/ingress-nginx
    name: ingress-nginx
    version: 4.9.1
  # Values for the Helm release
  # These can be merged from multiple sources
  valuesSources:
    - template: |
        controller:
          metrics:
            enabled: true
            serviceMonitor:
              enabled: true
As well as providing managed Kubernetes clusters for tenants using Cluster API, Azimuth itself also runs on a Kubernetes cluster. Usually this cluster is deployed in a project on the same OpenStack cloud that Azimuth is configured to provision resources in.
In order to share as much code and knowledge between these two use cases as possible, StackHPC have codified the deployment of Cluster API clusters on OpenStack, with addons properly configured, as a set of open-source Helm charts. These charts are used in Azimuth for provisioning both the Azimuth cluster itself and tenant clusters.
Helm's Flexibility Advantage
The biggest benefit of encapsulating good practice using Helm is that the charts can be reused outside of Azimuth, e.g. to operate Kubernetes clusters using GitOps with Flux or Argo CD. For example, deploying a basic cluster with a monitoring stack requires the following Helm values:
# The name of a secret containing an OpenStack appcred for the target project
cloudCredentialsSecretName: my-appcred
# The ID of the image to use for cluster nodes
machineImageId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
# The version of kubelet that is in that image
kubernetesVersion: 1.28.6
clusterNetworking:
  # The ID of the external network to use
  externalNetworkId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
controlPlane:
  # The name of the flavor to use for control plane nodes
  machineFlavor: small
# The node groups for the cluster
nodeGroups:
  - name: md-0
    machineFlavor: small
    machineCount: 2
# Enable the monitoring stack
addons:
  monitoring:
    enabled: true
In particular, these charts are used by StackHPC's Magnum driver to produce the Cluster API resources for Magnum clusters. The Magnum driver simply converts Magnum's representation of the cluster template and cluster into Helm values that are passed to the CAPI Helm charts.
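As a rough illustration of that conversion (a simplification - the exact Magnum labels and Helm value keys are defined by the driver and the charts), a Magnum cluster and its cluster template might translate into values along these lines:
# Illustrative sketch only: which Magnum fields might feed which Helm values
machineImageId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx   # cluster template image
kubernetesVersion: 1.28.6                               # Kubernetes version of that image
controlPlane:
  machineFlavor: small                                  # cluster template master flavor
nodeGroups:
  - name: default-worker                                # Magnum default node group
    machineFlavor: small                                # cluster template worker flavor
    machineCount: 2                                     # Magnum cluster node count
addons:
  monitoring:
    enabled: true                                       # e.g. driven by a Magnum label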
This means that although StackHPC's Cluster API Magnum driver is relatively new, it is based on components that have been used in production for several years.
StackHPC will be contributing the Helm charts to the Magnum project to serve as a reference for how to provision Cluster API clusters on OpenStack.
Magnum Cluster API Drivers are Like Buses
The Kolla-Ansible project already packages the driver developed by VEXXHOST, and will also package the Cluster API Helm driver once it merges into the Magnum codebase.
In the Bobcat release of Magnum, it is not possible for both of these drivers to coexist, as the current driver selection mechanism does not consider enough information to differentiate them (see the Magnum Docs). Jake Yip, the current Magnum PTL, has been doing some great work to make driver selection more explicit using image metadata, which will allow the two Cluster API drivers to co-exist from the Caracal release onwards (patch in Gerrit).
There are some technical differences between the VEXXHOST and StackHPC drivers. In particular, as discussed above, StackHPC have chosen to use Helm to template the Cluster API resources for Magnum clusters, whereas the VEXXHOST driver templates the Cluster API resources in Python code.
We chose Helm for a number of reasons:
- The method to deploy Cluster API clusters on OpenStack is reusable outside of Magnum, rather than being encapsulated behind the Magnum APIs.
- The Helm charts, when combined with StackHPC's addon provider, provide a powerful, declarative way of managing addons.
- More flexibility for operators - if an operator needs to modify the way that clusters are deployed, the default charts can be replaced with custom charts as long as the Helm values provided by the Magnum driver are respected.
- The Helm charts can have an independent release cycle, allowing Magnum to support new Kubernetes versions without waiting for an OpenStack release by just creating new cluster templates that reference the new Helm chart version.
A significant difference between the two drivers is the use of ClusterClass. Whilst ClusterClass does promise to make certain lifecycle operations easier, particularly upgrades, the Cluster API project still classifies the feature as "experimental (alpha)" and requires a feature gate to be enabled to use it.
The CAPI Helm charts have intelligence baked in to deal with upgrades, backed by extensive use in production and cluster upgrades tested in CI. We are being cautious about switching to an alpha feature that may still be subject to substantial change. It is expected that the Helm charts will move to ClusterClass when it moves into beta or GA.
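For context, ClusterClass replaces individually templated Cluster API resources with a single topology stanza on the Cluster object, roughly as in the illustrative sketch below:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: example
spec:
  # With the ClusterClass feature gate enabled, the whole cluster shape
  # is expressed as a topology referencing a ClusterClass definition
  topology:
    class: example-clusterclass      # illustrative ClusterClass name
    version: v1.28.6
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
        - class: default-worker      # worker class defined in the ClusterClass
          name: md-0
          replicas: 2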
Despite the technical differences, there are still a number of places where code could be shared between the two drivers. Both drivers derive the Magnum cluster state by looking at the same Cluster API objects, both drivers have a need to make the certificates generated by Magnum available to the Cluster API clusters, and both drivers monitor the cluster health in the same way.
At StackHPC, we are keen to collaborate to see if we can develop a shared library that can be used by both drivers to perform these tasks, reducing duplication and making the most of the available effort to push the Magnum ecosystem forward together.
Looking Ahead
As part of StackHPC's support and sustaining activities, our clients using Magnum are getting early access to the driver throughout 2024.
While the driver merge progresses through the upstream release cycle, StackHPC will continue to work on improving the user documentation, CI test coverage and automation around deployment and operations.
Get in touch
If you would like to get in touch, we would love to hear from you. Reach out to us via Twitter, LinkedIn or directly via our contact page.